In this study, we describe the results of our research on modelling collaborative problem-solving (CPS) competence using analytics generated from video data. We collected approximately 500 minutes of video data from 15 groups of three students working collaboratively to solve design problems. First, using OpenPose, we automatically generated frequency metrics, such as the number of times a face appears on screen, and distance metrics, such as the distance between bodies. Based on these metrics, we built decision trees to predict students' listening, watching, making, and speaking behaviours, as well as their CPS competence. Our results provide useful decision rules, mined from the video analytics, that can be used to inform teacher dashboards. Although the accuracy and recall values of our models are lower than those of previous machine learning work that utilizes multimodal data, the transparent nature of decision trees provides opportunities for explainable analytics for teachers and learners. This transparency can give teachers and learners more agency and can therefore lead to easier adoption. We conclude the paper with a discussion of the value and limitations of our approach.