A scalable real-time computer vision system for student posture detection in smart classrooms

Huang, Jiawei; Zhou, Ding

doi:10.1007/s10639-023-12365-5

Published: 23 November 2023

Education and Information Technologies
Volume 29, pages 917–937 (2024)

Abstract

Technological advancements have ushered in a new era of global educational development. Artificial Intelligence (AI) has the potential to enhance teaching effectiveness and foster educational innovation. Using student posture as a proxy, computer vision technology can gauge levels of student engagement. While previous efforts have focused on refining posture classification models, this study addresses the end-to-end implementation of a real-time posture detection workflow, encompassing software, hardware, and network aspects. The proposed system uses classroom surveillance cameras and a computer vision model from the Open Visual Inference and Neural Network Optimization (OpenVINO) toolkit to detect student posture. Data are transmitted using the Message Queuing Telemetry Transport (MQTT) protocol, establishing a complete posture detection workflow within the classroom. To validate the system, video recordings from a real teaching environment (a fifth-grade class in the Chinese compulsory education system) were analyzed, yielding posture classification accuracies of 0.933 for standing, 0.772 for sitting, and 0.959 for hand-raising. With a frame processing time ranging from 109 to 758 milliseconds, the system delivers posture data to educators in real time. The system developed in this study can therefore monitor student postures in the classroom, with the potential to enhance teaching quality in smart classrooms.
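The workflow described in the abstract (per-frame posture detection followed by MQTT transmission) can be illustrated with a minimal sketch. The abstract does not give the message format, so everything here is an assumption made for illustration: the `Detection` record, the JSON field names, and the `classroom/<camera>/posture` topic scheme are hypothetical. The sketch takes any `publish(topic, payload)` callable so that it runs without a broker; in a deployment this could plausibly be the `publish` method of an MQTT client library. It also shows the throughput implied by the reported processing times.

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import Callable, List

POSTURES = ("standing", "sitting", "hand-raising")  # classes named in the abstract


@dataclass
class Detection:
    """One detected student (hypothetical record layout)."""
    posture: str        # one of POSTURES
    confidence: float   # detector score in [0, 1]
    box: List[int]      # [x, y, width, height] in pixels (assumed layout)


def build_payload(camera_id: str, detections: List[Detection],
                  processing_ms: float, timestamp: float) -> str:
    """Serialize one frame's results as JSON for an MQTT message body."""
    for d in detections:
        if d.posture not in POSTURES:
            raise ValueError(f"unknown posture: {d.posture}")
    # Per-class head count for the frame, e.g. how many hands are raised.
    counts = {p: sum(d.posture == p for d in detections) for p in POSTURES}
    return json.dumps({
        "camera": camera_id,
        "timestamp": timestamp,
        "processing_ms": processing_ms,
        "counts": counts,
        "detections": [asdict(d) for d in detections],
    })


def frames_per_second(processing_ms: float) -> float:
    """Throughput implied by a per-frame processing time."""
    return 1000.0 / processing_ms


def publish_frame(publish: Callable[[str, str], object], camera_id: str,
                  detections: List[Detection], processing_ms: float) -> str:
    """Send one frame's results through any publish(topic, payload) callable."""
    topic = f"classroom/{camera_id}/posture"  # hypothetical topic scheme
    payload = build_payload(camera_id, detections, processing_ms, time.time())
    publish(topic, payload)
    return topic


if __name__ == "__main__":
    sent = []  # stands in for a broker connection
    frame = [Detection("sitting", 0.81, [40, 60, 90, 120]),
             Detection("hand-raising", 0.93, [200, 50, 85, 130])]
    publish_frame(lambda t, p: sent.append((t, p)), "cam01", frame, 109.0)
    print(sent[0][0])
    # Reported range of 109 to 758 ms per frame, expressed as frames per second.
    print(round(frames_per_second(109.0), 1), round(frames_per_second(758.0), 1))
```

Under these assumptions, the reported processing times of 109 ms and 758 ms correspond to roughly 9.2 and 1.3 frames per second, which gives a sense of how often a teacher-facing display could be refreshed.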

Data availability

The classroom images supporting Figs. 3, 4, and 6 are not publicly available in order to protect student privacy. Data supporting Fig. 5 and Tables 1, 2, and 3 were generated using this system.

Acknowledgements

This work was financially supported by the 2022 Research Projects of the Centre for Future Education Research at the Southern University of Science and Technology (FE22Z004). The authors would like to express their gratitude to EditSprings (https://www.editsprings.cn) for the expert linguistic services provided.

Author information

Authors and Affiliations

School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, Shenzhen, 518055, China
Jiawei Huang & Ding Zhou

Corresponding author

Correspondence to Ding Zhou.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, J., Zhou, D. A scalable real-time computer vision system for student posture detection in smart classrooms. Educ Inf Technol 29, 917–937 (2024). https://doi.org/10.1007/s10639-023-12365-5

Received: 31 March 2023
Accepted: 13 November 2023
Published: 23 November 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s10639-023-12365-5
