Development and evaluation of predictive models for predicting students performance in MOOCs

Ani, Anagha; Khor, Ean Teng

doi:10.1007/s10639-023-12398-w

Published: 28 December 2023

Education and Information Technologies (2023)

Abstract

Predictive modelling in the education domain can be utilised to significantly improve teaching and learning experiences. Massive Open Online Courses (MOOCs) generate a large volume of data that can be exploited to predict and evaluate student performance based on various factors. This paper has two broad aims. The first is to develop and tune several Machine Learning (ML) models, including Linear Regression, Logistic Regression, Random Forests, and K-Nearest Neighbours, among others, to perform classification tasks on the dataset and predict student performance. The second is to evaluate the efficacy of these ML models and identify those best suited to this task. The categories of data utilised in achieving these aims include (i) demographic information, (ii) academic background, and (iii) interaction with MOOC course materials. The research procedure comprises five phases: data exploration to analyse the dataset; feature engineering, which involves discerning the most important features and converting them into a format decipherable by the ML models; model building; model evaluation by measurement of accuracy; and comparative evaluation between the different models. The results of this study are expected to have implications for how MOOC platforms utilise data to improve user experience. The findings indicate that the data collected by these platforms can be used to predict performance with an accuracy of over 77%. This information can be exploited to enhance educational theory or practice in the context of MOOCs, for instance by implementing different teaching methodologies or providing different types of resources based on predicted performance.
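The phases described in the abstract (feature engineering, model building, accuracy measurement, and comparison between models) can be illustrated with a minimal sketch. It is not the authors' implementation. The records below are invented for illustration and are not drawn from the Open University dataset; the field names (`highest_education`, `total_clicks`), the choice of k = 3, and the majority-class baseline used as a comparison point are all assumptions made here. K-Nearest Neighbours is taken from the abstract's list of models.

```python
import math
from collections import Counter

# Hypothetical toy records: (highest_education, total_clicks, outcome).
# Invented for illustration only; these are NOT values from the real dataset.
TRAIN = [
    ("degree", 900, "pass"), ("degree", 750, "pass"), ("a_level", 820, "pass"),
    ("a_level", 150, "fail"), ("none", 100, "fail"), ("none", 240, "fail"),
    ("none", 60, "fail"),
]
TEST = [
    ("degree", 800, "pass"), ("none", 120, "fail"),
    ("a_level", 700, "pass"), ("a_level", 200, "fail"),
]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the known categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def build_features(rows, categories, max_clicks):
    """Feature engineering: one-hot education level plus clicks scaled to [0, 1]."""
    return [one_hot(edu, categories) + [clicks / max_clicks]
            for edu, clicks, _ in rows]


def knn_predict(train_x, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted((math.dist(tx, x), ty) for tx, ty in zip(train_x, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def compare_models():
    """Build both models, score them on the held-out rows, return accuracies."""
    categories = sorted({edu for edu, _, _ in TRAIN})
    max_clicks = max(clicks for _, clicks, _ in TRAIN)
    train_x = build_features(TRAIN, categories, max_clicks)
    test_x = build_features(TEST, categories, max_clicks)
    train_y = [label for _, _, label in TRAIN]
    test_y = [label for _, _, label in TEST]

    # Baseline (an assumption added here): always predict the most common label.
    majority = Counter(train_y).most_common(1)[0][0]
    baseline_pred = [majority] * len(test_y)
    knn_pred = [knn_predict(train_x, train_y, x) for x in test_x]

    return {
        "majority_baseline": accuracy(baseline_pred, test_y),
        "knn_k3": accuracy(knn_pred, test_y),
    }


if __name__ == "__main__":
    results = compare_models()
    for name, score in results.items():
        print(f"{name}: accuracy = {score:.2f}")
    print("best model:", max(results, key=results.get))
```

On data this small the scores say nothing about real MOOC learners; the sketch only shows how the same accuracy metric, computed on the same held-out rows, lets different models be compared on equal terms.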

Data availability

The datasets generated and/or analysed for this study are available in the Open University Learning Analytics repository, https://analyse.kmi.open.ac.uk/open_dataset (“Open University Learning Analytics Dataset”, n.d.).

References

About the open university. (n.d.). About the open university. Retrieved 21 September 2023, from https://www.open.ac.uk/about/main/
Al Madhoun, W. (2020). Predictive modelling of student academic performance–the case of higher education in Middle East (Doctoral dissertation, University of East London). https://doi.org/10.15123/uel.88q0w
Bangash, M., Chaudhry, W., Rosales, L., Bilal, M., & Cui, L. (2022). A machine learning-based course enrollment recommender system.
Brownlee, J. (2020b, August 15). Linear discriminant analysis for machine learning. https://machinelearningmastery.com/linear-discriminant-analysis-for-machine-learning/
Brownlee, J. (2020a, June 30). Why one-hot encode data in machine learning? https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/
Ekowo, M., & Palmer, I. (2016). The promise and peril of predictive analytics in higher education: A landscape analysis. New America.
Frith, C. (1997). Motivation to learn. Educational Communications and Technology, 2–11.
Harrison, O. (2019, July 14). Machine learning basics with the K-nearest neighbors algorithm. Medium. https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761
How Linear regression algorithm works—ArcGIS Pro | Documentation. (n.d.). Retrieved 21 September 2023, from https://pro.arcgis.com/en/pro-app/latest/tool-reference/geoai/how-linear-regression-works.htm
Ippolito, P. P. (2019, October 11). Feature extraction techniques. Medium. Retrieved September 21, 2023, from https://towardsdatascience.com/feature-extraction-techniques-d619b56e31be
Jia, P., & Maloney, T. (2014). Using predictive modelling to identify students at risk of poor university outcomes. Higher Education, 70(1), 127–149. https://doi.org/10.1007/s10734-014-9829-7
Khor, E. T. (2022). A data mining approach using machine learning algorithms for early detection of low-performing students. International Journal of Information and Learning Technology, 39(2), 122–132. https://doi.org/10.1108/IJILT-09-2021-0144
Kizilcec, R. F., Piech, C., & Schneider, E. (2013, April). Deconstructing disengagement: analyzing learner subpopulations in massive open online courses. In Proceedings of the third international conference on learning analytics and knowledge (pp. 170–179). https://doi.org/10.1145/2460296.2460330
Kurzweil, M., & Wu, D. D. (2015). Building a pathway to student success at Georgia State University.
Littlejohn, A., Hood, N., Milligan, C., & Mustain, P. (2016). Learning in MOOCs: Motivations and self-regulated learning in MOOCs. The Internet and Higher Education, 29, 40–48.
Madjarov, I., & Betari, A. (2008, December). Adaptive learning sequencing for course customization: A web service approach. In 2008 IEEE Asia-Pacific Services Computing Conference (pp. 530–535). https://doi.org/10.1109/APSCC.2008.297
Makombe, F., & Lall, M. (2020). A predictive model for the determination of academic performance in private higher education institutions. International Journal of Advanced Computer Science and Applications, 11(9). https://doi.org/10.14569/IJACSA.2020.0110949
Miguéis, V. L., Freitas, A., Garcia, P. J., & Silva, A. (2018). Early segmentation of students according to their academic performance: A predictive modelling approach. Decision Support Systems, 115, 36–51. https://doi.org/10.1016/j.dss.2018.09.001
Mondal, P. (2013, August 22). 7 Important factors that may affect the learning process. Your Article Library. https://www.yourarticlelibrary.com/learning/7-important-factors-that-may-affect-the-learning-process/6064
Open University Learning Analytics Dataset. Open Learning Analytics | OU Analyse | Knowledge Media Institute | The Open University. (n.d.). Retrieved June 10, 2023, from https://analyse.kmi.open.ac.uk/open_dataset
Raj, A. (2020, October 5). Unlocking the true power of support vector regression. Medium. https://towardsdatascience.com/unlocking-the-true-power-of-support-vector-regression-847fd123a4a0
Raj, A. (2021, January 5). The perfect recipe for classification using logistic regression. Medium. https://towardsdatascience.com/the-perfect-recipe-for-classification-using-logistic-regression-f8648e267592
Romero, C., Ventura, S., & García, E. (2008). Data mining in course management systems: Moodle case study and tutorial. Computers & Education, 51(1), 368–384.
Salem, R. O., Al-Mously, N., Nabil, N. M., Al-Zalabani, A. H., Al-Dhawi, A. F., & Al-Hamdan, N. (2013). Academic and socio-demographic factors influencing students’ performance in a new Saudi medical school. Medical Teacher, 35(sup1), S83–S89.
Singh Chauhan, N. (2022, February 9). Decision tree algorithm, explained. KDnuggets. https://www.kdnuggets.com/2020/01/decision-tree-algorithm-explained.html#:~:text=The%20goal%20of%20using%20a,the%20root%20of%20the%20tree.
Singh, H. (2014, August 7). What’s wrong with MOOCs, and why aren’t they changing the game in education? Wired. https://www.wired.com/insights/2014/08/whats-wrong-moocs-arent-changing-game-education/
Talari, S. (2022, November 1). Random forest vs decision tree: Key differences. KDnuggets. https://www.kdnuggets.com/random-forest-vs-decision-tree-key-differences.html
Wang, Z., Zhu, C., Ying, Z., Zhang, Y., Wang, B., Jin, X., & Yang, H. (2018, November). Design and implementation of early warning system based on educational big data. In 2018 5th International Conference on Systems and Informatics (ICSAI) (pp. 549–553). https://doi.org/10.1109/ICSAI.2018.8599357
Xu, J., Moon, K. H., & van der Schaar, M. (2017). A machine learning approach for tracking and predicting student performance in degree programs. IEEE Journal of Selected Topics in Signal Processing, 11(5), 742–753. https://doi.org/10.1109/jstsp.2017.2692560
Yiu, T. (2021, September 29). Understanding Random Forest. Medium. https://towardsdatascience.com/understanding-random-forest-58381e0602d2

Funding

None.

Author information

Authors and Affiliations

Nanyang Technological University, Singapore, Singapore
Anagha Ani
National Institute of Education, Nanyang Technological University, Singapore, Singapore
Ean Teng Khor

Contributions

Conceptualisation, K.E.T.; methodology, K.E.T. and A.A.; formal analysis, A.A. and K.E.T. Both authors prepared, edited, and approved the manuscript.

Corresponding author

Correspondence to Anagha Ani.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ani, A., Khor, E.T. Development and evaluation of predictive models for predicting students performance in MOOCs. Educ Inf Technol (2023). https://doi.org/10.1007/s10639-023-12398-w

Received: 08 December 2022
Accepted: 01 December 2023
Published: 28 December 2023
DOI: https://doi.org/10.1007/s10639-023-12398-w
