Using Few-Shot Learning Materials of Multiple SPOCs to Develop Early Warning Systems to Detect Students at Risk
DOI:
https://doi.org/10.19173/irrodl.v22i4.5397
Keywords:
precision education, SPOC, early warning system, portability of prediction model, LMS
Abstract
Early warning systems (EWSs) have been successfully used in online classes, especially in massive open online courses, where it is nearly impossible for students to interact face-to-face with their teachers. Although teachers in higher education institutions typically have smaller class sizes, they also face the challenge of being unable to have direct contact with their students during distance teaching. In this research, we examined the online learning trajectories of students participating in four small private online courses that were all taught by one teacher. We collected relevant data of 1,307 students from the campus learning management system. Subsequently, we constructed 18 prediction models, one for each week of the course, to develop an EWS for identifying students in online asynchronous learning at risk of failing (i.e., students who fail their final examination). Our results indicated that the fifth-week model successfully predicted student performance, with an accuracy exceeding 83% from the eighth week onward.
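The week-by-week modelling idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which features or algorithms were used, so the sketch uses synthetic weekly activity counts, a hypothetical `toy_student` helper, and a simple one-threshold rule (`fit_stump`) in place of the study's actual models. Accuracy is computed in-sample only; a real EWS would be evaluated on held-out students or courses.

```python
from typing import List, Tuple

N_WEEKS = 18  # the abstract reports one prediction model per course week


def toy_student(early: int, late: int, switch_week: int) -> List[int]:
    """Hypothetical weekly LMS activity: `early` per week, then `late` per week."""
    return [early if week < switch_week else late for week in range(N_WEEKS)]


# Synthetic (weekly_activity, failed_final_exam) pairs; NOT data from the study.
STUDENTS: List[Tuple[List[int], bool]] = [
    (toy_student(5, 5, 0), False),
    (toy_student(4, 6, 3), False),
    (toy_student(5, 4, 9), False),
    (toy_student(5, 0, 2), True),
    (toy_student(4, 1, 3), True),
    (toy_student(6, 0, 4), True),
]


def fit_stump(xs: List[int], ys: List[bool]) -> Tuple[int, float]:
    """Find threshold t maximizing accuracy of the rule 'at risk if x <= t'."""
    best_t, best_acc = 0, -1.0
    # Candidate thresholds: one below the minimum (flags nobody) and each value.
    for t in [min(xs) - 1] + sorted(set(xs)):
        hits = sum((x <= t) == y for x, y in zip(xs, ys))
        acc = hits / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Fit one model per week, each using only the activity observed up to that week.
weekly_accuracy: List[float] = []
for week in range(1, N_WEEKS + 1):
    xs = [sum(activity[:week]) for activity, _ in STUDENTS]
    ys = [failed for _, failed in STUDENTS]
    _, acc = fit_stump(xs, ys)
    weekly_accuracy.append(acc)

for week, acc in enumerate(weekly_accuracy, start=1):
    print(f"week {week:2d}: in-sample accuracy {acc:.2f}")
```

The point of the sketch is the structure rather than the numbers: each week's model sees only the data available by that week, which is what allows the earliest sufficiently accurate week to be chosen as the warning point.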
License
This work is licensed under a Creative Commons Attribution 4.0 International License. The copyright for all content published in IRRODL remains with the authors.
This copyright agreement and usage license ensure that the article is distributed as widely as possible and can be included in any scientific or scholarly archive.