Using Survival Analysis to Identify Populations of Learners at Risk of Withdrawal: Conceptualization and Impact of Demographics

Keywords: course withdrawal, demographics, distance education and online learning, dropout, intervention design, survival analysis


High dropout rates constitute a major concern for higher education institutions, due to their economic and academic impact. The problem is particularly relevant for institutions offering online courses, where withdrawal ratios are reported to be higher. Both the impact and these high rates motivate the implementation of interventions oriented to reduce course withdrawal and overall institutional dropout. In this paper, we address the identification of populations of learners at risk of withdrawing from higher education online courses. This identification is oriented to design interventions and is carried out using survival analysis. We demonstrate that the method’s longitudinal approach is particularly suited for this purpose and provides a clear view of risk differences among learner populations. Additionally, the method quantifies the impact of underlying factors, either alone or in combination. Our practical implementation used an open dataset provided by The Open University. It includes data from more than 30,000 students enrolled in different courses. We conclude that low-income students and those who report a disability comprise risk groups and are thus feasible intervention targets. The survival curves also reveal differences among courses and show the detrimental effect of early dropout on low-income students, worsened throughout the course for disabled students. Intervention strategies are proposed as a result of these findings. Extending the entire refund period and giving greater academic support to students who report disability are two proposed strategies for reducing course withdrawal.
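As a rough illustration of how survival curves can expose risk differences between learner populations, the following sketch computes a Kaplan-Meier estimate of the probability of remaining enrolled over time. The abstract does not name a specific estimator, so the choice of Kaplan-Meier is an assumption here, as are the function name `kaplan_meier`, the group labels, and the toy durations, none of which come from the Open University dataset.

```python
from collections import Counter


def kaplan_meier(durations, withdrew):
    """Kaplan-Meier estimate of the probability of still being enrolled.

    durations: day on which each learner withdrew or was last observed.
    withdrew:  True if the learner withdrew, False if censored
               (for example, the learner completed the course).
    Returns a list of (day, survival_probability) pairs, one per distinct
    day on which at least one withdrawal occurred.
    """
    events = Counter(t for t, e in zip(durations, withdrew) if e)
    exits = Counter(durations)  # withdrawals plus censored learners per day
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for day in sorted(exits):
        d = events.get(day, 0)
        if d:
            # Multiply by the fraction of at-risk learners who stayed.
            survival *= 1 - d / at_risk
            curve.append((day, survival))
        at_risk -= exits[day]
    return curve


# Hypothetical toy data for two learner groups (not real results).
groups = {
    "group_a": ([10, 20, 20, 30, 30], [True, True, False, False, False]),
    "group_b": ([5, 10, 15, 30, 30], [True, True, True, False, False]),
}

for name, (durations, withdrew) in groups.items():
    steps = ", ".join(f"day {t}: {s:.2f}" for t, s in kaplan_meier(durations, withdrew))
    print(f"{name}: {steps}")
```

Comparing the curves of two groups at the same day gives a direct view of which population loses learners earlier. A formal comparison would additionally need a significance test such as the log-rank test, which this sketch does not implement.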

Author Biographies

Juan Antonio Martínez-Carrascal, Universitat Oberta de Catalunya

Juan Antonio Martínez-Carrascal is a PhD candidate at the Universitat Oberta de Catalunya (UOC, Barcelona, Spain). He carries out his research within the Learning Analytics for Innovation and Knowledge Application in Higher Education (LAIKA) group. He focuses on the impact of technology in education and in particular on the influence of students’ characteristics and their activity on academic performance. He is also an associate professor at the Universitat Autònoma de Barcelona (UAB, Barcelona, Spain).

Martin Hlosta, Institute for Research in Open, Distance and eLearning, Swiss Distance University of Applied Sciences

Martin Hlosta is a research fellow at the Institute for Distance Learning and eLearning Research (IFeL), working on projects for adaptive learning in education. Before that, he led OUAnalyse at the Open University, a project improving student retention via machine learning, which was selected in 2020 by UNESCO as one of the four best projects using artificial intelligence in education.

Research focus: predictive learning analytics, learning analytics for equity in education, scaling up and impact of learning analytics.

Teresa Sancho-Vinuesa, Universitat Oberta de Catalunya

Teresa Sancho is a full professor at the Universitat Oberta de Catalunya (UOC) in Barcelona, Spain, where she teaches mathematics for engineers and conducts research on e-assessment, feedback and learning analytics as head of the LAIKA (Learning Analytics for Innovation and Knowledge Application in Higher Education) Group. Currently, she is the academic director of the Applied Data Science Degree. She was a visiting professor at the Open University UK (2015), University of Southampton (2018) and Cardiff University (2019).
Born in Barcelona, Spain, Dr. Sancho received a degree in Mathematics from Universitat de Barcelona (UB) in 1990 and a Ph.D. degree in Electronic Engineering from Universitat Ramon Llull (URL) in 1995.
Dr. Sancho taught numerical analysis and the theory of probabilities and stochastic processes at the La Salle School of Engineering, where for six years she coordinated a research group on numerical methods to solve problems in fluid mechanics and electromagnetism. She was a member of the pedagogical and editorial team of the department of didactic material at Enciclopedia Catalana, S.A., Barcelona, Spain, before joining Universitat Oberta de Catalunya in 1998, where she has held several positions: Academic Coordinator of the Ph.D. Programme in Information and Knowledge Society, Research Director, and Vice-Rector for Research and Innovation.
Teresa Sancho has been involved in several research and innovation projects concerning the Internet and Higher Education. In particular, she took part in the Catalonia Internet Project, an interdisciplinary research project on the information society in Catalonia, co-directed by professors Manuel Castells and Imma Tubella. She was the coordinator of the MOOC Programme UCATx and the CIRAX Programme, both launched by the Catalan Government in 2013. She is currently concentrating her research efforts on the use of learning analytics for the improvement of online education and learning, and more particularly on the evaluation and feedback processes related to mathematics subjects.
She has participated in over 15 technical programme committees and has been a reviewer for several academic journals. Dr. Sancho has authored over 75 academic journal and conference papers, as well as two books and several book chapters.


How to Cite
Martínez-Carrascal, J. A., Hlosta, M., & Sancho-Vinuesa, T. (2023). Using Survival Analysis to Identify Populations of Learners at Risk of Withdrawal: Conceptualization and Impact of Demographics. The International Review of Research in Open and Distributed Learning, 24(1), 1–21.
Research Articles