An Analysis of Course Characteristics, Learner Characteristics, and Certification Rates in MITx MOOCs

Massive Open Online Courses (MOOCs), capable of providing free (or low cost) courses for millions of learners anytime and anywhere, have gained the attention of researchers, educational institutions, and learners worldwide. Even though they provide several benefits, there are still some criticisms of MOOCs. For instance, MOOCs’ high dropout rates or predominantly elite participation are considered to be important problems. In order to develop solutions for these problems, a deeper understanding of MOOCs is required. Today, despite the availability of several research studies about MOOCs, there is a shortage of in-depth research on course characteristics, learner characteristics, and predictors of certification rates. This study examined MOOC and learner characteristics in detail and explored the predictors of course certification rates based on data from 122 Massachusetts Institute of Technology MOOCs (MITx) on the edX platform as well as data about the 2.8 million participants registered in these MOOCs. The results indicated that as the number of courses offered and the number of learners enrolled increased over the years, the certification rates among enrolled learners decreased. According to our results, the average number of chapters completed, total forum messages, and mean age predicted course certification rates positively. On the other hand, the total number of chapters in a course predicted the course certification rates negatively. Based on these results, shorter and more interactive MOOCs are recommended, designed with attention to learners' needs, course content design, and strategies that encourage enrolled students to enter the courses.


Introduction
MOOCs have the potential to support traditional education activities both in- and out-of-class, such as homework and exercises, as well as individuals' lifelong learning. There are several benefits of MOOCs, as they are open-access and offered at little or no cost, with thousands of participants able to enroll and earn credits or receive certificates without constraints of space or time (De Barba, Kennedy, & Ainley, 2016; Porter, Graham, Spring, & Welch, 2014). MOOCs serve learners from all over the world, and there is no limit to learners' age, educational level, individual characteristics, or culture. Currently, more than 800 universities worldwide offer MOOCs and the number of these courses exceeds 10,000 (Shah, 2018b). Thus, the number of learners registered in MOOCs is huge compared to traditional courses. According to Shah (2018a), the top five MOOC providers (and number of registrations) are Coursera (37 million), edX (18 million), XuetangX (14 million), Udacity (10 million), and FutureLearn (8.7 million). This massive number of learners comprises people from diverse backgrounds with different motivations (DeBoer, Stump, Seaton, & Breslow, 2013; Kizilcec & Schneider, 2015). Deng, Benckendorff, and Gannaway (2019) have reported that MOOC learners' age distribution is mainly between 25 and 65 years. Even though MOOCs are open to everyone, the majority of learners hold higher education degrees, and most are male (Christensen et al., 2013).
MOOCs offer the possibility of providing free education for everybody; however, they have some limitations in terms of their effectiveness and benefits for both learners and educational organizations. Learners' behavior in MOOCs, instructional design of MOOCs, assessment processes, and interactions among learners and instructors are significantly different from traditional educational platforms. For instance, MOOC learners are rarely able to obtain direct and timely feedback from instructors (Kop, Fournier, & Mak, 2011). Furthermore, a study performed on four edX MOOCs reported that on average, certificate earners skipped 22% of the course content and made use of non-linear navigation (Guo & Reinecke, 2014). In the same study, it was also noted that older learners and those from lower learner-teacher ratio countries showed more comprehensive and non-linear navigation. Hence, MOOC learners present different behaviors than do learners in traditional courses. Another significant finding showed that less than 10% of the enrolled learners tended to complete their MOOC (Ho et al., 2015; Jordan, 2014). Providing course content in such a way so as to address these different individual requirements is a great challenge for MOOC instructional designers (Adair et al., 2014; Beaven, Codreanu, & Creuze, 2014). In addition, understanding the culture of learning in MOOCs is a complex process, and learners also face problems in adapting to these platforms. Compared to traditional education, learning outcomes from MOOCs, learners' purposes for enrolling in them (Watson et al., 2016), and their educational preferences are changing significantly (Watson, Watson, Yu, Alamri, & Mueller, 2017). A recent study reported that the same engagement measures may result in different achievement levels for different learner groups (Li & Baker, 2018). In other words, the instructional design of MOOCs is critical and essential (Yang, Shao, Liu, & Liu, 2017).
Hone and El Said (2016) found that the MOOC content affected learners' retention and perceived effectiveness. Transactional interaction between learner and content, as well as the structure and assessment of course design factors, are reported as significant predictors of learner control and sense of progress in MOOCs (Jung, Kim, Yoon, Park, & Oakley, 2019). Aparicio, Oliveira, Bacao, and Painho (2019) reported that gamification contributes significantly to the overall success of MOOCs by reducing dropout rates and improving learner satisfaction and user experience. Therefore, because of its very nature, the instructional design of MOOCs also needs to incorporate new approaches, rather than traditional ones (Adair et al., 2014; Margaryan, Bianco, & Littlejohn, 2015; Rodriguez, 2012). Currently, MOOC quality is reported as suboptimal (Margaryan et al., 2015), while quality instructional design can improve learning outcomes in distance learning (Hsu & Shiue, 2005).
In recent years, several studies have been conducted to better understand MOOC learners (e.g., Cagiltay, Esfer, & Celik, 2020; Hew & Cheung, 2014; Khalil & Ebner, 2014). However, there have been a limited number of studies offering a bigger picture on MOOCs. Jordan (2015) analyzed 221 MOOCs from different providers and the results indicated decreased average total enrolments in these courses over time, but an increase in completion rates. Jordan proposed some significant predictors for course completion rates, which were reported as positively correlated with the start date and assessment type, and negatively correlated with the course length. Another study analyzed 17 Harvard University and MIT MOOCs offered in 2013, and reported a decrease from 3.2% to 2.5% in course certification for registered learners and an increase in the registration rates. However, there is not sufficient evidence to confirm the results of these studies. In order to see the big picture, there is a need to analyze a larger number of courses, whereas previous studies only cover a limited number of MOOCs.
Accordingly, this study investigated 122 MITx courses with approximately 2.9 million learners, in order to provide feedback for MOOC developers on how to improve their courses. Specifically, this study analyzed the data provided by MITx, categorized these MOOCs into 15 course subjects classified into three course levels, and revealed the predictors of course certification rates. Although these MOOCs were from just one specific MOOC provider, they included heterogeneous learners from different backgrounds and countries.
Therefore, this study is not limited to a particular region or country and can be generalized globally to some degree. In this sense, this study provides a unique contribution to open and distributed learning.

Research Questions
The current study focused on the following six research questions:
1. How are the courses and number of enrolled learners distributed, in terms of subject areas?
2. How are the courses distributed in terms of course levels?
3. How is learner activity distributed in terms of subject areas and course levels?
4. How are learner activity and course certification distributed, in terms of course levels?
5. How is learner activity distributed in terms of specific course subjects?
6. What are the predictors of course certification rates?

Research Method
This quantitative study utilized descriptive and correlational research methods. Since it is difficult to confine educational events within controlled laboratory conditions, some types of educational research questions call for descriptions in order to explain the data (Knupfer & McLellan, 1996). The main focus of descriptive studies is to depict patterns rather than answer questions that ask why (Neuman, 2014), as they aim to describe and interpret what is happening (Cohen, Manion, & Morrison, 2007). In correlational research, the associations among variables are explored without any manipulation, and the variables can be used for prediction (Fraenkel & Wallen, 2009).

Data Collection and Analysis Process
The data were obtained from MITx MOOCs on the edX platform. In total, the available course data from 122 MITx MOOCs offered between 2012 and 2016 were obtained. The data provided by MITx were organized to represent details of each course. The level of each course was taken from the MITx website and combined with the course data. Then, the data were analyzed using descriptive statistics (mean, standard deviation, frequency, and percentage) and inferential statistics (multiple linear regression). Multiple linear regression (MLR) assumes the absence of outliers among the independent variables (predictor variables) and on the dependent variable (outcome variable); normality, linearity, and homoscedasticity of residuals; the absence of multicollinearity; and independence of errors (Tabachnick & Fidell, 2007). Before carrying out the MLR analysis, these assumptions were checked. The data from four MOOCs, identified as outliers, were removed from the analysis. Because the residuals were not completely normally distributed, a Box-Cox transformation was applied to the outcome variable, namely course certification rates. After the transformation, the residuals were normally distributed. The homoscedasticity of residuals was evaluated by checking the scatterplot of the residuals, which showed no obvious pattern. Variance inflation factor (VIF) values were checked to determine whether there was multicollinearity between predictor variables. All the predictor variables had VIF values less than 3, ranging from 1.00 to 2.09, and no multicollinearity was detected. The Durbin-Watson value was checked for the independence of errors and found to be 1.81. To summarize, all assumptions were met for the multiple regression analysis. As a result, the MLR analysis was carried out with 118 MOOCs.
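As an illustration only, three of the assumption checks described above can be sketched in a few lines of Python. The function names, the fixed lambda value, and the sample rates are hypothetical; the study does not report which software, lambda, or data values were used. The VIF helper assumes that the R-squared from regressing one predictor on the remaining predictors is already available.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda = 0 case is defined as the natural logarithm.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def durbin_watson(residuals):
    """Durbin-Watson statistic computed from ordered regression residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def vif(r_squared_j):
    """VIF of predictor j, given R^2 from regressing it on the other predictors."""
    return 1.0 / (1.0 - r_squared_j)


# Hypothetical certification rates in percent (not the study's data),
# transformed with an arbitrary lambda chosen only for illustration.
sample_rates = [1.25, 2.10, 3.53, 4.70]
transformed_rates = box_cox(sample_rates, lam=0.5)
```

Under the usual reading, a Durbin-Watson value near 2 (the study reports 1.81) suggests little first-order autocorrelation, and the largest reported VIF of 2.09 corresponds to an R-squared of about 0.52 between that predictor and the others.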

Research Question One
The largest share of the courses was offered in the computer science subject area (22.13%, n = 27), followed by engineering (15.57%, n = 19) and business and management (14.75%, n = 18). The highest number of learners was also registered in computer science courses (38.24%, n = 1,107,780). The mean age of the learners enrolled in each course ranged from 28.27 to 32.49 years. Table 1 shows the numbers and percentages of the analyzed courses according to subject area.
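The course percentages reported here follow from the course counts and the 122 courses in the data set. A minimal Python sketch of that arithmetic is given below; the variable and function names are illustrative and not part of the study.

```python
# Course counts per subject area as reported in the text; 122 courses in total.
TOTAL_COURSES = 122
course_counts = {
    "computer science": 27,
    "engineering": 19,
    "business and management": 18,
}


def share(part, whole):
    """Return part / whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Percentage of all analyzed courses that falls in each subject area.
shares = {subject: share(n, TOTAL_COURSES) for subject, n in course_counts.items()}
```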

Research Question Two
Most of the courses (45%) were at the introductory level, and 60% of the learners registered for these courses, followed by intermediate (31%) and advanced level (24%) courses. Across all course levels, the mean age of the learners was 30.03 years. Table 2 shows the number of courses and their percentages according to course level.

Research Question Three
As shown in Table 3, about 50% to 60% of the learners viewed the course content only once and the remainder left the course after enrolling. Around 5% to 15% of the learners completed half of their course. Table 3 also reveals that the percentage of learners enrolled in education and teacher training courses was 4.52% and was lowest for chemistry courses (0.81%). In data analysis and statistics courses, 66.81% of the enrolled learners viewed the course one or more times. This rate was lowest for the math and physics courses (around 50%). It should be noted from Table 3 that the certificate rates were similar (on average 2%-3%) for the courses with higher or lower course effort (e.g., data analysis and statistics courses at 210 hours, and art and culture courses at 30 hours, respectively). The certificate rate was highest for history courses (4.70%) and lowest for math courses (1.25%).

Research Question Four
For those learners enrolled in an introductory level course, 59.92% viewed the course at least once (Table 4). This ratio was 58.05% for the intermediate and 56.01% for the advanced level courses. Of the learners enrolled in the introductory level courses, 9.09% viewed at least half of their course. This ratio was 11.18% and 10.94% for the intermediate and advanced course levels, respectively. In Table 4, percentage figures represent the percent of the total number of learners.
The completion rate for the advanced level courses was 2.62%, 3.60% for the intermediate, and 3.53% for the introductory level courses. The majority of the learners who completed their course also received a certificate. These findings show that there is a huge gap between enrollment and certification rates. Thus, it is quite possible to refer to those who register but never look at the course content as MOOC window shoppers.

Research Question Five
As shown in Table 5, the highest average number of forum messages was in the humanities courses (4,004.67) and the lowest was in the education and teacher training courses (0.50). The highest number of total forum messages was in computer science courses (51,696) and the lowest was in education and teacher training courses (3). However, when the percentages of forum messages were compared to the number of enrolled students, the result was highest in history (10.29%) and humanities (10.04%) courses, and lowest for education and teacher training (0%) courses. Depending on the subject area, the average length of the courses ranged from 6 to 15 weeks, and the average weekly workload ranged between 5 and 14 hours. Course effort, which was calculated by multiplying the course length and the average hours required per week, ranged from 30 hours to 210 hours. The highest total number of chapters was in the computer science courses (368) and the lowest was in the art and culture courses (10). The mean of the average completed chapters was highest in history courses (4.89) and lowest in humanities courses (2.31). Table 6 reveals that the highest number of forum messages pertained to the introductory courses (90,158) and the lowest to the advanced courses (15,096). There was a similar pattern for the total number of chapters and average chapters completed, which were higher in the introductory courses and lower in the advanced courses.
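The two derived measures used in this section, course effort and forum messages relative to enrollment, can be sketched as follows. The function names are illustrative. Pairing 6 weeks with 5 hours and 15 weeks with 14 hours is an assumption made only to reproduce the ends of the reported ranges, and the forum example uses made-up counts.

```python
def course_effort(length_weeks, hours_per_week):
    """Course effort in hours: course length multiplied by weekly workload."""
    return length_weeks * hours_per_week


def forum_message_rate(total_messages, enrolled_learners):
    """Forum messages expressed as a percentage of enrolled learners."""
    return 100.0 * total_messages / enrolled_learners


# Ends of the reported ranges (the pairing of values is an assumption).
shortest_effort = course_effort(6, 5)
longest_effort = course_effort(15, 14)

# Hypothetical course with 500 messages and 5,000 enrolled learners.
example_rate = forum_message_rate(500, 5000)
```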

Research Question Six
The variables of (a) viewing the course once, (b) viewing half the course, (c) total forum messages, (d) total number of chapters, (e) average chapters completed, (f) length of the course in weeks, and (g) mean age of users were used to predict earned certification rates in MITx courses. The descriptive statistics regarding these variables are given in Table 7.  The average number of chapters completed, total forum messages, and mean age positively predicted the course certification rates. On the other hand, the total number of chapters predicted the course certification rates negatively. When the unique contributions of the predictors were examined to determine how each one explained the variance in course certification rates, the average number of chapters completed explained 31%, the total number of chapters explained 8%, the total number of forum messages explained 3%, and mean age explained 2% of the variance.
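The text does not state how the unique contributions were computed. One common approach, assumed here, is the squared semipartial correlation: the drop in R-squared when a single predictor is removed from the full model. The sketch below implements this with ordinary least squares in plain Python on toy data. All names and values are hypothetical and unrelated to the MITx data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on the predictor columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(design[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def unique_contributions(predictors, y):
    """Drop in R^2 when each predictor is left out of the full model."""
    full = r_squared(list(predictors.values()), y)
    result = {}
    for name in predictors:
        others = [col for other, col in predictors.items() if other != name]
        result[name] = full - r_squared(others, y)
    return result


# Toy data with two uncorrelated predictors (hypothetical, not course data).
toy_predictors = {"x1": [-1.0, -1.0, 1.0, 1.0], "x2": [-1.0, 1.0, -1.0, 1.0]}
toy_y = [-4.0, -2.0, 2.0, 4.0]  # constructed as 3 * x1 + 1 * x2
contrib = unique_contributions(toy_predictors, toy_y)
```

With correlated predictors, as is typical of real course data, unique contributions computed this way generally do not add up to the model's total R-squared, because variance shared among predictors is not attributed to any single one.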

Course and Learner Characteristics
The purpose of this study was to provide a deeper understanding of the MITx MOOCs presented on the edX platform and the predictors of the course certification rates in these courses. Between 2012 and 2016, both the number of MITx courses and the number of enrolled learners increased. This is consistent with MOOCs being offered by more than 800 universities worldwide, and the number of MOOCs having exceeded 10,000 (Shah, 2018b). However, the current study found a decrease in the certification rates among the enrolled learners. These findings conflict with Jordan's (2015) comprehensive study that reported a decrease in enrollments over time and an increase in course completion percentages.
Based on the number of courses and the total registered learners, computer science as well as business and management courses were the most popular. According to Shah (2018b), technology courses (n = 17) dominated Class Central's list of the all-time, top 50 MOOCs; business courses (n = 6) were very popular as well. Moreover, the top 10 most popular courses on Coursera included a significant number of computer science and similar courses (Young, 2018), and the trend was similar in 2017, with cutting-edge tech skills being those most demanded in online education (Sinha, 2017). Thus, it can be inferred that MITx MOOCs are effective in satisfying individuals' learning needs in both computer science and business and management courses.
While Cohen, Shimony, Nachmias, and Soffer (2019) reported that only 50% of registered learners actually started their course, among the courses analyzed in the current study, the rate was slightly higher, with 50% to 62% of learners starting their courses. However, the completion rates in this study were slightly lower compared to the earlier results reported as 4.6% (Pardos, Bergner, Seaton, & Pritchard, 2013) and 5.6% (Despujol, Turró, Castañeda, & Busquets, 2017) for edX courses, and 8% to 10% for MOOCs in general (Cohen et al., 2019; Jordan, 2014). In addition, completion rates were lower for both gender groups in the advanced level courses as compared to the introductory level courses. The results of the current study support the findings in the relevant literature. In general, course completion rates were lower in the MOOCs; however, in this study, it was shown that once learners were able to complete their course, it was most likely they would also receive certificates. Accordingly, when course completion rates improved, certification rates also improved.
In general, the majority of the enrolled learners had bachelor's or master's degrees, or both. Enrolled learners with a middle or high school diploma (equivalent to secondary, high school, or junior secondary/high/middle school) accounted for 24%. Learners having completed primary or elementary school or who had no formal education accounted for 10% of those enrolled in the courses, a small minority of enrolled learners. These results parallel those in earlier studies, indicating that the majority of MOOC learners had some educational background including a higher degree (Bayeck, 2016; Christensen et al., 2013; Macleod, Haywood, Woodgate, & Alkhatnai, 2015). Furthermore, in the intermediate and advanced level courses, the percentages of learners having a bachelor's degree, master's degree, or doctorate were higher than in the introductory courses. On the other hand, in the introductory level courses, the percentages of learners having an associate degree or a secondary or high school diploma were higher than in the intermediate and advanced level courses. The results also indicated that the rates of completing at least half of the course increased with higher education levels. Based on these results, MOOCs benefit the educated population, but they have not yet satisfied initial expectations because they do not serve the potential needs of those from lower educational levels.
Regarding course levels, the (a) total number of courses, (b) number of enrolled learners, (c) course length, [...] course that provides basic, introductory information could be added to existing introductory level courses.
Thus, learners with lower education levels could be supported by MOOCs, for a variety of purposes, such as returning to school and preparing for college.

Predictors of Certification Rates
This study revealed that the average number of chapters completed, total number of forum messages, and mean age positively predicted course certification rates. On the other hand, the total number of chapters in a course negatively predicted course certification rates. Overall, the results of the current study support earlier results in the literature. When the average number of chapters completed increased, a learner was closer to completing the course. Consistent with this finding, Hone and El Said (2016) commented that the MOOC learners who passed the midpoint of a course were likely to complete it.
Similar to our results, previous research reported that viewing online forums and participating in online discussions were significant influencing factors for predicting MOOC completion rates (Bonafini, 2017; Goldwasser, Mankoff, Manturuk, Schmid, & Whitfield, 2016). Social interactions in online groups are crucial for successful learning (Barak, Watted, & Haick, 2016). Furthermore, engaging with forum comments is related to greater commitment to the course materials (Ferguson & Clow, 2015). MOOC forums can have different functions; for example, in MOOC forums, learners can socialize with their peers, while also asking questions about the course material and exams. In this way, forums take on the traditional classroom role of offering assistance during office hours or talking to and helping a classmate understand a challenging subject (Diver & Martinez, 2015). However, it is important to mention that discussions might be problematic in MOOCs due to the massive number of participants. Regarding age, in a study conducted with 33,938 learners in a MOOC offered in 2013, Greene, Oswald, and Pomerantz (2015) found that older participants were less likely to leave a MOOC. A similar finding was reported by Morris, Hotchkiss, and Swinnerton (2015), who indicated that older learners were more likely to complete a MOOC. Concerning the length of MOOCs, Jordan (2014) revealed that completion rates were negatively associated with course length. However, in this study, course length was not significant in predicting course certification rates, but the total number of chapters was. As the chapters in MITx courses are also aligned with the course sections, they are related to the volume of the course, and thus indirectly related to the course length. The associations mentioned above were observed in the courses investigated in this study.
For example, computer science, business and management, and history courses can be seen as more successful in terms of course completion and certification rates. According to the number of forum messages, learners in these courses were also more active. However, the chemistry courses could be evaluated as less successful considering their course completion rates and less interaction in terms of the number of forum messages.
Accordingly, it can be concluded that in some courses, the level of interaction and success was higher than in others. There could be different reasons for this (e.g., instructional design features of the courses, how the content is represented, usability features of the courses). Jung et al. (2019) provided evidence that instructional design factors were more often linked to successful MOOC experiences, compared to content and demographic factors, and that instructional components were significant predictors of learning in MOOCs. According to Deshpande and Chukhlomin (2017), content, accessibility, and interactivity influenced participants' motivation to learn. Furthermore, Junjie (2017) found that knowledge outcome was a strong predictor of continuance intention of MOOC learners. Learners' attitudes, motivations, and backgrounds could also be influencing factors. Hence, these factors need to be analyzed in depth to improve the interaction level and success of MOOCs.

Implications
Several recommendations for MOOCs regarding course length, course activities, learner needs, course content design, and learner motivation can be represented in the acronym SINCE: Shorter and more Interactive courses that consider learners' Needs, as well as Course design and strategies that encourage enrolled students to Enter their course. These recommendations, explained in detail below, might be useful for people who are interested in designing and developing MOOCs.
Shorter courses. Certification rates increase when the number of chapters learners complete and the number of forum messages learners post also increase. On the other hand, certification rates decrease when the number of chapters in a course increases. Therefore, courses should be divided into smaller sections while also keeping the total number of chapters in the course low, so that courses can be designed to be an easily digestible size. Similarly, as the number of learners who completed half of their course is high, perhaps the optimal length for a course is half as long.
Interactive courses. As learners enter the course for the first time, they should encounter interactive components, such as forums or activities, to boost learner motivation and retain learners in the course; MOOC learners should also be rewarded for the chapters they complete. In addition, since the number of forum messages predicted certification rates positively, forum activities should be promoted and course discussions should be carefully designed.
Needs. Learner profiles (e.g., education level, gender) as well as their needs and expectations should be analyzed carefully in relation to the offered MOOCs. Course content and design should be updated regularly, taking these needs into consideration.

Content of courses. Completion rates are lower in some course subjects, such as chemistry, while higher in others, such as business and management. Course content should be critically analyzed in terms of its appropriateness for the MOOC format. This issue should also be further researched in order to find more effective ways of developing and designing MOOCs.
Entering courses. Since most of the enrolled learners do not even enter the course, such window shoppers should be encouraged to enter the course through use of motivational strategies. Regular reminders to enter the course could be sent to learners. Moreover, as in the design of a shopping mall, participant attraction strategies could be employed and new course marketing approaches could be tested and evaluated.

Limitations and Further Research
This study has a number of limitations. First, it was limited to 122 MITx courses offered between 2012 and 2016. Second, the current study was based on the log data kept by MITx only, so the scope of this study was limited to the content of these log data.
Future research studies should consider using multiple data sources, as well as merging the log data with learners' self-reported data in order to get a more comprehensive view of course and learner characteristics, and how these influence course certification rates. Future work could also include and compare other MOOC portals with regard to course characteristics, learner characteristics, and certification rates. In addition, the recommendations this study yielded for MOOCs regarding course length, course activities, learner needs, course content design, and learner motivation could be empirically tested in future studies. Finally, research could explore ways to motivate learners to enter their course, as many learners who register for a MOOC never attempt to start the course.