MOOC Evaluation System Based on Deep Learning

Massive open online courses (MOOCs) are open-access, Web-based courses that enroll thousands of students. MOOCs deliver content through recorded video lectures, online readings, assessments, and both student–student and student–instructor interactions. Course designers have attempted to evaluate the experiences of MOOC participants but, because of large class sizes, have had difficulty tracking and analyzing the online actions and interactions of students. Within the broader context of the discourse surrounding big data, educational providers are increasingly collecting, analyzing, and utilizing student information. Additionally, big data and artificial intelligence (AI) technology have been applied to better understand students’ learning processes. Questionnaire response rates are also too low for MOOCs to be credibly evaluated. This study explored the use of deep learning techniques to assess MOOC student experiences. We analyzed students’ learning behavior and constructed a deep learning model that predicted student course satisfaction scores. The results indicated that this approach yielded reliable predictions. In conclusion, our system can accurately predict student satisfaction even when questionnaire response rates are low. Accordingly, teachers could use this system to better understand student satisfaction both during and after the course.


Introduction
Massive open online courses (MOOCs) are open-access educational resources that offer various academic courses to the general public through the Internet (Kop, 2011). Since 2012, MOOCs have included high-quality video lectures from universities worldwide. The self-directed learning environment provided by MOOCs signifies a modern approach to education. Users of MOOCs can learn not only from instructional videos created by professors but also through other methods suited to their individual learning styles, including live-streaming video lectures, efficient assessments, and discussion forums (McAuley et al., 2010).
A considerable amount of learning data can be collected and analyzed from the increasingly large number of MOOC users. Many studies have been conducted based on MOOC data; for instance, Kop et al. (2011) described the use of Facebook groups by MOOC participants and obtained data from learner experience surveys, participant demographics, and learner progression through courses. Adamopoulos (2013) analyzed a dataset of MOOC user-generated content to identify factors that predicted self-reported course progress.
Within the broader context of the discourse surrounding big data, educational providers are increasingly collecting, analyzing, and using student information (Papamitsiou & Economides, 2014; Su & Lai, 2021; Su & Wu, 2021). Data have been collected to personalize learning experiences and allocate resources to individual students (Gašević et al., 2015; Leitner et al., 2017).
Additionally, big data and artificial intelligence (AI) technology have been applied to better understand learning. Researchers initially focused on creating personalized teaching systems for lone learners, but recent studies have emphasized the interactions between students and the learning material (Kay, 2012).
Cognitive science can be used to help lecturers understand the nature of learning and teaching. Thus, the findings can be used to build better systems to help learners gain new skills or understand new concepts. AI has now begun to affect the student experience through analyses of learning data (du Boulay, 2016).
Learner satisfaction refers to student perceptions of both the learning experience and the value of the education received (Baxter Magolda, 1993). According to Donohue and Wong (1997), satisfaction can affect student motivation. It is a significant intermediate outcome (Donohue & Wong, 1997) and a predictor of retention (Baxter Magolda, 1993). Bean and Bradley (1986) found that for college students, satisfaction had a greater impact on their performance than performance had on their satisfaction. However, Klobas et al. (2014) stated that researchers know very little about learner motivations, experiences, and satisfaction. Veletsianos (2013) also noted that discussions about new educational innovations, such as MOOCs, lack input from learners. Accordingly, it is reasonable to conclude that student satisfaction, as determined by student feedback, is a critical factor influencing academic success.
Some studies, such as Liu et al. (2014) and Onah et al. (2014), have characterized MOOC student perspectives by investigating what they learned, the aspects of MOOCs they found most useful, and their motivations for enrolling in MOOCs. However, these studies have been limited to surveying enrollees in journalism MOOCs or analyzing blog posts written by MOOC students related to their MOOC experiences.
Researchers have tried to understand the high dropout rate of MOOCs (Magold, 1993). Onah et al. (2014) postulated several reasons for the high dropout rate, such as low motivation to complete the courses, lack of time, lack of digital and learning skills, the level of the course, and lack of support. Information collected by researchers and e-learning providers has come primarily in the form of big data or learning analytics gathered from observations of online student interactions with the instructors, the content, and their classmates. However, this approach has proved insufficient for gaining a comprehensive understanding of learner experiences in open online learning.
Studies investigating MOOCs from the perspective of an individual learner have collected data from learner experience surveys and on (a) participant demographics; (b) learner progression throughout various courses (in terms of, for example, the number of videos viewed or tests taken; Kop et al., 2011); (c) class size and completion rate (Adamopoulos, 2013); or (d) students' behaviors, motivations, and communication patterns (Swinnerton et al., 2016). These metrics mirrored attendance and completion data and have enabled researchers to assess this form of education.
Advancements in technology have enabled the application of data-mining techniques and AI to the analysis of MOOCs. Some studies of MOOC performance have analyzed the language used in discussion forums to make predictions. Other researchers have used natural language processing (McNamara et al., 2015; Wen et al., 2014). More recently, these techniques have been used to identify student sentiment among MOOC enrollees (Moreno-Marcos et al., 2018; Pérez et al., 2019).
Due to its numerous advantages, AI has been increasingly applied in education. First, AI techniques have improved lecturers' understanding of learning and teaching, and facilitated the design of new systems that help learners gain new skills or grasp new concepts (du Boulay, 2016). Therefore, the application of AI to large MOOC datasets has drawn substantial attention. Second, Fauvel et al. (2018) proposed that AI tools could be used to better understand MOOC participant sentiment, and that MOOC instructors could use these data to deliver better courses and develop more useful educational tools. AI could also be used to analyze student learning effectiveness by using records of learning behaviors. Some AI tools have been applied to make online learning more similar to its offline counterpart in order to help students better achieve their learning goals. Because of the variety in student learning adaptability, habits, and behavior, personalized service in MOOCs has been seen as especially important (Tekin et al., 2015).
Although interest in artificial intelligence in educational research has been increasing, less than five percent of such studies have addressed deep learning in education. However, given the rapid advance of deep learning, its application in education is seen as having dramatic potential (Chen et al., 2020). Therefore, in terms of future research, the system examined in this study, since it is based on deep learning, could be a useful example of developing such a system for predicting student performance.
One of the challenges of lecturing in a MOOC is accurately understanding the learner experience. It has proved impossible to keep track of all posts and interactions of the numerous enrollees. The analysis of individual learner experience is critical for course evaluation. According to Donath (1996), learner comments and actions indicated their sentiments and concerns toward a course. Without the appropriate analytical tools, it has been difficult to understand differences in learner sentiment and experiences across different learner groups in a large class.
This study proposed a method for evaluating students' satisfaction by using machine learning. In this method, the learning behavior of participants within the course was used as input for the model, and compared with the results of a survey of MOOC students. The method focused specifically on certain MOOC features students considered important. Thus, educators can use the findings of this research in order to modify their MOOCs to increase student satisfaction and enhance the student learning experience.
Training data for the model came from MOOCs at National Tsing Hua University (NTHU). Logs of learning activities, such as video-watching behavior and exercise completion, were collected and transformed to measures of learning behavior in the model. The proposed model used a deep neural network (DNN) with regression. The result predicted by the DNN was compared with survey responses to evaluate the accuracy of the model. These findings helped us evaluate MOOC learner satisfaction, and aided the design and execution of MOOC lectures.

Student Feedback
Student feedback on courses is a significant indicator in both face-to-face and online courses.
Due to the availability of educational big data, Gameel (2017) analyzed data collected from 1,786 learners enrolled in four MOOCs. Learners perceived that the following aspects influenced learning satisfaction: learner-content interaction, as well as the usefulness, teaching aspects, and learning aspects of the MOOC. From learners' perspective, those aspects offer valuable insights into understanding the quality and satisfaction of the MOOCs.
To date, MOOCs have not provided participants (i.e., educators or learners) with any form of timely analysis on forum content. Consequently, educators have been unable to reply to questions or comments from hundreds of students in a timely manner.
Because feedback has been too general, incomplete, or even incorrect, automation may be a solution to this problem. Automatic techniques include (a) functional testing, where feedback is usually insufficient as a guide for novices; (b) software verification for finding bugs in code, which may confuse novices because these tools often ignore true errors or report false errors; and (c) comparisons using reference solutions, in which many reference solutions or pre-existing correct submissions are usually required.
One study used a semantic-aware technique to provide personalized feedback that aimed to mimic an instructor looking for code snippets in student submissions for a coding MOOC (Marin et al., 2017).
Moreover, some researchers have taken advantage of machine learning to analyze the feedback from MOOCs (Hew et al., 2020). Several deep learning models have been used to predict student performance, such as dropout prediction (Xing & Du, 2019) or grade prediction (Yang et al., 2017). To improve accuracy, precise big data analysis is also a critical direction for MOOC research. Some researchers have sought to analyze video-watching data precisely.
Higher education institutions and experts have had a strong interest in extracting useful features pertaining to the course and to learner sentiment from such feedback (Dohaiha et al., 2018; Kastrati et al., 2020). It is thus imperative to develop a reliable automated method to extract these sentiments when dealing with large MOOCs (Sindhu et al., 2019). For instance, Lundqvist et al. (2020) evaluated student feedback within a large MOOC. Their dataset contained 25,000 reviews from MOOC users. The participants were divided into three groups (i.e., beginner, experienced, and unknown) based on their level of experience with the topic. The researchers used the Valence Aware Dictionary and sEntiment Reasoner (VADER) as an algorithm for sentiment analysis.

Course Evaluation
Several studies were instructive sources for the design of the questionnaire used in this research. Durksen et al. (2016) used cutting-edge methods to analyze students' satisfaction in a learning environment. They examined educational and psychological aspects of traditional and MOOC learning settings to compare outcomes (e.g., students' characteristics, course design). This psychological perspective postulated that the basic needs for autonomy, competence, relatedness, and belonging characterized learner experiences in MOOCs (Durksen et al., 2016).
Other studies have focused on workload and precisely quantified students' workload. In one study, the workload of medical students was quantified using a specifically developed and self-completed questionnaire (Gonçalves, 2014). Additionally, Çakmak (2011) designed a method to quantify instructor style, including factors such as making clear statements, using one's time effectively, and using technology. Çakmak referred to student positivity towards instructor style as style approval. Marciniak (2018) also described effective methods for assessing course quality, which encompassed dimensions evaluating all aspects of the program.

Research Design
Below, we first describe the data collection process in terms of the course information and the learning behavior data used in this study. Example schemas of the video and exercise data from the platform are also shown to indicate the data structure. We then report the design and content of the student questionnaire, along with the response rate for each course. Finally, we illustrate how data were extracted from the learning activity logs to formulate the predictive model, together with the performance evaluation measure.

Course Information
To avoid bias, different types of MOOC courses offered by NTHU in February 2020 were selected. Students in these MOOCs were expected to spend three hours each week watching online videos and completing practice exercises. They were also expected to discuss the course content with their peers.
For Introduction to IoT, students were also required to conduct experiments in some offline laboratory sessions.

Collection of Learning Behavior Data
Videos are the primary teaching method for most MOOCs. In this study, we collected data on video playback actions, such as playing, pausing, seeking, and adjusting the playback speed (Table 1), as well as data on each user's answers for each exercise (Table 2). If a student entered the exercise page but did not answer the exercise questions, we coded the student's response to the exercise as No. The feature timeCost (in seconds) was defined as how long the student took to answer each question. For example, if a student spent 10 seconds answering a question, the timeCost value for that student for that question was 10. The 308,517,712 learning behavior records were transformed into meaningful features as input to the DNN model. We sorted all course data into the categories of training data, validation data, and testing data according to the ratio of 0.64, 0.16, and 0.2.
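The 0.64/0.16/0.2 partition described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `split_dataset`, the random shuffle, and the fixed seed are all assumptions, since the source does not state how records were assigned to each subset.

```python
import random


def split_dataset(records, ratios=(0.64, 0.16, 0.2), seed=0):
    """Shuffle records and split them into train/validation/test lists.

    The shuffle and seed are assumptions; the source gives only the ratios.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical stand-in for 100 per-student feature rows.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 64 16 20
```

These ratios are consistent with holding out 20% of the data for testing and then reserving 20% of the remainder for validation (0.8 × 0.8 = 0.64 and 0.8 × 0.2 = 0.16), although the source does not say the split was produced this way.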

Survey Questionnaire
Drawing on the literature, the student sentiment survey focused on the following five categories: (a) workload, (b) need fulfillment, (c) intelligibility, (d) style approval, and (e) student engagement. The questionnaire had 22 items in total (Table 3). This research used a five-point Likert scale to evaluate the answers provided by students. A rating of 1 indicated their worst experience, whereas a rating of 5 indicated their best experience.
Of the 6,016 students enrolled in the aforementioned courses, 993 filled out the questionnaire, and 764 of the 993 responses were valid. The questionnaire response rates for each course are reported in Table 4; the Cronbach's alpha was 0.842. The response rates for the various courses ranged from 5% to 15%, a result strongly correlated with the number of students completing their MOOCs (Jordan, 2015).
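For readers unfamiliar with the reliability statistic reported above, Cronbach's alpha for k items is k/(k − 1) × (1 − Σ item variances / variance of respondents' total scores). The sketch below uses the standard formula with sample variances; the function name and the response matrix are hypothetical and are not the study's data. Negatively worded items would ordinarily be reverse-scored before this step, which the source does not describe.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of k item scores per respondent."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical five-point Likert answers: 5 respondents, 3 items.
example = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items vary together; the function raises a `ZeroDivisionError` if every respondent has the same total score.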
Introduction to IoT had the highest response rate (45%) because learners were required to attend in-person experiment sessions.

This course is related to my major.

Intelligibility (Ochando, 2017)
The teacher's style helps me easily understand the content.
The teacher is able to explain the key points and clarify confusing points.
The teacher's method is too disorganized for me to keep up.
The teacher is unclear, and I have difficulty understanding.
The teacher's methods make me feel that this course is an efficient way to learn.

Style approval (Çakmak, 2011)
The teacher's style makes me eager to learn.
The way the teacher speaks makes me feel a little hesitant.
The teacher's tone does not make me feel irritated.
The teacher's rhythm puts me at ease.
The teacher's methods make me feel pressured.

Student engagement (Marciniak, 2018)
I watched the course videos at least once before the end of the course.
I review the exercises by myself offline.
I will find related videos about unfamiliar concepts.
I will rewatch videos to review unfamiliar concepts.

Table 5
Video and Exercise Features

Prediction of Questionnaire Score Based on Learning Behaviors
Every student has a unique learning mode and unique learning behavior, and we hypothesized that these would affect their satisfaction. To verify this hypothesis, we inputted the student learning behavior variables into a five-layer DNN model (see Figure 1 for illustration), which used a rectified linear unit activation function to predict the satisfaction score. When creating predictions of student satisfaction for MOOCs, it is crucial to avoid inaccuracies caused by sparse data (Yang et al., 2018). To avoid this problem, input for our system included only the learning data of students who completed the questionnaire.
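To make the architecture concrete, the following is a minimal forward-pass sketch of a five-layer fully connected regression network with rectified linear unit activations. It is an illustration under stated assumptions rather than the authors' model: the layer widths, the six input features, the He-style random initialization, and the linear output unit are all hypothetical, and "five-layer" is interpreted here as five weight layers. Training is omitted.

```python
import random


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random weights with He-style scaling (an assumption) and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(layer_sizes, seed=0):
    rng = random.Random(seed)
    return [init_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]


def predict(network, features):
    """ReLU on hidden layers; linear output for a regression score."""
    hidden = list(features)
    for weights, biases in network[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = network[-1]
    return dense(hidden, weights, biases)[0]


# Hypothetical sizes: 6 behavior features, four hidden layers, 1 output.
network = build_network([6, 32, 16, 8, 4, 1])
score = predict(network, [0.8, 0.1, 0.3, 0.5, 0.9, 0.2])
```

With untrained weights the output is arbitrary; in practice the weights would be fitted to the questionnaire scores, for example by minimizing the absolute or squared prediction error on the training split.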

Performance Evaluation for the System
The mean absolute error (MAE) was used to evaluate the performance of the model. In brief, we used holdout cross validation to obtain the test data, and the data were then used to calculate the MAE as follows:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \lvert f_i - y_i \rvert$$

where $f_i$ and $y_i$ are the predicted and actual scores for student $i$, respectively, and $N$ is the number of students. The MAE is the average absolute difference between the predicted and actual scores, with a lower MAE indicating superior predictive performance.
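As a worked illustration of this measure, the sketch below averages the absolute differences between predicted and actual scores over N students. The function name and the four score pairs are hypothetical examples, not results from the study.

```python
def mean_absolute_error(predicted, actual):
    """Average of |f_i - y_i| over all students."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - y) for f, y in zip(predicted, actual)) / len(predicted)


# Hypothetical predictions against five-point Likert answers.
predicted = [3.6, 4.2, 2.9, 4.8]
actual = [4, 4, 3, 5]
mae = mean_absolute_error(predicted, actual)  # (0.4 + 0.2 + 0.1 + 0.2) / 4, about 0.225
```

On a five-point scale, an MAE of this size means the predictions are off by roughly a quarter of a rating point on average.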

Results and Discussion
The effectiveness of our prediction model was evaluated in terms of the MAE by using the data from the Table 4 courses. Table 6 shows the MAE output for the answer to each question from the questionnaire.
Our model performed best when computing the answers to questions related to course health, and worst when computing the answers to style approval questions. The results indicated that learning behavior is affected to some degree by student satisfaction. The MAEs shown in Table 6 ranged from 0.41 to 0.55.
This may result from our use of a five-point rating system. The predicted data successfully captured the trend of real data, as shown in Figure 2.

Example of Satisfaction Score Prediction Model Based on Learning Behaviors
Note. This figure is made based on Question 5-1.
Once students' answers to the questionnaire survey were collected, the five categories of results were computed by the overall answer score for each part of the questionnaire. Subsequently, we collected data on the learning behavior of participating MOOC students. Thereafter, these data were analyzed and used to predict the student satisfaction scores.
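The aggregation of item answers into category scores can be sketched as follows. This assumes that question identifiers take the form "category-item" (as in Question 5-1) and that the overall score for a category is the mean of its item scores; the source specifies neither, so both the identifier format and the use of the mean are assumptions, and the function name is hypothetical.

```python
from collections import defaultdict


def category_scores(answers):
    """Average Likert answers per category.

    answers maps IDs such as '5-1' (category 5, item 1) to scores from 1 to 5.
    """
    grouped = defaultdict(list)
    for question_id, score in answers.items():
        category = question_id.split("-")[0]
        grouped[category].append(score)
    return {category: sum(scores) / len(scores)
            for category, scores in grouped.items()}


# Hypothetical answers from one respondent.
scores = category_scores({"1-1": 4, "1-2": 2, "5-1": 5})
print(scores)  # {'1': 3.0, '5': 5.0}
```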
The results demonstrated that this system enabled teachers to understand multiple aspects of learner satisfaction before the end of the course. Additionally, because course evaluation surveys have high nonresponse rates (Table 4), this system was useful as an alternative method of providing lecturers with feedback predictions for students who do not fill out questionnaires. On the basis of the predicted feedback, teachers can adjust the content, workload, and teacher-student or student-student interactions during the course. Compared with the conventional approach, which is disadvantaged by insufficient learner responses and where feedback is given only after the course, our method was more flexible and accurate.
Before the end of the course, the instructor can also use different approaches to track student performance and thus help students by adjusting the course schedule, offering more office hours, or allocating more time to covering more difficult topics. In addition, this system may provide students a chance to reflect on their own performance based on the predictions.
In the future, this system could be combined with a learning log feature. Teachers would then use the student's learning history to better understand their status, and so develop more sophisticated and efficient interactive teaching methods, improve course quality, and increase student satisfaction.

Limitations
The data used as input were collected from the courses in Table 4. Differences between these courses may affect the accuracy of our model. Future research might divide courses into categories to investigate subject matter-related effects. For example, the difficulty of a course may influence student concentration. Researchers can also use different methods to analyze the survey responses.

Conclusion
Education is foundational to a well-functioning society. Due to recent technological advancements, techniques from big data are now available for increasing the quality of courses. To properly use big data, researchers have adopted AI to investigate topics related to education. Through data analysis, processing, and prediction, AI can support lecturers in solving problems and making decisions. In combination with MOOCs, AI can help teachers create a better learning environment and enable students to achieve their learning goals, which is the common aim of all mainstream MOOC platforms.
In this study, we proposed a method to solve the problem of low MOOC student survey response rates, which prevents teachers from evaluating learner satisfaction in their courses. We established a system that predicted student course satisfaction based on their learning behavior. Our system was tested with student data from NTHU's MOOC platform. These data pertained to student behavior when watching videos and answering exercise questions. Subsequently, a deep learning model was used to process the data and produce a predicted level of course satisfaction for a given student. If the output is made viewable by students, this system may also give them a chance to reflect on their course performance based on the system's predictions.
Lastly, this system can benefit both lecturers and learners. Teachers can track student course satisfaction and learners can give instant feedback on course modifications. If a lecturer receives prompt feedback that guides course modifications, that lecturer can better react to student input. Therefore, our system is an innovative method for improving interaction between teachers and students.