Investigating MOOCs Through Blog Mining

The MOOC (massive open online course) is a disruptive innovation and a current buzzword in higher education. However, the discussion of MOOCs is disparate, fragmented, and distributed among different outlets. Systematic, extensively published research on MOOCs is unavailable. This paper adopts a novel method called blog mining to analyze MOOCs. The findings indicate that, while MOOCs have benefitted learners, providers, and faculty who develop and teach MOOCs, challenges still exist, such as questionable course quality, high dropout rates, unavailable course credits, ineffective assessments, complex copyright, and limited hardware. Future research should explore the position of MOOCs and how they can be sustained.

As one of the two fastest-emerging developments in educational technology, alongside tablet computing (New Media Horizon, 2013), MOOCs is the buzzword of 2012 in higher education (Daniel, 2012). The fast development of MOOCs has attracted many reports and debates among educators. So far, a large volume of press articles and blogs has covered MOOCs. However, discussions of MOOCs are disparate, fragmented, and distributed among different outlets. Systematic and extensive published research on MOOCs is still unavailable (Daniel, 2012; Clow, 2013).
Since blog posts are the main sources of discussion about MOOCs at this stage, this paper adopts a novel research method, called blog mining, to analyze what themes and trends about MOOCs can be found. The goal of this research is to synthesize related discussions in blogs, to provide an in-depth review of MOOCs, and to identify the challenges and future trends of MOOCs. This paper hopes to aid MOOC providers and higher education institutions that might be interested in joining MOOCs to understand what is going on in this fast-moving field. It will offer necessary insights and tips so stakeholders can become knowledgeable about what drives the rapid expansion of MOOCs and the issues they are facing.

Background
In an age of global competition, information glut and rapid technological change require learners to learn how to retrieve, organize, and evaluate information, how to construct knowledge, and how to work in teams (Mioduser, Nachmias, & Forkush-Baruch, 2008; Schrire & Levy, 2012). Due to advances in information and communication technologies (ICTs), the quality of online delivery platforms has improved in recent years. Online activities closely related to social media, such as discussions, blogs, and video lectures, can be easily embedded in online learning (Skiba, 2012). As an extension of existing online learning approaches (Yuan, Powell, & Cetis, 2013), MOOCs is a model for delivering the learning content of a course online to anyone who wants to take it (Educause, 2013). By taking advantage of various web-based technologies, including video presentations, computer-based assessments, and online communication forums, MOOCs allows a large number of learners to access course content, formative and summative assessments, and support from their fellow learners (Balfour, 2013). It is "a dynamic learning model that offers collaborative and social engagement opportunities for learners to construct knowledge" (Skiba, 2012, p. 416; Rodriguez, 2011).

In 2008, Siemens and Downes offered the first MOOC, "Connectivism and Connective Knowledge" (Yuan, Powell, & Cetis, 2013). This is a type of asynchronous online learning, which can accommodate a large number of learners and offers flexibility for learners at different levels. What makes it unique is that it is free and open to anyone who has Internet access. The creators believed a free course could bring the best education in the world to the most remote corners of the planet, help people in their careers, and expand intellectual and personal networks (Pappano, 2012). This belief seems to be supported by a MOOC called "artificial-intelligence", launched by a Stanford professor, Sebastian Thrun, in 2011, which attracted 160,000 learners in 190 countries (Lewin, 2012). Since MOOCs has been booming in recent years, it plays an increasingly important role in higher education around the world (Meyer & Zhu, 2013).
MOOCs represents an emerging methodology of online teaching and an important development in open education. Its structure was inspired by the philosophy of connectivism, and its implementation requires conceptual changes in perspective from both facilitators (tutors) and learners (Rodriguez, 2013). It is "based on the explicit principles of connectivism (autonomy, diversity, openness, and interactivity) and on the activities of aggregation, remixing, repurposing and feeding forward the resources and learning" (Rodriguez, 2013, p. 1). MOOCs has two distinct branches: (1) connectivist MOOCs (cMOOCs) and (2) more formal MOOCs (xMOOCs) (Hill, 2012). The pedagogies behind these two branches are different. cMOOCs is built on connectivism (Kop & Hill, 2008; Siemens & Downes, 2008), which is a sophisticated and innovative reconceptualization of what it means to know and to learn. In contrast, xMOOCs is based on behaviorist pedagogy that relies on information transmission (Bates, 2012).

MOOCs is the current buzzword in higher education. Because it is a disruptive innovation (Skiba, 2012), it has initiated many discussions about higher education. Although its future is not yet clear, a number of MOOC platforms have been developed and offer courses independently of or in collaboration with universities (Yuan, Powell, & Cetis, 2013). In 2012, some elite universities lined up to join forces with MOOC providers (Lewin, 2012). For example, Coursera began with Princeton, the University of Pennsylvania, Stanford, and the University of Michigan. The University of California, Berkeley, and the University of Texas joined edX (Lewin, 2012). Despite the fast development of MOOCs, limited research or evidence is available to support either the positive or the negative opinions about them (Skiba, 2012).

Method
MOOCs is an innovative way of teaching and learning (Meyer & Zhu, 2013). Blog mining analysis can improve the timeliness and relevance of this study (Chau & Xu, 2012). Figure 1 shows the steps of a blog mining process. Google Blog Search (http://www.google.com/blogsearch) is specially designed to retrieve content from blogs that are freely and publicly available on the Internet. In this study, a query search was conducted first by applying the advanced search option of Google Blog Search with the keyword "MOOC". To identify the latest blog content discussing MOOCs, the query time period was set from January 1, 2010 to June 31, [...]

[Figure 1. Steps of the blog mining process, including text preprocessing.]

The blog posts were manually copied and saved as a text file for further analysis. Data preprocessing was conducted next by manually reviewing all the blog posts. This process identified five irrelevant or redundant blog posts, which were removed. The remaining posts were used as the final sample data set, which provided a glimpse into the ongoing concerns and discussions associated with MOOCs.

In the next step, a concept analysis and mapping (CAAM) technique was adopted to analyze the content of the remaining blog posts, because CAAM has been proven an effective research technique for studying written textual statements (Jackson & Trochim, 2002). In particular, CAAM software called Leximancer (www.leximancer.com) was used to load the blog content, to extract and classify the key concepts and themes, and to identify the patterns and relationships between concepts and themes.

Leximancer is text mining software that can be used to analyze the content of collections of textual documents and to visually display the extracted information in a browser.
The information is displayed by means of a conceptual map that provides an overview of the material, representing the main concepts contained within the text and how they are related. (Leximancer, 2010, p. 1)

Leximancer is based on Bayesian theory, which argues that fragmented information can be used to predict what occurs in a system (Watson, Smith, & Watter, 2005). Leximancer looked for words that appeared most frequently in the loaded data and then generated a list of concepts. These concept terms were further clustered into themes, based on their relationship to each other. Next, clusters of concepts were grouped by themes named after the most prominent concept in that group. The themes were displayed as large circles on a concept map, which represented the strength of association between concepts and provided a conceptual overview of the semantic structure of the data (Cretchley, Rooney, & Gallois, 2010; Martin & Rice, 2011). Concept terms were displayed as dots in the large circles. The large theme circles were heat-mapped to indicate their importance. For example, the most important theme appeared in red, the second hottest in orange, and so on, according to the color wheel (Leximancer Manual, 2011).
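
The general idea of the preprocessing and concept extraction steps (remove redundant posts, find the most frequent terms, then measure how often those terms occur together) can be illustrated with a minimal Python sketch. This is not Leximancer's algorithm, which is considerably more sophisticated and is not specified in this paper. The function names, the short stop-word list, and the choice of the sentence as the co-occurrence window are all assumptions made purely for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately short stop-word list (an assumption for this sketch).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are", "for",
    "on", "that", "with", "it", "as", "by", "can", "from", "this", "be",
}


def tokenize(text):
    """Lowercase the text and return alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def deduplicate(posts):
    """Drop posts that repeat an earlier one (ignoring case and spacing)."""
    seen, unique = set(), []
    for post in posts:
        key = " ".join(post.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


def top_concepts(posts, n=5):
    """Return the n most frequent terms across all posts as candidate concepts."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return [term for term, _ in counts.most_common(n)]


def cooccurrence(posts, concepts):
    """Count how often each pair of concepts appears in the same sentence."""
    concept_set = set(concepts)
    pairs = Counter()
    for post in posts:
        for sentence in re.split(r"[.!?]+", post):
            present = sorted(concept_set & set(tokenize(sentence)))
            pairs.update(combinations(present, 2))
    return pairs


if __name__ == "__main__":
    # Invented sample posts, not data from this study.
    sample = [
        "MOOCs give students free courses. Students enroll in courses online.",
        "Universities offer courses. Students want credit for courses.",
    ]
    posts = deduplicate(sample)
    concepts = top_concepts(posts, n=2)
    print(concepts)
    print(cooccurrence(posts, concepts))
```

In this simplified picture, pairs with high co-occurrence counts would sit close together on a concept map, and groups of closely linked concepts would form a theme.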

Results
Leximancer produced several types of concept maps that indicated the concepts extracted from the sample data set and their interrelationships. An example of the concept maps generated by Leximancer from the sample data is shown in Figure 2. Leximancer generated a report that listed the themes and concept terms using its text analytics algorithms. Several closely linked concepts form a cluster and are displayed as dots inside circles. The closer the distance between concepts, the more strongly they are semantically linked. Themes (clusters of concepts) are represented as circles. Their [...]

To better explain Figure 2, Table 2 lists each theme and the details of its concept terms. Themes correspond to the circles in Figure 2, while concept terms correspond to the dots.

Discussion
Compared with traditional classroom-based learning, MOOCs is an innovative way of teaching and learning (Meyer & Zhu, 2013). This blog mining shows that a number of elite higher education institutions around the world have provided MOOCs. Although the trend is unclear, MOOCs has had a major impact on higher education. A detailed discussion is presented next.

Benefits for Learners
Table 2 shows that students and people are both themes in the blog mining results. As a disruptive innovation, MOOCs provides learners with many benefits. MOOCs is open to any person who has access to the Internet. It provides free online courses and makes higher education accessible to a global audience (Meyer & Zhu, 2013). Learners around the world can enroll in MOOCs without any cost. They can even take courses from top universities, as more elite higher education institutions provide MOOCs (Lewin, 2012). They do not need to go to campus or pay expensive tuition to take courses from top-ranking universities. This is a great benefit for learners in developing countries, where high-quality higher education resources are limited. Even in developed countries, MOOCs allows middle-class families to offset high college tuition rates (Thrift, 2013).
MOOCs is a great mechanism for lifelong learning (Skiba, 2012), and users range from teenagers to retirees (Pappano, 2012). According to Belanger and Thornton (2013), learners take MOOCs for the purpose of gaining an understanding of the subject matter, increasing social experience and intellectual stimulation, taking advantage of the convenience, overcoming barriers to traditional education options, and exploring online education. MOOCs is the right learning mode for people who want extra learning while making the most of their time. This allows self-motivated learners to craft their own educational path by accessing the knowledge, lectures, quizzes, homework, exams, and personalities of the best professors at the top universities in the world (Raza, 2013).
Even in-class students can benefit from the online materials in MOOCs. In some MOOCs, in-class students and MOOC students take classes together. Some professors rearrange their courses to allow their students to complete the online lessons first and come to class later for interactive projects (Lewin, 2013). Such an arrangement allows in-class students and MOOC students to interact with each other. This interaction is very helpful for improving learning outcomes.

Benefits for Providers
MOOCs makes it possible for everyone to access higher education, so it has generated significant interest from policy-makers, higher education institutions, and commercial organizations (Yuan, Powell, & Cetis, 2013). Carey (2013) argues that MOOCs helps higher education policy-makers to address budget constraint problems and to lower the cost of degree courses by experimenting with inexpensive, low-risk, higher education forms. Institutions have been involved in engaging and experimenting with MOOCs to expand access to higher education, achieve marketing and branding, and develop potential new revenue streams (Yuan, Powell, & Cetis, 2013). Commercial organizations provide a platform based on MOOCs and develop partnerships with institutions to enter the higher education market and to explore new delivery modes in higher education (Yuan, Powell, & Cetis, 2013). Other than the above stakeholders, faculty who teach MOOCs should not be neglected.
MOOCs may be prompting some faculty to pay more attention to their teaching styles. It gives faculty an opportunity to learn from dedicated and successful teachers and to re-examine their own pedagogical practices so that they can maintain or improve high-quality interactions between themselves and students, in both face-to-face and online courses. As Bali (2013) [...]

MOOCs is confronted with a series of challenges regarding these three characteristics, such as questionable course quality, high dropout rates, unavailable course credits, limited learning assessment methods, complex copyright, and limited hardware.

Questionable course quality.
As mentioned above, the elite universities are rushing into MOOCs for the purpose of expanding access to higher education, marketing and branding, and developing new revenue streams. Are the MOOC courses they provide of good quality? Maybe some are not. As Daniel (2012) argues, even though the elite universities actively involved in MOOCs gained their reputations in research, they may or may not be talented in teaching, especially teaching online. In other words, research is different from teaching.
That these elite universities make great achievements in research does not mean that they are capable of offering high quality online learning courses.
Another concern comes from the resources used to support the quality of MOOCs. High quality MOOC courses need huge investments. However, according to Yuan, Powell, and Cetis (2013), it is unclear how MOOCs will make money now and in the near future.

Moreover, the huge number of learners in MOOCs creates serious problems for the interaction between instructors and learners. Social media is widely used by MOOCs for learner discussions. Since the number of learners in a single MOOC course is large, it is very difficult, perhaps impossible, for the instructor to monitor all course discussions, interact with each learner, and provide feedback (Pappano, 2012; Clow, 2013). The lack of interaction between MOOC instructors and learners is likely to damage the course quality. In addition, the diversity of learners in a MOOC means that they lack a common knowledge base and educational background (Pappano, 2012). As such, when learners post discussions about the course content or other related topics, these discussions might not be very fruitful. Because fruitful discussions are important components of the learning process, learners will not benefit much from such discussions. As a result, the course quality suffers from the lack of a common knowledge base and educational background among MOOC learners.

High dropout rate.
MOOCs has substantially higher dropout rates than traditional education (Clow, 2013). Only about 10% of the learners who enroll in the largest MOOCs actually complete the course (Daniel, 2012; Sandeen, 2013). Scholars have tried to determine the reasons. For example, Clow (2013) adopts the funnel of participation to explain the high dropout rates in MOOCs. He borrows the idea of the "purchase funnel" from the field of marketing and sales, and separates learners' experiences in MOOCs into four steps: (1) awareness, (2) registration, (3) activity, and (4) [...]

Unavailable course credits.
Few colleges or universities offer full course credit to students who complete a MOOC (Meyer & Zhu, 2013). Many professors teaching MOOCs think students do not deserve course credit for completing a MOOC (Kolowich, 2013). The concerns for course credit are mainly about course quality and the assessment of learning (Meyer & Zhu, 2013).
According to Lederman (2013), only five of Coursera's courses are approved for course credit by the American Council on Education. However, the acceptance of MOOCs for credit hours is growing. Currently, some MOOC providers charge fees for certificates and some have begun to offer credits. For example, the University of Washington offers students college credits for some of its courses, if they take them through Coursera, pay a fee, and complete the additional assessments (Long, 2012). The Colorado State University's Global Campus gives three credits for students who complete a free course offered by Udacity and pass a proctored test (Lewin, 2012;Skiba, 2012). Companies that offer MOOC platforms, such as Coursera, EdX, and Udacity, are growing (Skiba, 2012).
However, Porter (2013) argues that MOOCs is more like "learning tutorials" or "online interactive workshops" than "college courses." Does a MOOC have to be connected with credits? The answer remains unclear. Yuan, Powell, and Cetis (2013) argue that since most learners using MOOCs are people who already have a degree, it is not important whether a MOOC carries credit. This argument raises the debate about MOOCs and degrees. Daniel (2012) indicates that whether or not a student can obtain a degree is determined not by their mastery of the course, but by the admissions process to the university. He therefore argues that the completion of a MOOC should not be connected with credits that count towards a degree qualification.

Ineffective assessments.
Conducting effective assessments in a MOOC remains a big challenge. On one hand, as a type of asynchronous online learning, MOOCs inherits the security risks of the Internet. On the other hand, the number of available effective assessment methods is limited. The development of technology makes diverse cheating methods available for online assessments. According to a study completed by King, Guyette, and Piotrowski (2009), 73.6% of the students think it is easier to cheat in an online environment than in a conventional one. Methods of cheating in online assessments include using online communication and telecommunications, Internet surfing (Rogers, 2006), copying and pasting from online sources (Underwood & Szabo, 2003), obtaining answer keys in an illegitimate way, taking the same assessment several times, and getting unauthorized assistance (Rowe, 2004). Other means of cheating on online tests include having someone other than the actual student take the test and copying answers from elsewhere (Sasikumar, 2013). Therefore, MOOCs needs effective assessment methods that can perform user validation and prevent plagiarism (Cooper & Sahami, 2013). For now, how to ensure that the right person is taking a test with the correct materials remains a challenge. To mitigate this risk, some MOOC providers offer proctored exams. Most of them are making plans to charge fees for this service (Lewin, 2013). For example, to validate students who are taking proctored exams, Coursera, edX, and Udacity have tried to set up partnerships with Pearson so MOOC learners can take in-person examinations in Pearson testing centers (Parry, 2012; Udacity, 2012; Yuan, Powell, & Cetis, 2013). Other than proctored exams, biometric authentication seems to be a solution for validating learners (Wang, Ge, Zhang, Chen, Xin, & Li, 2013).
Because MOOCs relies heavily on computers, assessment methods that can be easily implemented by computers are used widely in MOOCs, including multiple choice questions, formulaic problems with correct answers, logical proofs, computer code, and vocabulary activities (Cooper & Sahami, 2013). However, none of these methods is good for assessing written work. So far, two mechanisms have been adopted to evaluate essay assignments: (1) machine-based automated essay scoring (AES) and (2) calibrated peer review (CPR) (Balfour, 2013). However, because the capabilities of these two mechanisms are limited, the easily automated assessment methods listed above remain the most widely adopted in MOOCs.

Complex copyright.
Who holds the copyright for a MOOC course? The answer remains unclear because copyright for a MOOC course is multifaceted. On one hand, copyright for a MOOC course involves faculty, learners, universities, and MOOC providers (Porter, 2013). Thus, MOOCs presents complex copyright issues that could challenge the relationships between a higher education institution, its faculty and learners, and MOOC providers (Educause, 2013b). On the other hand, materials adopted in MOOCs are in diverse formats and can be generated by faculty, learners, or both. To date, a university can first offer a MOOC course with the best of intentions and then offer it via a MOOC provider. It is very likely that the MOOC provider makes a profit by selling the MOOC course to other universities. Such a transaction raises the question: Should the university that created the MOOC course be rewarded (Creelman, 2013)? In addition, MOOC providers could violate the common institutional policy approach by establishing a proprietary claim on materials in their courses, licensing to users the terms of access and use of those materials, and establishing an ownership claim on user-generated content (Educause, 2013b). Most materials in MOOCs, such as syllabuses, course policies, lecture videos, assignments, quizzes, class activities, and schedules, are developed by faculty (Porter, 2013). Therefore, according to the common institutional policy, copyright for a MOOC course should belong to the faculty who develop it, not to MOOC providers. As such, Porter (2013) argues that faculty should be careful to understand the laws, policies, and contracts regarding copyright when they develop MOOCs. However, learners who generate content for MOOCs should not be neglected. Some MOOCs require learners to submit [...] (Cooper & Sahami, 2013), not to mention that many MOOC learners are in developing countries and have limited access to the Internet. This hardware limitation needs to be overcome to make MOOCs accessible to more learners.

Trends
As a disruptive innovation, MOOCs transform higher education (Shirky, 2012

Limitations of this Study
Blog mining is a novel method to synthesize related discussions in blogs to provide an in-depth review of MOOCs, and to identify future trends and challenges of MOOCs. It is well suited to MOOCs research, where existing academic studies are not adequate.
However, blog posts can have an inherent bias. For example, the information on blogs is not peer-reviewed; authorship of some blog pages is either unclear or unknown; and some blog information might be posted for commercial purposes (He, 2013).
Furthermore, the process of analyzing clusters and themes is subjective.