Are the Most Highly Cited Articles the Ones that are the Most Downloaded? A Bibliometric Study of IRRODL

Publication of research, innovation, challenges and successes is of critical importance to the evolution of more effective distance education programming. Publication in peer-reviewed journals is the most prestigious and most widespread form of dissemination in education and most other disciplines; hence the importance of understanding what is published and its impact on both researchers and practitioners. In this article we identify and classify the leading articles in arguably the leading peer-reviewed journal in this discipline. The International Review of Research in Open and Distance Learning (IRRODL) is a peer-reviewed academic journal that has been published since 2000. The journal has published between 3 and 6 issues annually, with between 50 and 111 research articles per volume. In order to assess the general and the particular impact of highly cited articles, this work describes the main bibliometric indicators of the IRRODL journal and compares them with the total galley views in all formats that IRRODL publishes (PDF, HTML, EPUB and MP3). In addition to identifying characteristics of the most widely cited articles, this research determines whether there is a correlation between the articles most highly cited by other publishing researchers and the number of views, which indicates interest from both practitioner and research communities. The results show a significant and positive relationship between the total number of citations and the number of views received by articles published in the journal, indicating that the impact of the journal extends beyond active publishers to practitioner consumers.
Avello Martínez and Anderson
This work is licensed under a Creative Commons Attribution 4.0 International License.



Introduction
The International Review of Research in Open and Distance Learning (IRRODL) is a peer-reviewed academic journal that has been published continuously since 2000. During that time the journal has published 812 research articles. The focus of IRRODL is international open and distance learning; some special issues have had a regional focus and others a topical one, such as connectivism, the right to education, and mobile learning. Since 2006 the journal has been published using Open Journal Systems, so all view and download data for the published documents are logged as historical data and are used in this research.
After a six year battle for acceptance (Anderson & McConkey, 2009) ... This work describes the main bibliometric indicators of the IRRODL journal and then examines the relationship between impact on other researchers, as judged by citations, and impact on practitioners, as judged by the number of total galley views (all formats: PDF, HTML, EPUB and MP3). Finally, a correlation between total citations and views (downloads), the latter serving as an altmetric indicator, is presented for the articles published in the journal.

Citation Analysis
Citation analysis is a research method used to assess the impact of contributions of individuals, institutions, research groups and academic journals. This kind of analysis is applied in most scientific fields including social science and it allows us, as Kinshuk, Huang, Sampson, and Chen (2013) pointed out, to observe how frequently a document has been cited by other authors, providing one way to calculate the relevance and importance of an author, an idea or a particular document.
The use of citation analysis has grown into the most popular way to evaluate the impact and importance of published papers, books and other academic documents using the number of citations in peer-reviewed journals (Bornmann & Daniel, 2008) as an indicator. Nevertheless, this topic has been controversial and developing objective "scientific" measures of impact remains elusive. Thus, different ways to measure the number of citations and more fundamentally the value to a discipline and profession of different types of research have been developed.
Many related studies exist specifically in the field of educational technology and distance education. For example, Klein (1997) studied 100 articles published in the journal Educational Technology Research and Development between 1989 and 1997. Rourke and Szabo (2002) performed a content analysis of 235 articles from the Journal of Distance Education, focusing on item type, topic, research method, and biographical information about first authors. The main finding to highlight here is that only one trend emerged, in the category item type, where an uncertain ascending trend is apparent in the proportion of empirical items, thus indicating a broad diversity of item types.
Zawacki-Richter, Anderson and Tuncay (2010) ... Halverson et al. (2014) determined the most frequently cited books, edited book chapters, and papers on blended learning. This identification of key articles serves as a filter to help others prioritize their reading and research, and it also helps users identify key players and ideas to investigate further. Furthermore, these studies illustrate a consistent interest in this field and in the value of such journal bibliographics.

Google Scholar
The release of Google Scholar (GS) in 2004 generated much media coverage and academic debate (Giustini, 2005). Google Scholar indexes a wide range of information including peer-reviewed articles, theses, books, chapters, conference proceedings, preprints and other documents from academic publishers (Gehanno, Rollin, & Darmoni, 2013). Thus, the materials it uses to calculate citation indexes are much broader in scope, though less exclusive, than those of other bibliographic systems, which index only selected peer-reviewed journals. As Beckmann and Wehrden (2012) argue, "the coverage of GS is increasing and, despite the fact that it is said to be not exhaustive, it is exhaustive enough for the studies that are considered of enough quality or relevance for systematic reviews".
Google Scholar does not offer the authority structure or transparency of coverage that many librarians and bibliometricians expect from a scientific information resource. However, as Torres-Salinas (2009) proposes, it might well be of considerable use to individual academics interested in citation analysis, as well as for higher level bibliometric analyses. Further, its broader coverage may make it more useful to practitioners, professionals and others who wish a broader perspective on their citizen science than that provided by tools designed for full time academics.
Although some features (notably the lack of transparency used by Google in selecting items for inclusion) of Google Scholar have been widely criticized, it has been shown that the journals listed

Objectives and Research Questions
The purpose of this research is to undertake a cross-sectional study of the papers published in the journal IRRODL in the period 2008 to October 2013, taking into account both the number of citations per article from Google Scholar and the total galley views (TGV) supplied by Open Journal Systems. We identify the papers with the most citations by other researchers and examine them by year of publication, principal authors and authors' country of origin. Moreover, the correlation between citations and total galley views is calculated, and the Pareto principle (80-20 rule) is tested to see whether it applies to both samples.
Three questions guided the study:
• What are the bibliometric indicators of the IRRODL journal in the analyzed period?
• What are the main characteristics of the highly cited and most viewed articles?
• Is there a relationship between citations received and the total galley views?

Sample and Method
The data were extracted from all articles published in IRRODL from 2008 to October 2013.
IRRODL was selected because of its reputation as one of the most important and recognized journals in the field of open and distance learning (Zawacki-Richter, Bäcker, & Vogt, 2009).
Note. IRRODL publishes field notes, book reviews, technical reviews and editorials that are not peer reviewed but are included in only the second row of Table 1.

The search period was arbitrarily set from 2008 to October 2013 so as to show recent activity while allowing a few years for articles to be cited and viewed. Because totals are cumulative, the total number of times an article is cited and downloaded is related to the length of time since its publication. Of the articles published between 2008 and October 2013 (Vol 9 to Vol 14_3), 401 (92.1%) were retrieved from Google Scholar, confirming the fairly extensive coverage of Google Scholar (Harzing, 2010).
To examine IRRODL article citations in Google Scholar we used the software Publish or Perish (www.harzing.com/pop.htm) version 4.4.6 using the Journal Impact Analysis option (Figure 1) and

During the period of time selected for this study, the number of citations per paper ranged from a low of zero to a high of 134 with an average of 8.38 citations/paper. In order to identify the most cited papers by year of publication, principal authors and authors' country, we selected papers that were cited at least 30 times; this resulted in a selection of 33 papers.
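The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure run inside Publish or Perish: the function name `summarize_citations` and the sample counts are hypothetical, and only the threshold of 30 citations comes from the text.

```python
from statistics import mean


def summarize_citations(citations, threshold=30):
    """Return (lowest, highest, average, indices of highly cited papers).

    `citations` is a list of per-paper citation counts. A paper counts as
    highly cited when it has at least `threshold` citations.
    """
    if not citations:
        raise ValueError("citations must not be empty")
    highly_cited = [i for i, count in enumerate(citations) if count >= threshold]
    return min(citations), max(citations), mean(citations), highly_cited


# Hypothetical citation counts for eight papers (not IRRODL data).
sample = [0, 3, 45, 12, 134, 30, 7, 29]
low, high, average, selected = summarize_citations(sample)
print(f"range {low}-{high}, mean {average:.2f}, highly cited papers: {selected}")
```

Because the comparison is "at least 30", a paper with exactly 30 citations is included in the selection while one with 29 is not.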
Finally, the total galley views data (all formats: PDF, HTML, EPUB and MP3) published by IRRODL were correlated with the number of citations per article. For this task the sample was enlarged to the 100 most cited articles in order to include other articles with substantial citation data, which widened the range of citations to 9-134. This additional gauge is introduced as a way to determine whether the interest in an article by other researchers (thus the citation in a peer-reviewed journal) is correlated with interest from practitioners, as measured by the number of times the article was downloaded. We acknowledge that the download numbers include requests from both researchers and practitioners, whereas citation counts represent interest by that subset of distance educators who are active researchers. If the articles are meeting the needs of both groups, we would predict strong correlations between them. In addition, the Pareto principle was used to test whether 20% of the papers published by IRRODL are responsible for 80% of the citations.

General Bibliometric Indicators
The main general bibliometric indicators yielded by the query using Publish or Perish software for the years 2008 to October 2013 are detailed in Table 2.
Table 3 Number of Authors per Paper

Identification of the Main Characteristics in the Highly Cited and Viewed Articles
After selecting the 33 papers with 30 or more citations, we found that the citation counts of these articles ranged between 30 and 134. The total galley views of the selected papers ranged between 8,020 and 70,441.
There were 64 different authors who contributed to the 33 articles in this study, an average of 1.93 authors per highly cited paper, which is very close to the average of 2.01 authors for all published articles (from Table 2). Based on authors' affiliations, these highly cited articles came from 11 different countries, as seen in Figure 3. The most common countries, each with more than 10 authors, were the United States (24), the United Kingdom (12) and Canada (11), followed by Germany, Israel, Turkey, Norway, Italy, Denmark, Bahrain and Australia. Three contributors each had two articles in the highly cited selection: David Wiley from the United States, Olaf Zawacki-Richter from Germany and Rita Kop from Canada.
Table 4 Ten Articles with Most Citations
As expected, Figure 5 shows that older articles are generally both cited and viewed more times in total than more recent articles.

Is There a Relationship Between Cites Received and Total Galley Views?
The relationship between the two variables, citations and total galley views, can be examined visually in Figure 6. The scatterplot enables us to assess graphically the degree of relationship between the characteristics being measured, and in this case we can observe a moderate to high relationship between the number of citations and the total galley views.
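As an illustration of how such a relationship could be quantified, the sketch below computes Pearson's r and Spearman's rho using only the Python standard library. The choice of coefficients is an assumption made for illustration, since this passage does not name the one used; the function names and the sample values are hypothetical and are not IRRODL data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical citation and galley-view counts for six articles.
citations = [120, 60, 45, 30, 12, 9]
views = [50000, 25000, 31000, 8000, 9500, 6000]
print(f"Pearson r = {pearson_r(citations, views):.3f}")
print(f"Spearman rho = {spearman_rho(citations, views):.3f}")
```

Spearman's rho depends only on rank order, so it is less sensitive than Pearson's r to a handful of articles with very large view counts, which matters when the counts are as skewed as citation data usually are.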

The Pareto Principle: 80-20 Rule
Finally, we were interested in investigating whether citations per paper were very unequally distributed across articles and whether they followed the Pareto principle. The Pareto principle (also known as the 80-20 rule, the law of the vital few, and the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes (Newman, 2005). In our example the Pareto principle applies, as seen in Figure 7: of the 401 papers published, 245 received at least one citation, and the 80 most cited of these (20% of the 401) account for 2,714 of 3,371 citations (80.5%), as proposed by Newman (2005). This means that, as in many other contexts, the vast majority (over 80%) of the citations received by the IRRODL journal is accounted for by only 20% of the articles published. The total galley views follow the Pareto principle as well: 177 of 812 (21.7%) articles, reviews, notes and editorials published account for 1,973,864 of 2,466,137 (80.0%) views. Fortunately, however, the long tail of Internet access (Anderson, 2004), coupled with extensive online search services, makes it possible to identify and retrieve all of the hundreds of articles produced, even those with few citations or downloads.
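The 80-20 check described above can be expressed as a short Python sketch. The helper names `top_share` and `items_needed` and the sample counts are hypothetical; the sketch only shows one way to compute the share of a total held by the top fraction of items, and it is not the authors' own procedure.

```python
def top_share(counts, fraction=0.20):
    """Share of the grand total held by the top `fraction` of items."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    # At least one item is always taken, even for very short lists.
    k = max(1, round(len(ordered) * fraction))
    return sum(ordered[:k]) / total


def items_needed(counts, target=0.80):
    """Smallest number of top-ranked items whose sum reaches `target` of the total."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    running = 0
    for n, count in enumerate(ordered, start=1):
        running += count
        if running >= target * total:
            return n
    return len(ordered)


# Hypothetical citation counts for ten papers (not IRRODL data).
counts = [50, 30, 10, 5, 3, 1, 1, 0, 0, 0]
print(f"top 20% of papers hold {top_share(counts):.1%} of citations")
print(f"papers needed to reach 80% of citations: {items_needed(counts)}")
```

Applied to the figures reported above, the same arithmetic gives 80/401, or about 20% of papers, holding 2,714/3,371, or about 80.5% of citations.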

Conclusion
According to Shih, Feng and Tsai (2008), "articles with more citation frequencies are usually those that are better recognized by others in related fields. They probably present more fundamental ideas about the issues for future research" (p. 960). Thus, this research identifies these most cited articles as being important works as measured by their citation by other researchers. Equally important, the data show a strong correlation between this rate of citations by researchers and interest and popularity as shown by the number of downloads by both researchers and practitioners. Given the large (and growing) interest in distance education globally and the importance of research in this field to both researchers and practitioners, it is both interesting and reassuring to note the strong correlation between the two measures of importance.
We used Google Scholar for this study rather than the more established commercial indexes such as Social Science Citation Index or Scopus, as we feel that the broader coverage of Google Scholar, which extends to literature and conference proceedings not indexed by the others, represents real interest by our practitioner community of distance educators. Moreover, researchers suggest a significant and positive relationship between citation counts in Google Scholar and in Web of Science (Ebrahim et al., 2014). We hope that Google Scholar will make efforts to be more transparent about the ways in which citations are counted, but feel that in this context it represents the most accurate measure of an article's impact in both the research and practitioner communities. Twenty percent of the articles account for roughly 80% of both downloads and citations.
Researchers are also increasingly involved in international collaborative projects, as demonstrated by five influential papers which represent 15% of the most cited papers. We hope this article helps authors from all countries to recognize the type, format, topic and data collection and analysis methods of the most influential papers, so that the quality of all articles can be improved.