International Review of Research in Open and Distributed Learning
Identifying Tensions in the Use of Open Licenses in OER Repositories

Introduction
Over the past decade, the movement for Open Educational Resources (OER) has gained substantial popularity. The call for the 2012 "Open Education Conference" proposed that, after a decade of work, it was time for a roadmap where "open education moves beyond content" [i]. The production of OER had, the call suggested, been surpassed by issues of uptake, collaboration, financial sustainability, and other very relevant concerns. For more peripheral actors in the production of content, particularly those not producing content in the English language, the issues of production and visibility are, unfortunately, far from settled. As activists for openness, we must be vigilant to avoid replicating inequalities between those who produce, develop skills and revenue, and actively participate in the commons, and those who are passive observers mostly assimilating the offerings that are made available (Pretto, 2012; see also Zancanaro, Todesco, & Ramos, 2015). This is an issue of equity, which is not just of concern to open and distance educators, but has been a mainstay in other areas, such as multicultural education (Sleeter & Grant, 1994). The one-way flow of English-language content to other groups takes novel forms, such as that of translated or subtitled content (Amiel, 2013; Ochoa, Klerkx, Vandeputte, & Duval, 2011). Examples of this phenomenon include the large-scale translation of content from projects such as Coursera (in Mexico) and Khan Academy (in Brazil) [ii].
One important barrier to this development is that the OER movement still lacks awareness of many of the initiatives, actors and organizations involved in open education and OER in many regions of the world [iii].
In this research, as part of a larger development project [iv], we identified noteworthy OER-related, basic education (K-12) initiatives in Latin America with a focus on Portuguese and Spanish, a cross-section that has had very limited visibility within the OER movement. We aimed to develop a wide panorama of these initiatives, providing substantial descriptive data in a mapping platform created with free and open software (Figure 1). We were particularly interested in how licenses were used in these sites, seeing that open licenses are considered a cornerstone of OER and help foster new and emerging open educational practices (Atkins, Brown, & Hammond, 2007; UNESCO/COL, 2011). Creative Commons has become a near-standard for the open licensing of content by providing a simple, machine- and human-readable legal mechanism for licensing. A previous investigation of a dozen repositories in Brazil demonstrated a host of issues in how intellectual property rights were visually communicated to the end user, even in the most well-established repositories (Amiel & Santos, 2013). The authors found confusing terminology and non-standard symbols, a lack of clear policies, and misalignment between the main site terminology and that exhibited by the resources themselves.

We contend that licenses are of little practical use, and may even hamper the practice of remix and reuse, if they are not consistently used and clearly communicated to the end user. Based on these findings, we posited that the poor use of open licenses might contribute to confusion and ultimately hamper the visibility and (re)usability of OER. Here we extended the methods used in that study to create an "audit" system that can help identify licensing practices, and that has led us to identify tensions in the OER licensing ecosystem.

Methodology
Our methodology began by identifying activists and experts in open education in twenty-four countries in Latin America: Argentina, Belize, Bolivia, Brazil, Chile, Colombia, Costa Rica, Cuba, Dominican Republic, Ecuador, El Salvador, Guatemala, Guyana, Haiti, Honduras, Jamaica, Mexico, Nicaragua, Panama, Paraguay, Peru, Suriname, Uruguay, and Venezuela. Contacts were compiled from a list created for the Latin American pre-conference to the OER Congress in Paris (2012), which was held in Rio de Janeiro.
We created an online survey asking each participant to identify important repositories in their country, and notable initiatives in other Latin American countries. In order to expand our reach, we also asked each respondent to provide us with the name and contact of another person he or she considered to be knowledgeable about Latin American Open Education initiatives [vi]. From this procedure we gathered a total of 70 contacts, of whom 23 replied to our queries, from a total of nine countries (Venezuela, Peru, Chile, Colombia, Brazil, Argentina, Mexico, Ecuador, and Uruguay). Contributors were only consulted through the online survey and were provided with the results published at the end of the project [vii]. Each contributor was credited on the project's public site.
In parallel, we identified initiatives through web searches, repository listings, and published documents.
We filtered out those initiatives not aligned with the scope of the project, excluding 1) projects focused on higher education; 2) thesis and dissertation repositories; 3) institutional websites that were merely informative; and 4) pages referring exclusively to technical and/or conceptual documentation. This filter resulted in 60 projects. After a final analysis, 10 sites were eliminated either because of instability (being often offline) or because the project was no longer available. The final count for the analysis in this article is based on 50 sites (we use the words site, repository, or platform interchangeably).
Data was collected manually by visiting each initiative's page, following a new metadata protocol established in partnership with other mapping projects, so as to create interchangeable datasets (the metadata scheme is available at http://dados.educacaoaberta.org). The scheme included primarily data on country, funders, types of licenses, types of resources, and resource languages. The dataset was then manually mapped to create a geo-tagged library of OER initiatives in Latin America, comprising governmental, academic, and NGO-funded initiatives.
We then analyzed the site-licensing scheme at three different levels. First, we noted the existence of an overall site copyright statement, usually part of the footer or within a dedicated section of an "about" or "rights" page. We then randomly selected five resources to investigate by navigating from the main page towards the final resource. Second, we noted the intermediary page, usually a formal metadata page of the kind found in structured digital library software, which provides extensive descriptive information on the resource. In other cases, this is a more "informal" access page, such as a listing of resources within a category.
Finally, we downloaded/accessed each of the five educational resources and examined their licenses. In the case of videos, simulations or games (such as Adobe Flash files) we examined whether licensing data were "embedded" in the resources themselves in the form of visual information. For photographs, we attempted to look at embedded metadata (such as IPTC). For documents (such as PDFs), we examined the front and back matter and footnotes (not metadata) for any mention of licensing information. In the case of web pages (HTML), we examined the overall page to verify alignment. Through this process, we aimed to ascertain whether, along this trajectory, there was misalignment in how licenses were portrayed to the user. Each resource was evaluated by one of the researchers and then reviewed by the second one. Issues occasionally found were discussed and resolved through consensus.
In the case of resources that are HTML pages, there is no "embedded" license. Here, the "item" and "page" license are one and the same and were considered as such. There is difficulty in defining these levels precisely, since the underlying systems differ: we were not comparing similar digital libraries or sites using a single platform (such as Drupal or DSpace), which would make the navigation follow more of a standard.
Still, our main concern was to investigate alignment at all levels of the system, and to notice contradictions in the navigation towards the resource. These are important to demonstrate cohesiveness, and also because different points of access to the resource might present contradictory information. Since the entry point into a site is not forced on a user, interested parties might reach a resource through a homepage, through an intermediate/metadata page, or by accessing the resource page directly through a search engine; of particular importance in OER (where remix and sharing are encouraged), the resource might also be hosted elsewhere. As a result, 250 educational resources in three languages (Spanish, Portuguese and English) were thoroughly analyzed.
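The three-level comparison described above can be illustrated with a short sketch. It is not the instrument used in the study, which was applied manually; the names `LicenseTrail` and `classify` are hypothetical, and the rule that levels without any statement are skipped (rather than counted as a contradiction) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LicenseTrail:
    """License statements seen on the way from the home page to one resource."""
    site: Optional[str]      # overall site statement (footer, "about" or "rights" page)
    page: Optional[str]      # intermediary or metadata page, if any
    resource: Optional[str]  # statement found in or on the resource itself


def classify(trail: LicenseTrail) -> str:
    """Label one resource as aligned, misaligned, or lacking any information."""
    # Assumption: a level with no statement is ignored, not treated as a conflict.
    stated = {
        s.strip().upper()
        for s in (trail.site, trail.page, trail.resource)
        if s and s.strip()
    }
    if not stated:
        return "no information"
    if len(stated) > 1:
        return "misaligned"
    return "aligned"


print(classify(LicenseTrail("CC BY-NC-SA", None, "cc by-nc-sa")))
print(classify(LicenseTrail("CC BY", "CC BY", "All rights reserved")))
```

For an HTML resource, where the "item" and "page" license are one and the same, the same string would simply be recorded at both levels.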

Results
Repositories differ, among other things, in their objectives and approaches, which affect not only the structure of the site, but also how users navigate and access the available resources and licensing-related information. We classified the resources we gathered according to an emerging set of criteria, refined over cycles of categorization, looking for a framework through which each and every resource could be consistently systematized. Based on this classification of each resource, we aimed to provide an overall picture of the repository itself. In other words, rather than classifying the repository on an interpretation of its intent, we deduced the goal of the repository first from the actual resources it provided. Through this we aimed to provide a "vocation" for the repository based on the origin of the resources, and not a detailed classification or typology of its specificities (Bateman, Lane, & Moon, 2012). Later we expand on possible implications of these categorizations, which are as follows:
Exclusive: as far as could be ascertained, through direct attribution or through the appearance of novelty, the repository seems to be the original host of the resource. This is common practice in repositories created by public offices of education. This definition covers not only new resources developed exclusively for, or publicized by, the repository, but also remixes produced by the repository's staff.
Linked: the resource is available through a link to another site and is not hosted; this is common practice in curating services.These are usually referred to as referatories (Ochoa & Duval, 2009).
Exclusive and linked categories are similar to the Type 1 (house content "on site") and Type 2 (link to external content, or portals) identified by McGreal (2008).

Aggregated: the resource is clearly from a third-party site, but is actually hosted in the repository. In navigating, the user never actually leaves the site. This is common practice in some digital libraries, which aim to curate and create collections of resources that are aligned to a specific scope and (ideally) a rights policy.
Contributed: the repository is clearly identified as aiming to accept contributions by third parties. This is common in teacher-focused/community repositories where contributions are what essentially creates the collections. Submissions are welcomed and encouraged.
Mixed: a combination of the approaches, with no clear tendency we could identify.

Findings
Of the 50 projects, 25 (50%) were catalogued as exclusive. The resources offered by these varied to some extent; mostly, they offered remixed material and brand new digital content in the form of simulations or educational resources such as schoolbooks. There were 7 repositories of the linked type (14%). Five were defined as aggregated (10%) and only one repository was catalogued as contributed (2%). A total of 8 projects (16%) were considered mixed. For example, in some cases the same platform offered content through direct links to other sites while also hosting exclusive resources. It is interesting to note that in at least one of these cases the repository offered resources of the contributed type. No clear pattern emerged between the type of repository and the software type used, or the origin of the initiative (public, private, etc.).
During the course of the data analysis and verification, four projects (8%) showed inconclusive information regarding the nature of their repositories, due to a lack of data or technical issues.
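As a quick consistency check, the category counts reported above can be tallied against the 50 sites in the sample. The sketch below only restates figures given in the text; the variable names are illustrative.

```python
# Repository counts by category, as reported in the findings.
counts = {
    "exclusive": 25,
    "linked": 7,
    "aggregated": 5,
    "contributed": 1,
    "mixed": 8,
    "inconclusive": 4,
}

total = sum(counts.values())  # should match the 50 sites analyzed

# Share of each category, as a whole-number percentage of the sample.
shares = {name: round(100 * n / total) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]} ({share}%)")
```

The six categories account for all 50 sites, and the shares agree with the percentages given in the text.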

We identified what type of software system was used for each site by looking at the source code of the main page and subpages (where content was located), and double-checking our findings through whatcms.org (Table 1). In spite of this careful look, we were unable to identify the system being used in six (12%) of the sites. The results point to the strong adoption of content management systems (CMS), with almost 40% of the sites making use of Joomla, Drupal or WordPress. The large majority (74%) of the initiatives originate within the public sector (Figure 2). Over the years there has been a substantial push for the development of government-led projects in many countries in Latin America. One such example is the Red Internacional Virtual de Educación (RIVED), which began in 1999 involving Brazil, Venezuela and Peru (Nascimento & Morgado, 2003) [viii].
The data suggest that public repositories play a major role in how people conceptualize OER sites in Latin America. While the private sector is an integral and sometimes dominant part of the K-12 educational resource market in Latin America (Amiel, 2014; Fundación Karisma, 2014; Hoosen, 2012; Ortellado, 2009), this is far less evident here. Surprisingly, the participation of higher education institutions is minimal, further supporting concerns regarding the gap between higher and basic education in OER. We can conclude that, at least based on this sample, OER in Latin America is still very much a public initiative, promoted particularly at the federal level, which accounts for nearly 4 out of every 5 public repositories (78%; Figure 3).

Licensing
Arguably, most users will reach a site through its home page. Intellectual Property Rights (IPR) information is usually expected to be listed in a footer or on a dedicated page containing information on terms of use or licensing. Almost half (44%) of the surveyed sites have a clear indication (not by omission) of "copyright" at this level (Figure 4).
The preponderance of the CC-BY-NC-SA license and of "all rights reserved" notices is aligned with the findings of a recent investigation of resource licenses in Brazilian repositories (Venturini, 2014).
The use of the term "copyright" might not strictly mean "all rights reserved". Some make use of the term, or the symbol "©", in order to indicate authorship and attribution. Nevertheless, we would contend that for many users, encountering terminology or symbolism that immediately draws attention to "rights reserved" is at least confusing, if not contradictory.
Finally, for those using a CC license, only 3 repositories possessed machine-readable code that was picked up by Open Attribute (http://openattribute.com), a plugin for web browsers that identifies CC-license data [ix].
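To make "machine-readable code" concrete, the sketch below looks for links marked with `rel="license"`, the convention used in the HTML snippets that Creative Commons provides for marking pages. It is an illustration only: how Open Attribute itself detects licenses is not described here, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class LicenseLinkFinder(HTMLParser):
    """Collects the targets of <a> and <link> elements marked rel="license"."""

    def __init__(self) -> None:
        super().__init__()
        self.licenses: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "license" in rel_tokens and href:
            self.licenses.append(href)


def find_license_links(html: str) -> List[str]:
    """Return every license URL declared in the given HTML."""
    finder = LicenseLinkFinder()
    finder.feed(html)
    return finder.licenses


footer = (
    '<footer><a rel="license" '
    'href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC BY-NC-SA</a></footer>'
)
print(find_license_links(footer))
```

A footer that shows only a Creative Commons image or a line of text, without this kind of markup, remains readable to a person but yields an empty list here.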
Out of the 250 resources analyzed individually, 116 (46.4%) showed some sort of misalignment in the way they presented their intellectual property rights to the end user. Mostly this happened due to differences between the general license portrayed by the main site and the license made clear on the resource itself. In a few rare cases, the misalignment also had to do with the metadata information provided. Furthermore, the absence of metadata information for most of the investigated resources (174, nearly 70%) suggests that there is still much ground to be covered in how repositories make use of consistent metadata.
Nearly half (123) of the resources verified showed an apparent alignment between the general license provided by the repository, metadata pages (where available) and the licenses within the resources themselves (49.2%). But a closer look revealed that in 29 of these cases (11.6% of the total), the alignment is actually due to the complete lack of information on intellectual property rights, from the repositories themselves to their metadata pages and the resources they hosted. In the case of repositories portraying CC licenses as their policy of choice, the majority (approximately 64%) showed some misalignment regarding their general licensing, metadata, and/or the resources themselves.
Among the operational repositories included in this study, there were 11 resources (4.4%) that were initially evaluated as accessible/downloadable, but that during verification were unavailable due to technical constraints.
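The figures in this section can be cross-checked with a few lines of arithmetic. The sketch below restates only counts given in the text; the number of resources that were aligned and actually carried licensing information is derived here by subtraction and is not a figure reported above.

```python
TOTAL = 250  # resources analyzed (50 sites, five resources each)

misaligned = 116
aligned = 123
unavailable = 11
aligned_by_omission = 29   # "aligned" only because no IPR information exists anywhere
without_metadata = 174


def pct(n: int) -> float:
    """Share of the 250 resources, rounded to one decimal place."""
    return round(100 * n / TOTAL, 1)


# The three outcomes partition the sample.
assert misaligned + aligned + unavailable == TOTAL

# Derived by subtraction, not reported in the text.
aligned_with_information = aligned - aligned_by_omission

print(pct(misaligned), pct(aligned), pct(unavailable))
print(pct(aligned_by_omission), pct(without_metadata))
print(aligned_with_information, pct(aligned_with_information))
```

Read this way, the apparently even split between aligned and misaligned resources is less balanced than it first appears, since part of the alignment rests on the absence of any statement at all.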

Discussion
The data presented in this study suggest a series of contradictions worth exploring, which should be of particular interest to those who create or maintain open repositories. These repositories were either suggested to us by experts and references, or presented themselves as examples of OER practice.
Considering, as we have alluded to earlier, that open licenses are a cornerstone of the movement, our investigation would suggest that the great majority of these repositories would not easily be accepted as "open". An advocate of open licensing could dismiss most of these initiatives as flawed; but from the context of their implementation different interpretations can arise.
In order to discuss this issue one has to look at two aspects of the commons (Mizukami & Lemos, 2008).
Through open licensing, the legal commons has demonstrated a viable path for the legal exchange of cultural and educational works. The worldwide adoption of Creative Commons licenses has provided ample evidence of the interest in acting, legally, outside the boundaries of "all rights reserved". The social commons, on the other hand, emerges in "...spaces where intellectual property protection is either nonexistent, irrelevant or unenforceable" (Mizukami & Lemos, 2008, p. 47). Both practices can sustain relevant and thriving markets or exchanges. Though one might associate the social commons only with piracy and illegal activities in poor nations, these peripheral practices are not limited to the context of Latin American countries. They are, nevertheless, particularly evident in educational scenarios in some of the countries we investigated.
As an example, in Brazil, the law that regulates author's rights presents a very limited number of exceptions for educational uses (Rossini, 2010). The lack of a reasonable set of exceptions can be a catalyst for classifying activities in the social commons as illegal, even when they are not [x]. We are, of course, not suggesting that those involved in the "open" movement currently limit themselves to pontificating about licenses. We are, as part of this movement, well aware that much more is done in terms of policy building and awareness raising than simply helping users choose the best Creative Commons license.
Open licensing is a path towards binding the social to the legal commons. All the while, sharing, copying and remixing continue without the establishment of regulation. In this paradox, we find it necessary to ask whether the emphasis put on disseminating and promoting open licensing does not overshadow the efforts that must be placed in discussing the actual practices that emerge from these social commons.
The repositories that are part of this study were considered examples of openness, and many of them clearly intend to be open. If open licenses are essential to OER, far more work must be done to support local actors in their efforts to adopt them integrally. This is crucial not only because of the user-centered (human) aspect of searching for OER, but also for the search engines and harvesters that scour the web to correctly identify works and their respective licenses. How can a search engine or an aggregator fully benefit from license metadata if there are conflicts in the information that exists or the code is poorly integrated? Adopting a license can be easy; creating a clear information policy is something substantially different.
This is of particular concern to those who are peripheral actors in the OER movement. If we are to provide a rich, diverse and equitable chance for OER production and dissemination, it is imperative that we support the "middle players", those who stand between those who have the resources and conditions to create (and sell services around) large-scale repositories, and those who create small, manageable collections of personal works (though we could argue, beyond the scope of this research, that here too, help might be needed). How can we help these players be a better part of the OER ecosystem?
The predominance of CMS and customized/original software leads us to believe that these repositories might be built by leveraging local expertise on existing software, which in turn is adapted to institutional needs for a repository. Why these choices were made remains a question for further investigation.
Ultimately, if popular, consumer-level packages such as WordPress and Joomla are the software of choice for even some of the largest initiatives, there seems to be room for the development of guidelines to assist developers in better leveraging each CMS's resources in promoting openness (license choosers/plugins, machine-readable footers/licenses, recommendations on resource metadata, among others). We contend that promoting awareness about technical aspects of openness in the selection, use and customization of software platforms should be considered a core aspect of the development of "openness", regardless of the project's size and scope. It would mean, to a large extent, that the promotion of technical and conceptual awareness about OER and its implementation should take into consideration the wide range of scenarios in which it takes place. More than "one size fits all" guidelines, there is great potential in sharing tools that could be adapted to the many different circumstances in which these repositories may be needed.
There are differences in orientation and recommendations to be made based on the classification of the repository. It is interesting that no clear pattern emerged with regard to the type of resources, the origin of the initiative, and the type of repository that is used. This leaves room for useful speculation and further study. One could contend that repositories which host content (exclusive, aggregated) would have different demands than linked repositories, which just point to other repositories. This issue remains open for further research and reinforces the importance of guidance for software adoption as part of a structure that facilitates the openness of repositories. For example, those with mostly exclusive resources may have different demands, and seemingly tend to use different software, from those with linked resources. A more thorough mapping of these repositories regarding these aspects of their content and architecture could lead us to a classification of repository types, with specific recommendations and suggestions on "openness" based on these criteria. Further analysis will allow us to consider repository profiles, and recommendations based on these specific profiles.
A considerable number of the suggested repositories (10) had to be eliminated from the analysis. A number of resources were also unavailable during review (4.4%). Although we did not set out to examine technical and maintenance issues in depth, it is worth noting that poor technical consistency and availability may reduce the visibility of these repositories and the uptake of their resources.
Clarity regarding the many practices, tools and resources interwoven in the creation of these repositories could lead to more robust collaboration and community building. There are, within the surveyed repositories, good lessons to be learned with regard to transparency and attempts to clearly communicate policy and site rationale to the end user. But many gaps remain. The need for consistent metadata is evident. Taking into consideration its many possible uses (user-oriented, machine-readable, and data exchange), metadata quality can contribute greatly to furthering the goals of retrieval and collaboration.
In this, there is great opportunity for publicly funded repositories to demonstrate exemplary practice.
There is room, then, to provide OER practitioners with better guidance in selecting and customizing free and open software for repositories. Practical guides and recommendations on what decisions need to be made to promote openness and cohesion could prove useful. Efforts could also be placed in creating customized versions of popular free and open CMSs that could serve as templates or "shells", with built-in customizations that address some of the concerns presented here (license selectors, footer texts, preloaded customizable "about" pages, and many others).

Conclusion
In this report we were able to present an overview of the investigation of 50 educational repositories in Latin America. We presented an "audit system" which we used to identify licensing practices, classify the initiatives and gather other relevant information.
In what may be an ironic twist of this study, we ponder whether too much emphasis is being placed on the adoption of licenses (and efforts in investigating their use) as a way to bring people and institutions into the legal commons. There is a way out of this conundrum. Below, we present some caveats.
We conducted a first-hand investigation of each of the sites, cataloguing a substantial amount of often-contradictory data, making judgments along the way. Many micro decisions had to be made to categorize and present data. While all of the data was peer-checked, it was not vetted by the organizations mentioned in this paper; we hope that this can be accomplished in further research. The data has been made available openly at http://dados.educacaoaberta.org (Figure 5) with the goal of correcting mistakes, revising categories and revisiting these results, which will be made available dynamically. The effort required to gather this data, through navigation, downloads, and source code investigation, speaks strongly to the (not necessarily purposeful) lack of transparency that some of these repositories and sites exhibit. Information about educational resources, from cost to licenses, must be made as transparent and accessible as possible. It is our hope that this first analysis and the availability of the research data will help contribute to improving the visibility, openness and uptake of the resources made available through these valuable initiatives. The project is the result of a Hewlett Foundation grant aimed at the creation of one of three prototypes for a global map for Open Educational Resources (OER).

Notes
v. The data used for the map prototype is made available at our open data portal: http://dados.educacaoaberta.org. We maintain the site in order to help promote the visibility of initiatives in Latin America, though the field moves rapidly and information may be outdated. For further developments on OER mapping please visit http://oerworldmap.org.
vi. The survey consisted of five questions presented both in Portuguese and Spanish: 1) Name; 2) Country; 3) In your opinion, which are the most significant OER initiatives focused on basic education being developed in your country? (Name up to five); 4) Could you please suggest OER initiatives focused on basic education being currently developed in other Latin American countries? (Name up to five); 5) Is there anyone you would like to suggest to take this survey? (If so, please provide their contact information).
vii. Due to the very limited time available for the production of the prototype, our initial plan of making the list and recommendations an open and live document was not attained.
viii. The portal was left out of the analysis due to persistent technical issues.
ix. Though we did not explore the issue of metadata availability more thoroughly, as a user-focused investigation, this speaks to a barrier to attribution, and points to an issue that can be easily resolved.
x. The often-mentioned example is that of "pequeno trecho". The Brazilian law provides exceptions for copies of "small sections" of works, but "small" is loosely defined, which has led to widely different interpretations (and litigation) around the issue.
For their participation in, and support of, the MIRA project, we would like to thank Everton Zanella Alvarenga (Open Knowledge Foundation - Brazil), Priscila Gonsales (Instituto Educadigital) and Xavier Ochoa and Carlos Villavicencio (Escuela Superior Politécnica del Litoral). We would also like to thank the Hewlett Foundation for support in building the map prototype and all of the colleagues who helped us by answering the questionnaire.

Figure 2. Main origin of the initiative.

Figure 3. Main entity representing the initiative.

Figure 4. Licenses portrayed on main site or dedicated page.
Examples of this in education range from the endemic photocopying of textbooks in higher education to the incorporation of digital resources downloaded from the web into teacher and student work in basic education, without any concern for intellectual property rights. The initiatives analyzed in this paper, for example, originated mostly within the public sector, and as such, bring up inescapable legal and institutional questions regarding their licensing policies and practices. While some may argue that the ambiguity benefits the user, this does not work well for those in regions where fair use exceptions are limited, do not exist and/or might open one up to a lawsuit or loss of business. The popularization of Creative Commons licenses is a grassroots phenomenon. In workshops around the world organizations promote OER, and as with most such efforts, a large part of this discussion is focused on open licensing. In this we see an interesting paradox. The barriers to the adoption of open licenses have been lowered to the point that simply attaching an image or a piece of code can be the proxy for a legal deed. Adding a Creative Commons symbol to the footer of a website might be an act of empowerment, but it may also provide the illusion of openness. Could the ease of adoption of open licenses be oversimplifying the effort that being "open" actually entails?

Figure 5. Open data portal with MIRA project data.

Table 1
Systems Used for Repositories
Additional add-ons and plugins were used in some of these sites to expand the functionality of the CMS, which is expected, given the modular nature of most CMSs. One out of five (20%) sites was identified as making use of original software. Surprisingly, only a very small group (4 sites, or 8%) makes use of what we would consider strict digital library software (DSpace and DigiTool). One would expect greater use of more structured digital library software, including less resource-intensive and open source packages such as Omeka.
Those who maintain and create repositories need greater support in making their open information policies clear. An incoherent communication strategy for open licensing makes way for ambiguities, which betray the ideals of transparency and clarity that open licenses came into being to help support. At the same time we must accept and recognize that the discussions around open licensing might not be a priority in contexts where regulation is oppressive.