STI 2017 — Open Access Coherence Study in publications related to the Zika outbreak

Ricardo Barros Sampaio
Published in Data Net Sci, Feb 24, 2018

Peter Krauss*, Jorge H. C. Fernandes** and Ricardo Barros Sampaio**

*ppkrauss@gmail.com


** jhcf@unb.br; rsampaio.br@gmail.com

Colaboratório de CTS, Gerencia Regional de Brasília, Oswaldo Cruz Foundation, Av L3 Norte, s/n, Campus Universitário Darcy Ribeiro, Gleba A, Brasília, 70910–900 (Brazil)

ABSTRACT

Scientific articles indexed in open databases such as PubMed are freely accessible. These databases are considered Open Access repositories, in which the act of registration presupposes an open license. For those who expect to make full use of a document (read, understand, reuse and redistribute), however, the right of access expressed by the license may be insufficient. Ideally, the same right of access would extend to the attachments, figures and tables, as well as to the cited and referenced documents. A document depends on objects (its internal components and the external documents it cites), and these objects have their own licenses. For the user of the document, the licenses of the dependencies should not impose additional restrictions on use: this is the principle behind the proposed concept of OpenCoherence, that is, the coherence of a document's license with the licenses of its dependencies. The project, initiated by members of Open Knowledge Brazil in 2015, now continues within the scope of Open Science, defining metrics and computing indexes for scientific articles marked up in JATS (Journal Article Tag Suite, an XML format). In this approach, public health content related to the Zika virus outbreak is evaluated.

INTRODUCTION

In recent decades, advocacy for a free model of knowledge, together with activism and initiatives for greater transparency in governments and in the management of scientific production, has been growing and consolidating. Scientific articles are typical examples of knowledge expression: millions of these documents are maintained today, with legal deposit effect and without access barriers, in large digital repositories such as SciELO, PubMed Central and others. These repositories, called Open Access, guarantee the free access, authenticity, integrity and evidentiary value of documents. Open Access repositories are those in which the act of registration already presupposes an open license, usually CC-BY.

For those who expect to make full use of a document belonging to one of these repositories, however, the right of access expressed by the license may be insufficient: the same right would have to hold for its attached documents, for figures and tables authored by third parties, and for the cited and referenced documents. We refer to the cited documents and the internal objects, collectively, as the document's dependencies. Full use requires that the rights and expectations expressed by the document's license be preserved, which means that each of its dependencies must carry a license of equal or greater degree of openness.
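The coherence condition described here can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the `Item` class, the function names and the integer openness scale (larger means more open) are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A document or one of its dependencies (hypothetical structure)."""
    name: str
    openness: int  # assumed scale: a larger value means a more open license
    dependencies: List["Item"] = field(default_factory=list)


def incoherent_dependencies(doc: Item) -> List[Item]:
    """Return the dependencies that are less open than the document itself."""
    return [dep for dep in doc.dependencies if dep.openness < doc.openness]


def is_coherent(doc: Item) -> bool:
    """A document is coherent when no dependency restricts it further."""
    return not incoherent_dependencies(doc)
```

For instance, an article with openness 3 that embeds a third-party figure with openness 1 would be reported as incoherent, and the figure would be listed as the offending dependency.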

Internal dependencies, such as attachments and figures authored by third parties, may have their own licenses. Historically, digital documents have evolved from reproductions of paper to broader forms of content, incorporating the use of metadata. In recent decades there has been a tendency to formalize internal dependencies in metadata, making the credits and attributions more like licenses. This practice is still incipient, as observed by Mietchen et al. (2013) in PubMed Central, but it can also be seen in other repositories. In some contexts, such as Wikipedia, the licenses of the internal dependencies of an article are in fact explicit and systematically recorded.
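As an illustration of how such metadata might be read, the sketch below parses a small JATS-like fragment with Python's standard library and collects the license link of the article and of each figure. The sample XML is invented for the example, and the function names are hypothetical; the sketch also assumes that licenses are recorded as `xlink:href` attributes on `<license>` elements inside `<permissions>`, which real articles do not always do consistently.

```python
import xml.etree.ElementTree as ET

# Qualified attribute name for xlink:href as ElementTree exposes it.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Invented sample: an article under CC-BY with one figure under CC-BY-NC.
SAMPLE_JATS = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <permissions>
        <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
          <license-p>Illustrative article license.</license-p>
        </license>
      </permissions>
    </article-meta>
  </front>
  <body>
    <fig id="f1">
      <permissions>
        <license xlink:href="https://creativecommons.org/licenses/by-nc/4.0/">
          <license-p>Illustrative figure license.</license-p>
        </license>
      </permissions>
    </fig>
    <fig id="f2"/>
  </body>
</article>"""


def _href(license_element):
    """Return the license link, or None when no license element was found."""
    return None if license_element is None else license_element.get(XLINK_HREF)


def article_license(root):
    """License link declared for the article as a whole."""
    return _href(root.find("./front/article-meta/permissions/license"))


def figure_licenses(root):
    """Map each figure id to its own license link (None if not declared)."""
    return {fig.get("id"): _href(fig.find("./permissions/license"))
            for fig in root.iter("fig")}
```

A figure without its own `<permissions>` element, such as `f2` in the sample, is reported with `None`, leaving the decision about an implied license to a later step.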

Problem and related hypotheses

As a result of years of use of hypertext resources and a culture of reuse, scientific papers tend to present less autonomous content. This supposed reduction of autonomy would manifest itself in two ways:

• In internal dependencies: loss of authorial autonomy due to the reuse of third-party works and the need to make this reuse explicit through the license granted by the third party. Internal dependencies occasionally have a lower degree of openness than the document itself.

• In external dependencies: loss of semantic autonomy due to the tendency of documents to rely on precise citations managed by computational resources. The hypothesis is that authors tend to be more focused and concise, and readers tend to make more intensive use of browsing through the links between related documents.

Goal

The OpenCoherence project (Krauss 2015 and 2016) has been built as a mini-framework of software and data for auditing repositories of scientific and legislative knowledge, and for recording evidence (documents taken as samples) that supports the working hypotheses. Within this context, the project aims to:

• Formalize openness metrics for the characterization of existing licenses

• Instrumentalize the characterization of the degree of openness of a document

• Characterize the internal and external dependencies of a document

• Formalize and instrumentalize the average openness of sets of documents

• Characterize the degree of openness of large repositories by complete scanning or sampling
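The last two goals can be illustrated with a minimal sketch. It assumes that each document has already been assigned a numeric degree of openness and that the "average openness" of a set is the arithmetic mean of those degrees; both the averaging rule and the function names are assumptions made for this example, not definitions taken from the project.

```python
import random
from statistics import mean


def average_openness(degrees):
    """Arithmetic mean of the openness degrees of a set of documents."""
    degrees = list(degrees)
    if not degrees:
        raise ValueError("cannot average an empty document set")
    return mean(degrees)


def sampled_average_openness(degrees, sample_size, seed=0):
    """Estimate the average from a random sample instead of a complete scan."""
    degrees = list(degrees)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = rng.sample(degrees, min(sample_size, len(degrees)))
    return average_openness(sample)
```

A complete scan calls `average_openness` on every document of the repository, while sampling trades precision for speed on large repositories.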

For the present proposal, the scope of the discussion is limited to repositories and articles of scientific knowledge, and does not cover legislative ones.

Definitions and methodology

Use license

Although there is a visible trend towards the standardization of open licenses, through initiatives such as Creative Commons, there is still a great diversity of licenses and ways of expressing licensing. Each institution, repository or even author may be immersed in a different institutional context and make their own choices. Licensing can be expressed in different ways:

  • by symbolization: the standardized symbol, highlighted at the beginning or end of the document, symbolically referring to the restrictions or permissions of use. Example: the symbol “© Crown copyright 2015” at the end of the legislative documents of England.
  • by link: a web link to the license, usually anchored to its symbol or abbreviation. Example: the text “CC-BY 2.0” linking to “https://creativecommons.org/licenses/by/2.0/”.
  • by attachment: the entire text of the license is added to the document. Example: Licenses attached to the Project Gutenberg books.
  • by DRM attachment: a machine-readable code (Digital Rights Management) is added to restrict access based on user permissions.
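Of these forms, the expression by link is the easiest to process automatically. The sketch below derives a short license identifier from a Creative Commons URL; the function name and the identifier format (for example `CC-BY-2.0`) are assumptions chosen for the example, and only Creative Commons links are recognized.

```python
import re

# Matches links such as https://creativecommons.org/licenses/by/2.0/
CC_URL = re.compile(
    r"^https?://creativecommons\.org/(licenses|publicdomain)/([a-z\-]+)/(\d+\.\d+)",
    re.IGNORECASE,
)


def license_from_link(url):
    """Return an identifier for a Creative Commons link, or None otherwise."""
    match = CC_URL.match(url.strip())
    if match is None:
        return None
    section = match.group(1).lower()
    code = match.group(2).lower()
    version = match.group(3)
    if section == "publicdomain":
        # Public-domain tools: CC0 ("zero") and the Public Domain Mark ("mark").
        names = {"zero": "CC0", "mark": "PDM"}
        return f"{names[code]}-{version}" if code in names else None
    return f"CC-{code.upper()}-{version}"
```

Links to other license texts return `None` and would need their own rules, as would the symbol and attachment forms.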

License Families

Despite the trend towards the standardization of licenses, there is still great diversity. Grouping similar licenses into “license families” helps reduce this diversity and simplify the process of interpreting different licenses.

As in linguistic and biological groupings, the project adopted the convention of choosing one representative among the members of each group, called the canonical member. Each family is generally represented by its most popular license, and the family name is obtained by simplifying that license's name:

• Family cc0: CC0-v1.0, DLDE-Zero-v2.0, ODC-PDDL-1.0, PDM

• Family cc-by: CC-BY-v1, CC-BY-v2, CC-BY-v3, CC-BY-v4, DLDE-BY-v2.0, GFDL-v1.3, ODC-BY-v1.0, OGL-UK-v1, OGL-UK-v2, OGL-UK-v3, OPL-v1.0

• Family cc-by-sa: CC-BY-SA-v2, CC-BY-SA-v3, CC-BY-SA-v4, OdbL-v1.0

The copyright0 family corresponds to the licenses established by the clauses of the Berne Convention, using 1979 as a reference.
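The grouping can be represented as a simple lookup table. The sketch below inverts the family lists into an index from license name to family; the family keys `cc-by` and `cc-by-sa` are assumed simplifications of the member names, and the case-insensitive matching is a choice made for this example.

```python
# Family membership as listed in the text; the keys "cc-by" and "cc-by-sa"
# are assumed names for the second and third groups.
FAMILIES = {
    "cc0": ["CC0-v1.0", "DLDE-Zero-v2.0", "ODC-PDDL-1.0", "PDM"],
    "cc-by": ["CC-BY-v1", "CC-BY-v2", "CC-BY-v3", "CC-BY-v4", "DLDE-BY-v2.0",
              "GFDL-v1.3", "ODC-BY-v1.0", "OGL-UK-v1", "OGL-UK-v2",
              "OGL-UK-v3", "OPL-v1.0"],
    "cc-by-sa": ["CC-BY-SA-v2", "CC-BY-SA-v3", "CC-BY-SA-v4", "OdbL-v1.0"],
}


def build_index(families):
    """Invert {family: [licenses]} into {license (lowercase): family}."""
    index = {}
    for family, members in families.items():
        for member in members:
            key = member.lower()
            if index.get(key, family) != family:
                raise ValueError(f"{member} is listed in two families")
            index[key] = family
    return index


def family_of(license_name, index):
    """Return the family of a license name, or None if it is not listed."""
    return index.get(license_name.strip().lower())
```

Rejecting a license that appears in two families keeps the table unambiguous, so every license resolves to exactly one canonical representative.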

FIGURE 1: Types of licenses and their openness

Degree of license openness

Families can be ranked from the least open to the most open, as shown in the figure. Each rank can be associated with a definite numerical value for the degree of openness. This numerical assignment is an arbitrary convention.

The characterization of the degree of openness of a document can be formalized with the support of the license table, through the following procedure:

• verify the license of the document; if it is not explicit, obtain the corresponding implied license;

• look up the corresponding family in the license table;

• obtain the “degree of openness” from the family table, according to the adopted degreeVersion.
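The three steps of this procedure can be sketched as follows. All tables here are illustrative assumptions: the numeric degrees, the two degree versions, the relative order of the families and the fallback to `copyright0` when no license can be determined are invented for the example and are not the values adopted by the project.

```python
# Step 2 table: license name -> family (a small illustrative excerpt).
LICENSE_FAMILY = {
    "CC0-v1.0": "cc0",
    "CC-BY-v4": "cc-by",
    "CC-BY-SA-v4": "cc-by-sa",
}

# Step 3 table: one family -> degree mapping per degreeVersion.
# The numbers are arbitrary conventions chosen for this sketch.
DEGREES = {
    1: {"copyright0": 0, "cc-by-sa": 3, "cc-by": 4, "cc0": 5},
    2: {"copyright0": 0, "cc-by-sa": 6, "cc-by": 8, "cc0": 10},
}


def openness_degree(explicit_license=None, repository_default=None,
                    degree_version=1):
    """Resolve a document's degree of openness in three steps."""
    # Step 1: use the explicit license, else the one implied by the repository.
    license_name = explicit_license or repository_default
    # Step 2: find the family; with no license at all, assume copyright0.
    if license_name is None:
        family = "copyright0"
    else:
        family = LICENSE_FAMILY[license_name]  # KeyError for unknown licenses
    # Step 3: read the degree for the adopted degreeVersion.
    return DEGREES[degree_version][family]
```

Keeping the degree tables keyed by version lets the arbitrary numeric convention change without touching the license or family tables.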

Conclusion

The present work aimed to present the knowledge and methodologies already developed as a basis for application in areas of science that need this framework to break the paradigm in which science is used only by a caste of researchers.

By declaring an international public health emergency in response to the Zika virus outbreak, the WHO opened a new moment for science in the field of health, accelerating the generation of knowledge to address an epidemic. However, this acceleration, even if open in most cases (2017), should extend not only to what is being published, but also to all the objects used in the research to make its results available, be they the data, the methodology, the references or others.

As a result of this work, an analysis of recent publications on the Zika virus and of their level of openness is underway, regardless of the journals or openness policies under which they were submitted.

References

ABL — Academia Brasileira de Letras (2009) “Vocabulário Ortográfico da Língua Portuguesa” (VOLP). Ed. Global, 5ª edição, 976 Pgs. ISBN:9788526013636.

Albuquerque PC, Castro MJC, Santos-Gandelman J, Oliveira AC, Peralta JM, et al. (2017) “Bibliometric Indicators of the Zika Outbreak”. PLOS Neglected Tropical Diseases 11(1): e0005132. urn:doi:10.1371/journal.pntd.0005132

Brasil (1975) “Decreto nº 75.699, de 6 de Maio de 1975” (Regulamentação da Convenção de Berna), http://www.lexml.gov.br/urn/urn:lex:br:federal:decreto:1975-05-06;75699

Brasil (1998) “Lei nº 9.610, de 19 de Fevereiro de 1998” (Lei dos Direitos Autorais — LDA), http://www.lexml.gov.br/urn/urn:lex:br:federal:lei:1998-02-19;9610

JATS4R Working Group (2015) “Improving the reusability of JATS”. In: Journal Article Tag Suite Conference (JATS-Con) Proceedings. http://www.ncbi.nlm.nih.gov/books/NBK279901/

Krauss, PP (2015), Repositório dos algoritmos das métricas de abertura OpenCoherence. https://github.com/ppKrauss/openCoherence

Krauss, PP (2016), “Coherence of openness in Open Access repositories: metrics and methodologies suggestion”, urn:doi:10.5281/zenodo.57253.

Mietchen D, Maloney C, Moskopp ND (2013), “Inconsistent XML as a Barrier to Reuse of Open Access Content”. In: Journal Article Tag Suite Conference (JATS-Con) Proceedings. http://www.ncbi.nlm.nih.gov/books/NBK159964/

[1] This work was supported by CIDACS / Fiocruz
