Open access rates of an institution’s output vs an LIS journal’s output, or are librarians walking the talk?

I was using a combination of oaDOI and OpenRefine, described here, to determine the percentage of my institution’s output that is free to read.

Basically, I pulled a list of DOIs from Scopus, loaded them into OpenRefine and used OpenRefine to fetch results via oaDOI’s API, parsing the JSON output with OpenRefine functions.
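
For readers who prefer a script to OpenRefine, the parsing step can be sketched in Python. This is only a sketch under assumptions, not the OpenRefine recipe itself: the field names are the ones mentioned in this post, the real JSON layout may differ, and the sample record (DOI and values) is made up for illustration.

```python
import json

# Hypothetical response for one DOI. The field names follow the ones
# mentioned in this post; the real API layout may differ.
SAMPLE_RESPONSE = """
{
  "doi": "10.1234/example.1",
  "free_to_read": true,
  "evidence": "oa repository (via BASE)",
  "license": null,
  "version": null
}
"""

FIELDS = ("doi", "free_to_read", "evidence", "license", "version")


def parse_record(raw_json):
    """Turn one JSON response into a flat dict, one key per spreadsheet column."""
    record = json.loads(raw_json)
    # Missing keys become None, much like a blank cell in OpenRefine.
    return {field: record.get(field) for field in FIELDS}


row = parse_record(SAMPLE_RESPONSE)
print(row["doi"], row["free_to_read"], row["evidence"])
```

Each parsed row corresponds to one DOI, so a list of such rows is all the later tallies need.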

Inspired by 1Science’s oaFigr service, which claims to help librarians with subscription decisions by telling them how much of a journal is already free to read, I also did the same for a few LIS journals.

In particular, I chose more practitioner-oriented LIS journals such as the Journal of Academic Librarianship, to see if librarians were “walking the talk”, as they say, in self-archiving and promoting Green OA.

What did I find?

One institution’s output — an analysis

Some context: my institution, Singapore Management University, was founded in 2000 and is currently the 3rd largest university in Singapore. The university serves about 10k students, including postgraduates, in 6 disciplines, namely Business, Accounting, Law, Economics, Information Systems and Social Sciences.

Based on Scopus data, I exported a total of 3,608 records from 1999–2017 (in two batches), of which 3,388 (94%) had DOIs. I then used the oaDOI API to check for free versions of these DOIs.

Overall, using the oaDOI API (by parsing the free_to_read field in the JSON output), I got a free to read rate of 18.4% across the 3,388 Scopus records with DOIs from 1999–2017. This could be an underestimate, since I think my institutional repository isn’t well covered by BASE (which feeds into oaDOI); for example, I notice only 42 of its URLs among those found by oaDOI, which seems low.
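
As a quick check on the arithmetic, the rates can be recomputed from the counts given in this post. The 623 free to read articles is the count quoted in the license section further down. The third line, which uses all Scopus records as the denominator, is an extra illustration and assumes records without a DOI are simply counted as not free to read.

```python
# Figures reported in this post for Singapore Management University.
scopus_records = 3608      # records exported from Scopus, 1999-2017
records_with_doi = 3388    # records that carry a DOI
free_to_read = 623         # DOIs flagged as free to read

doi_coverage = records_with_doi / scopus_records
rate_over_dois = free_to_read / records_with_doi
# Assumption: records without a DOI are treated as not free to read.
rate_over_all = free_to_read / scopus_records

print(f"DOI coverage:              {doi_coverage:.1%}")    # 93.9%
print(f"Free to read, DOIs only:   {rate_over_dois:.1%}")  # 18.4%
print(f"Free to read, all records: {rate_over_all:.1%}")   # 17.3%
```

So the 18.4% figure is the rate over records with DOIs; over every exported record it would be a little lower.
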
free to read by year of publication

Eyeballing the yearly % figures, it seems the % free to read falls drastically from 2014 onwards, which is perhaps due to embargos?
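
The yearly percentages behind a chart like this can be computed with a small group-by. A minimal sketch, assuming each record has been reduced to a (publication year, free to read) pair; the sample pairs below are made up and are not my institution’s data.

```python
from collections import defaultdict

# Hypothetical (publication year, free_to_read) pairs, one per DOI.
records = [
    (2012, True), (2012, False),
    (2013, True), (2013, True),
    (2015, False), (2015, False), (2015, True),
    (2016, False),
]


def rate_by_year(rows):
    """Return {year: share of that year's records that are free to read}."""
    totals = defaultdict(int)
    free = defaultdict(int)
    for year, is_free in rows:
        totals[year] += 1
        if is_free:
            free[year] += 1
    return {year: free[year] / totals[year] for year in sorted(totals)}


for year, rate in rate_by_year(records).items():
    print(year, f"{rate:.0%}")
```

With small yearly counts the percentages jump around a lot, so it is worth looking at the totals per year alongside the rates.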

OA colors by years

Overall, my institution’s output that is available free to read is mostly via Green OA, as we are not STEM heavy. There’s a dip after 2013 for Green OA, further suggesting that embargoes are indeed at play.

Sources queried by oadoi

oaDOI lets you drill down further: the “evidence” field shows which source it used to find the free to read item.

For my output, you can see the extreme importance of BASE. As I remarked before on Twitter, getting indexed in OA aggregators such as BASE and CORE will become increasingly important for institutional repositories, as more tools (e.g. discovery services) start using them as a source of free full text.
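
Tallying the “evidence” field is a one-liner once the values are in a list. The sketch below assumes one evidence string per free to read record; the strings themselves are invented for illustration and may not match what the API actually returns.

```python
from collections import Counter

# Hypothetical "evidence" values, one per free to read record.
evidence = [
    "oa repository (via BASE)",
    "oa repository (via BASE)",
    "oa journal (via DOAJ)",
    "oa repository (via BASE)",
    "hybrid (via publisher page)",
]


def tally_sources(values):
    """Count records per evidence string, most common first."""
    return Counter(values).most_common()


for source, count in tally_sources(evidence):
    print(f"{count:>3}  {source}")
```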

You can also analyse by the “license” under which the free to read item is made available.

free to read by license

Of the 623 free to read articles, unfortunately I get a blank license for almost 90% of them. Either oaDOI was unable to determine a license, or no license was posted.
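
The share of blank licenses can be computed by treating both null and empty strings as blank. A minimal sketch with made-up values; which of the two the API uses for “no license found” is an assumption here, so the check covers both.

```python
# Hypothetical license values for free to read records.
# None and "" are both treated as "blank".
licenses = ["cc-by", None, None, "", "cc-by-nc-nd", None, None, None, None, None]


def blank_share(values):
    """Return the fraction of values that are None or an empty string."""
    blanks = sum(1 for value in values if not value)
    return blanks / len(values)


print(f"{blank_share(licenses):.0%} of records have no license recorded")
```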

Also filed under “nice to have but it doesn’t work” are “version” and “reported_non_compliant_copies”. Both currently just show null values, but it would be of great value to know what version the free to read item is (preprint, postprint, final version) and whether it’s a legal copy. For more details of what can be extracted, see the oaDOI API documentation.

18.4% seems surprisingly low, but as I said it is probably an underestimate. Besides the reason already mentioned, oaDOI doesn’t try to find sources where the legality is unclear; this includes sites such as ResearchGate, which are among the largest single sources of free to read articles.

Still, how does this number hold up against journal titles, particularly LIS journals? How much of their content is free to read?

Why % of free to read for journal titles is important

One of the longest-standing debates in the open access world is whether embargoes are needed. Publishers of course claim embargoes are needed to protect themselves; otherwise librarians would start cancelling subscriptions because self-archived versions are available.
Some open access advocates claim that librarians can never cancel subscriptions on the strength of the self-archiving allowed by Green OA, because that could happen only if Green OA reached 100% for the title.

My view is this.

The “hard to figure out” part is slowly changing with commercial services like 1Science’s oaFigr, but with the magic of oaDOI and OpenRefine you can work out a similar statistic with some effort, following the same steps as before.

1Science’s oaFigr service

The only difference is that you use the DOIs of the articles in the journal title you are interested in. One can again use Scopus or, better yet, Crossref’s API (if the title is not indexed in Scopus) to get them. Once you have the list of DOIs, the same steps as above apply.
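
For the Crossref route, here is a rough Python sketch of listing a journal’s DOIs through the public REST API’s journal works route with cursor-based paging. It is not what I did in OpenRefine. The ISSN shown is a placeholder, the sample page is invented, and `fetch_all_dois` is defined but not called here because it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CROSSREF_API = "https://api.crossref.org"


def journal_works_url(issn, cursor="*", rows=200):
    """Build the URL for one page of a journal's works, returning DOIs only."""
    query = urlencode({"select": "DOI", "rows": rows, "cursor": cursor})
    return f"{CROSSREF_API}/journals/{issn}/works?{query}"


def extract_dois(page):
    """Pull the DOI strings out of one parsed response page."""
    return [item["DOI"] for item in page["message"]["items"]]


def fetch_all_dois(issn):
    """Follow the cursor until a page comes back empty (needs network access)."""
    dois, cursor = [], "*"
    while True:
        with urlopen(journal_works_url(issn, cursor)) as response:
            page = json.load(response)
        batch = extract_dois(page)
        if not batch:
            return dois
        dois.extend(batch)
        cursor = page["message"]["next-cursor"]


# Offline illustration with a made-up page and a placeholder ISSN.
sample_page = {"message": {"items": [{"DOI": "10.1234/a"}, {"DOI": "10.1234/b"}],
                           "next-cursor": "hypothetical-cursor"}}
print(journal_works_url("1234-5678"))
print(extract_dois(sample_page))
```

The resulting DOI list can then be fed into the same free to read check used for the institutional output.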

So far I have only tried this for 2 journals, focusing on LIS-related ones. What do you think I found?

What % of LIS journals do you expect to be free to read?

These are the results for the 2 LIS titles I’ve tried so far:

Journal of Business and Finance Librarianship (Taylor & Francis): out of 310 articles with DOIs (1990–2017), only 11 were free to read (all Green), that’s 3.5%.

Journal of Academic Librarianship (Elsevier): out of 1,398 articles with DOIs (1993–1996, 2001–2017), only 112 were free to read (11 blue, 101 Green), that’s 8.0%.

For reference, this is what Sherpa Romeo says is allowed for the Journal of Academic Librarianship (under Elsevier):

Sherpa Romeo entry for Journal of Academic Librarianship

% of free to read for Journal of Academic Librarianship by year (2001–2017)

I don’t see any particular pattern between year of publication and likelihood of being free to read, except that again there seems to be a dip in the more recent years. Again, this is probably due to embargo restrictions.

Further analysis by type of OA by year is not as interesting because the numbers are small, beyond saying that about 10% were hybrid OA and the rest Green OA. Similarly, the licenses attached basically map the hybrid articles to CC-BY (2), CC-BY-NC-ND (7) and CC-BY-NC-SA (2). I am not sure how accurate this is.

All but one of the articles tagged as Green OA have no license determined.

What to make of these results?

Firstly, I was kinda surprised by the relatively lower rate of free to read articles for the LIS journals compared to my institution’s rate, though it’s of course only 2 journals.

Restricting to articles published from 2000 onwards, to make my institution’s output and the Journal of Academic Librarianship comparable, the free to read rate is 18.4% vs 8.64%.

Of course things aren’t exactly comparable, because of the differences in disciplines (my institution is a mix of Economics/Business/Social Science/Law/Information Systems), but I was still expecting librarians to be more aware of the possibility of self-archiving for OA. (Gold OA is a relatively small portion, about 5%, of my institution’s output, so it can be disregarded.)

The other thing to note is that the oaDOI API (unlike Unpaywall) probably doesn’t find content when the legal status is unclear (e.g. ResearchGate), but this difference is likely to skew towards non-librarians, who are more likely to deposit such items.

I’m not sure what to make of this since it’s just one title, though it’s suggestive.

More study is needed. Libraries that subscribe to oaFigr probably have more accurate and in-depth statistics, of course.