How big are preprints?

Feb 11

Those of you who know me know that I like preprint servers. In a past life, I used to manage a subscription journal that had a near-100% overlap with the ArXiv preprint server. This meant that readers could read the scientific content of the journal on ArXiv for free and that authors could also have access to our publishing services for free. Win-win, right?

Well, it’s not a perfect solution. Often, the peer-reviewed version of an article is not available on ArXiv. (And it’s not financially sustainable for peer-reviewed journals to give all of their content and services away for free.)

So, the access problem in research communication remains unsolved, but preprint servers work remarkably well as a means of distribution and are indispensable to the communities that use them. In some areas of physics research, you might hear the mantra “If it’s not on ArXiv, it doesn’t exist”. With that in mind, you could easily believe that ArXiv is a lot bigger than it is.

Personally, I would like to see wider adoption of preprint servers. Indeed, there are a lot more of them than there used to be, and they seem to be growing fast. But there is a big mountain to climb.

This image shows registrations of CrossRef DOIs. CrossRef DOIs are the nearly ubiquitous identifiers used for peer-reviewed research articles. They do not give a perfect count of the world’s scientific output, so take this with a pinch of salt. Side-note: I find the dip in the 1939–49 period to be thought-provoking.

Current estimates put the total number of peer-reviewed research articles at around 100 million. Growth is around 3.5–4 million articles per annum and accelerating. ArXiv, the largest preprint server in the world, has published only 1.5 million preprints and is currently putting out around 100,000 preprints per annum. ArXiv, and preprint servers generally, are accelerating too, but there’s still a lot of catching up to be done.
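To put those numbers side by side, here is a minimal Python sketch of the arithmetic. It uses only the rounded estimates quoted above, so treat the results as ballpark figures; the variable and function names are mine and purely illustrative.

```python
# Rounded estimates quoted in the post; these are not measured values.
TOTAL_ARTICLES = 100_000_000                 # all peer-reviewed articles
ARTICLES_PER_YEAR = (3_500_000, 4_000_000)   # low and high annual growth
ARXIV_TOTAL = 1_500_000                      # preprints on ArXiv to date
ARXIV_PER_YEAR = 100_000                     # new ArXiv preprints per year


def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# ArXiv's share of everything published so far.
stock_share = share(ARXIV_TOTAL, TOTAL_ARTICLES)

# ArXiv's share of yearly output, against the high and low growth estimates.
flow_share_low = share(ARXIV_PER_YEAR, ARTICLES_PER_YEAR[1])
flow_share_high = share(ARXIV_PER_YEAR, ARTICLES_PER_YEAR[0])

print(f"ArXiv share of all articles: {stock_share:.1f}%")
print(f"ArXiv share of annual output: {flow_share_low:.1f}% to {flow_share_high:.1f}%")
```

On these estimates, ArXiv holds about 1.5% of the total and contributes somewhere between 2.5% and 2.9% of each year's new articles, which is the size of the mountain.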

If you’re a research scientist and you aren’t yet using preprint servers, I can’t recommend them enough. They will increase the visibility of your work, establish priority for your discoveries and help you to get feedback. Take a look at the Wikipedia list of preprint servers for something suitable.

As a side note: I’d also like to see further development of the services available to users of preprint servers and, to that end, I manage two small services for ArXiv users:

  • a journal recommender [EDIT: currently deactivated pending upgrades] and
  • a version of Andrej Karpathy’s ArXiv Sanity Preserver, adapted for the general-relativity community [EDIT: also deactivated; the above link will take you to the GitHub page for my version of the app].
