Designing a knowledge commons for academic publishing
Almost everyone thinks academic publishing needs to change. What would a better system look like? Economist Elinor Ostrom gave us design principles for an alternative — a knowledge commons, a sustainable approach to sharing research more freely. This approach exemplifies using economic principles to design a digital platform.
Why is this relevant right now?
The phrase ‘Napster Moment’ has been used to describe the current situation in academic publishing. Napster made illicitly downloading MP3s for free so easy that the music industry was forced to change its business model. Are research papers like MP3s? Will crusty academics furtively download papers from sketchy websites?
Many already do. In a recent Science Magazine reader poll, 85% of respondents thought pirating papers from illicit sources was morally acceptable, and about 25% said they did so weekly.
Elsevier — the largest for-profit academic publisher — is fighting back. They are pursuing the SciHub website through the courts. SciHub is the most popular website offering illegal downloads, and has virtually every paper ever published.
Institutions that fund research are pushing for change, fed up with a system where universities pay for research, but companies like Elsevier make a profit from it. Academic publishers charge universities about $10bn a year, and make unusually large profits.
In the longer term, the fragmentation of research publishing may be unsustainable. Over a million papers are published every year, and research increasingly requires academics to understand multiple fields. New search tools are desperately needed, but they are impossible to build when papers are locked away behind barriers.
Is there a way to design a better system? How should papers be published? Who should pay the costs, and who should get the access? Economist and Nobel laureate Elinor Ostrom pioneered the idea of a knowledge commons to think about these questions.
What is a knowledge commons?
A commons is a system where social conventions and institutions govern how people contribute to and take from some shared resource. In a knowledge commons that resource is… knowledge.
You can think of knowledge, embodied in academic papers, as an economic resource just like bread, shoes or land. Clearly knowledge has some unique properties, but this assumption is a useful starting point.
When we are thinking about how to share a resource, Elinor Ostrom, in common with other economists, asks us to think about whether the underlying resource is ‘excludable’ and whether it is ‘rivalrous’.
If I bake a loaf of bread, I can easily keep it behind a shop counter until someone agrees to pay money in exchange for it — it is excludable. Conversely, if I build a road it will be time consuming and expensive for me to stop other people from using it without paying — it is non-excludable.
If I sell the bread to one person, I cannot sell the same loaf to another person — it is rivalrous. However, the number of cars using a road makes only a very small difference to the cost of providing it. Roads are non-rivalrous (at least until traffic jams take effect).
Most economists think markets (where money is used to buy and sell, top left in the grid) are a good system for providing rivalrous, excludable private goods (bread, clothes, furniture etc.), perhaps with social security in the background to provide for those who cannot afford necessities.
But if a good is non-rivalrous, non-excludable, or both, things get more complicated, and markets become less effective. This is why roads are usually provided by a government rather than a market, though for-profit toll roads do exist.
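The two questions above (rivalrous? excludable?) sort goods into four quadrants. The following is a minimal sketch of that sorting in Python. The quadrant labels are the standard textbook ones and are an assumption about what the grid shows; the `Good` type and `classify` function are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Good(NamedTuple):
    name: str
    rivalrous: bool   # does one person's use reduce what is left for others?
    excludable: bool  # is it cheap to stop non-payers from using it?


def classify(good: Good) -> str:
    """Return the textbook quadrant for a good (labels are an assumption)."""
    if good.rivalrous and good.excludable:
        return "private good"          # e.g. bread: markets work well here
    if good.rivalrous:
        return "common-pool resource"  # e.g. a shared pasture or fishing ground
    if good.excludable:
        return "club or toll good"     # e.g. a toll road
    return "public good"               # e.g. an untolled road


examples = [
    Good("loaf of bread", rivalrous=True, excludable=True),
    Good("common pasture", rivalrous=True, excludable=False),
    Good("toll road", rivalrous=False, excludable=True),
    Good("untolled road", rivalrous=False, excludable=False),
]

for good in examples:
    print(f"{good.name}: {classify(good)}")
```

Treating the two properties as booleans is a simplification: in practice both are matters of degree.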
The well-known ‘tragedy of the commons’ is an example of this logic playing out. The thought experiment concerns a rivalrous, non-excludable natural resource; the example often given is a village with a common pasture shared by everyone. Each villager has an incentive to graze as many sheep as they can on the shared pasture because then they will have nice fat sheep and plenty of milk. But if everyone behaves this way, unsustainably massive flocks of sheep will collectively eat all the grass and destroy the common pasture.
The benefit accrues to the individual villager, but the cost falls on the community as a whole. The classic economic solution is to put fences up and make the resource into an excludable, market-based system. Each villager gets a section of the common to own privately, which they can buy and sell as they choose.
Building and maintaining fences can be very expensive; if the resource is something like a fishing ground, it might even be impossible. The view that building a market is the only good solution has been distilled into an ideology, and, as is discussed later, that ideology led to the existence of the commercial academic publishing industry. As the rest of this post will explain, building fences around knowledge has turned out to be very expensive.
Ostrom positioned herself directly against the ‘have to build a market’ point of view. She noticed that in the real world, many communities do successfully manage commons.
Ostrom’s Law: A resource arrangement that works in practice can work in theory.
She developed a framework for thinking about social norms that allow effective resource management across a wide range of non-market systems, a much more nuanced approach than the stylised tragedy of the commons thought experiment. Her analysis calls for a more realistic model of the villagers, who might realise that the common is being overgrazed, call a meeting, and agree on a rule for how many sheep each person is allowed to graze. They are designing a social institution.
If this approach can be made to work, it saves the cost of maintaining the fences while still avoiding the overgrazing that damages the common land.
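To make the difference between unrestricted grazing and an agreed quota concrete, here is a toy simulation. It is only a sketch: the logistic regrowth model and every number in it (ten villagers, a pasture capacity of 1,000 units, a 30% regrowth rate, one unit of grass per sheep per year) are invented for illustration and are not taken from Ostrom's work.

```python
def simulate_pasture(
    sheep_per_villager: int,
    villagers: int = 10,
    years: int = 20,
    capacity: float = 1000.0,  # grass the pasture holds when ungrazed (assumed)
    regrowth: float = 0.3,     # logistic regrowth rate per year (assumed)
) -> float:
    """Return the grass left on the common after `years` of grazing."""
    grass = capacity
    for _ in range(years):
        # Each sheep eats one unit of grass per year (assumed).
        eaten = villagers * sheep_per_villager
        grass = max(grass - eaten, 0.0)
        # Logistic regrowth: nothing grows back once the pasture is bare.
        grass += regrowth * grass * (1 - grass / capacity)
    return grass


# No rule: every villager grazes 20 sheep, far more than the pasture can regrow.
print("unrestricted:", round(simulate_pasture(sheep_per_villager=20), 1))

# Agreed rule: the village meeting caps each villager at 5 sheep.
print("with quota:  ", round(simulate_pasture(sheep_per_villager=5), 1))
```

In this toy model the unrestricted pasture is stripped bare within a few years and never recovers, while the quota settles at a level the pasture can sustain indefinitely, with no fences involved.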
The two by two grid above has the ‘commons’ as only one among four strategies. In reality, rivalry and excludability are questions of degree, and can be changed by making different design choices.
For this analysis, it’s useful to use the word ‘commons’ as a catchall for non-market solutions. Knowledge is non-rivalrous, and its excludability is a question of how you design the system, so it’s a commons. How can we design social institutions so that we don’t need fences, and we don’t get a ‘tragedy of the commons’?
Ostrom and Hess published a book of essays, Understanding Knowledge as a Commons, to look at exactly this problem.
The resulting infrastructure would likely be one or more web platforms, and, more importantly, social norms and institutions to support them. The design of these platforms will have to take into account the questions of incentives, rivalry and exclusion discussed above.
What would a knowledge commons look like?
Through extensive real world research, Ostrom and her Bloomington School derived a set of design principles for effectively sharing common resources:
1. Define clear group boundaries.
2. Match rules governing use of common goods to local needs and conditions.
3. Ensure that those affected by the rules can participate in modifying the rules.
4. Make sure the rule-making rights of community members are respected by outside authorities.
5. Develop a system, carried out by community members, for monitoring members’ behaviour.
6. Use graduated sanctions for rule violators.
7. Provide accessible, low-cost means for dispute resolution.
8. Build responsibility for governing the common resource in nested tiers from the lowest level up to the entire interconnected system.
These principles can help design a system where there is free access while preventing collapse from abusive treatment.
Principle 1 is already well addressed by the existence of universities, which give us a clear set of internationally comparable rules about who is officially a researcher in what area: doctorates, professorships etc. These hierarchies could also indicate who should participate in discussions about designing improvements to the knowledge commons, in accordance with principles 2 and 3. This is not to say that non-academics would be excluded, but that there is an existing structure which could help with decisions such as who is qualified to carry out peer review.
In a knowledge commons utopia, all the academic research ever conducted would be freely available on the web, along with all the related metadata — authors, dates, who references whom, citation counts etc. In reality, a much more gradual and piecemeal process is the best we can hope for.
This open dataset would allow innovations that could address many of the design principles above. In particular, in accordance with 5, it would allow for the design of systems measuring ‘demand’ and ‘supply’. Linguistic analysis of papers might start to shine a light on who really supplies ideas to the knowledge commons, by following the spread of ideas through the discourse. The linked paper describes how to discover who introduces a new concept into a discourse, and track when that concept is widely adopted. This could augment crude citation counts, helping identify those who provide a supply of new ideas to the commons.
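As a rough illustration of what tracking the spread of an idea could mean in practice, the sketch below uses a deliberately naive heuristic: find the earliest paper that mentions a term, then count how many papers mention it in each year. This is not the method of the paper referred to above, which is not reproduced here. The `Paper` type, the function names and the three-paper corpus are all invented for the example.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Optional


class Paper(NamedTuple):
    year: int
    author: str
    text: str


def mentions(paper: Paper, term: str) -> bool:
    """Naive case-insensitive substring match (a real system would do more)."""
    return term.lower() in paper.text.lower()


def first_use(papers: List[Paper], term: str) -> Optional[Paper]:
    """Return the earliest paper mentioning `term`, or None if none does."""
    hits = [p for p in papers if mentions(p, term)]
    return min(hits, key=lambda p: p.year) if hits else None


def adoption_by_year(papers: List[Paper], term: str) -> Dict[int, int]:
    """Count the papers mentioning `term` in each year, oldest year first."""
    counts = Counter(p.year for p in papers if mentions(p, term))
    return dict(sorted(counts.items()))


# Invented toy corpus, for illustration only.
corpus = [
    Paper(2003, "B. Author", "We extend the knowledge commons framework."),
    Paper(2001, "A. Author", "We propose the term knowledge commons."),
    Paper(2003, "C. Author", "A survey of knowledge commons governance."),
]

origin = first_use(corpus, "knowledge commons")
print(origin.author if origin else "no mention found")
print(adoption_by_year(corpus, "knowledge commons"))
```

Even this crude version only works if the full text and publication dates of every paper are available in one place, which is exactly what an open dataset would provide.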
What if we could find out what papers people are searching for, but not finding? Such data might proxy for ‘demand’, telling researchers where to focus their creative efforts.
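One simple way to approximate this kind of unmet demand, assuming a platform kept a log of search queries and how many results each returned, is to count the queries that came back empty. The log format, the `unmet_demand` function and the sample queries below are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def unmet_demand(
    search_log: List[Tuple[str, int]], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Return the most frequent queries that returned zero results."""
    misses = Counter(
        query.strip().lower()
        for query, n_results in search_log
        if n_results == 0
    )
    return misses.most_common(top_n)


# Invented sample log of (query, number of results) pairs.
log = [
    ("soil microbiome drought", 0),
    ("Soil microbiome drought", 0),
    ("graphene batteries", 12),
    ("soil microbiome drought ", 0),
    ("urban beekeeping economics", 0),
    ("urban beekeeping economics", 0),
    ("medieval ledger fraud", 0),
]

for query, count in unmet_demand(log):
    print(f"{count} failed searches: {query}")
```

A real system would need to group near-duplicate queries far more carefully than lowercasing and trimming whitespace, but the principle is the same.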
Addressing principle 6, there is much room for automatically detecting low-quality ‘me-too’ papers, or outright plagiarism. Or perhaps it would be appropriate to establish a system where new authors have to be sponsored by existing authors with a good track record, a system which the preprint site arXiv currently implements. (Over-publication is interestingly similar to overgrazing of a common pasture: abusing the system for personal benefit at the cost of the group.)
Multidisciplinary researchers could benefit from new ways of aggregating papers that do not rely on traditional journal-based categories. Visualisations of networks of papers might help us orient ourselves in new territory more quickly.
All of these innovations, and many others that we cannot foresee, require a clean, easily accessible data set to work with.
These are not new ideas. IBM’s Watson is already ingesting huge amounts of medical research to deliver cancer diagnoses and generate new research questions. But the very fact that only companies with the resources of IBM can get to this data confirms the point about the importance of the commons. Even then, they are only able to look at a fraction of the total corpus of research.
But is the knowledge commons feasible?
How, in practical terms, could a knowledge commons be built?
Since 1665, the year the Royal Society began publishing its Philosophical Transactions, about 50 million research papers have been published. As a back-of-the-envelope calculation, that’s about 150 terabytes of data, which would cost about $4,500 a month to store on Amazon’s cloud servers. Obviously just storing the data is not enough, so is there a real-world example of running this kind of operation?
Wikipedia stores a similar total amount of data (about 40 million pages). It also has functionality that supports about 10 edits to those pages every second, and is one of the 10 most popular sites on the web. Including all the staffing and servers, it costs about $50 million per year.
That is about 0.5% of the roughly $10bn that the academic publishing industry charges every year. If the money that universities spend on access to journals was saved for a single year, it would be enough to fund an endowment that would make academic publishing free in perpetuity, a shocking thought.
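The back-of-the-envelope figures above can be checked in a few lines. The inputs are the numbers quoted in this post (50 million papers, 150 terabytes, $4,500 a month, $50 million a year for a Wikipedia-scale operation, and about $10bn a year in publisher charges). The per-paper size, the per-gigabyte price and the endowment return are derived from them here, not sourced independently.

```python
PAPERS = 50_000_000                       # papers published since 1665
TOTAL_TB = 150                            # estimated size of the whole corpus
STORAGE_USD_PER_MONTH = 4_500             # quoted cloud storage cost
RUNNING_USD_PER_YEAR = 50_000_000         # Wikipedia-scale running cost
PUBLISHER_USD_PER_YEAR = 10_000_000_000   # what publishers charge each year

# Implied average paper size, using decimal units (1 TB = 1,000,000 MB).
mb_per_paper = TOTAL_TB * 1_000_000 / PAPERS

# Implied storage price per gigabyte per month (1 TB = 1,000 GB).
usd_per_gb_month = STORAGE_USD_PER_MONTH / (TOTAL_TB * 1_000)

# Running cost as a share of one year of publisher charges.
share_of_charges = RUNNING_USD_PER_YEAR / PUBLISHER_USD_PER_YEAR

# If one year of charges were banked as an endowment, this is the annual
# return it would need to cover the running cost in perpetuity.
required_return = RUNNING_USD_PER_YEAR / PUBLISHER_USD_PER_YEAR

print(f"average paper size:    {mb_per_paper:.1f} MB")
print(f"storage price:         ${usd_per_gb_month:.2f} per GB per month")
print(f"share of charges:      {share_of_charges:.1%}")
print(f"required annual yield: {required_return:.1%}")
```

On these figures the endowment would need to return only about half a percent a year, which is why a single year of journal spending would be enough.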
What’s the situation at the moment?
Universities pay for the research that results in academic papers. Where papers are peer-reviewed, the reviewing is mostly done by salaried university staff who don’t charge publishers for their time. Therefore, the cost of producing a paper to an academic publisher is, more or less, typesetting plus the admin.
Yet publishers charge what are generally seen as astronomical fees. An ongoing annual licence to access a journal often costs many thousands of pounds. University libraries, which may have access to thousands of journals, pay millions each year in these fees. As a member of the public, you can download a paper for about $30, and a single paper is often valueless without the network of papers it references. The result is an industry worth about $10bn a year, with profit margins that are often estimated at 40%. (Excellent detailed description here.)
I’ve heard stories of academics having articles published in journals their university does not have access to. They can write the paper, but their colleagues cannot subsequently read it — which is surely the opposite of publishing. There are many papers that I cannot access from my desk at the Royal College of Art, because the university has not purchased access. But RCA has an arrangement with UCL allowing me to use their system. So I have to go across town just to log onto the Internet via UCL’s wifi. This cannot make sense for anyone.
I’m not aware of any similar system. It’s a hybrid of public funding plus a market mechanism. Taxpayers’ money is spent producing what looks like a classic public or commons good (knowledge embodied in papers): free to everyone, non-rivalrous and non-excludable. That product is then handed over to a private company, for free, and the private company makes a profit by selling that product back to the organisation that produced it. Almost no one (except the publishers) believes this represents value for money.
Overall, in addition to being a drain on the public purse, the current system fragments papers and associated metadata behind meaningless artificial barriers.
How did it get like that?
Nancy Kranich, in her essay for the book Understanding Knowledge as a Commons, gives a useful history. She highlights the Reagan-era ideological belief (mentioned earlier) that the private sector is always more efficient, plus the short-term incentives of the one-time profit you get by selling your in-house journal. That seems to be about the end of the story, although in another essay in the same book Peter Suber points out that many high-level policy makers do not know how the system works, which might also be a factor.
If we look to Ostrom’s design principles, we cannot be surprised at what has happened. Virtually all the principles (especially 4, 7 and 8) are violated when you have a commons with a small number of politically powerful, for-profit institutions who rely on appropriating resources from that commons. It’s analogous to the way industrial fishing operations are able to continuously frustrate legislation designed to prevent ecological disaster in overstrained fishing grounds by lobbying governments.
What are the current efforts to change the situation?
In 2003 the Bethesda Statement on Open Access indicated that the Howard Hughes Medical Institute and the Wellcome Trust, which between them manage an endowment of about $40bn, wanted research funded by them to be published Open Access, and that they would cover the costs. This seems to have set the ball rolling, although the situation internationally is too complex to easily unravel.
Possibly, charities led the way because they are free of the ideological commitments of governments, as described by Kranich, and less vulnerable to lobbying efforts by publishers.
Focusing on the UK: since 2013, Research Councils UK (RCUK), which disburses about £3bn to universities each year, has insisted that work it funds should be published Open Access. The details, however, make this rule considerably weaker than you might expect. RCUK recognises two kinds of Open Access publishing.
With Gold Route publishing, a commercial publisher will make the paper free to access online, and publish it under a Creative Commons licence that allows others to do whatever they like with it, as long as the original authors are credited. The commercial publisher will only do this if they are paid; rates vary but it can be up to £5000 per paper. RCUK has made a £16 million fund available to cover these costs.
Green Route publishing is a much weaker type of Open Access. The publisher grants the academics who produced the paper the right to “self archive”, i.e. make their paper available through their university’s website. It is covered by a Creative Commons licence that allows other people to use it for any non-commercial purpose, as long as they credit the author. However, there can be an embargo of up to three years before the academic is allowed to ‘self-publish’ their paper. There are also restrictions on what sites they can publish the paper on; for example, they cannot publish it to a site that mimics a conventional journal. Whether sites such as Academia.edu are acceptable is currently the subject of debate.
Is it working?
In 1995, Forbes predicted that commercial academic publishers had a business model that was about to be destroyed by the web. That makes sense: after all, the web was literally invented to share academic papers. Here we are, 21 years later, and academic publishers still exist, and still have enormous valuations. Their shareholders clearly don’t think they are going anywhere.
Elsevier is running an effective operation to prevent innovation by purchasing competitors (mendeley.com) or threatening them with copyright actions (academia.edu and SciHub). Even if newly authored papers are published open access, the historical archive will remain locked away. However, there is change.
Research Councils UK carried out an independent review in 2014 in which nearly all universities were able to report publishing at least 45% of papers as open access (via green or gold routes), though the report is at pains to point out that most universities don’t keep good records of how their papers are published, so this figure could be inaccurate.
In fact the UK is doing a reasonable job of pursuing open access, and globally things are slowly moving in the right direction. Research is increasingly reliant on pre-prints hosted on sites like arXiv, rather than official journals, which move too slowly.
Once a database of the 50 million academic papers is gathered in one place (which SciHub may soon achieve) it’s hard to see how the genie can be put back in the bottle.
If this is a ‘Napster moment’, the question is what happens next. Many people thought that MP3 sharing was going to be the end of the commercial music industry. Instead, Apple moved in and made a service so cheap and convenient that it displaced illicit file sharing. Possibly commercial publishers could try the same trick, though they show no signs of making access cheaper or more convenient.
Elinor Ostrom’s knowledge commons shows us that there is a sustainable, and much preferable, alternative: one that opens the world’s knowledge to everyone with an Internet connection, and provides an open platform for innovations that can help us deal with the avalanche of academic papers published every year.