This is the text of my WSI Distinguished Lecture, given at Southampton University on February 10 2016. It builds on my thinking over many years, going back to my essay Damn the Constitution in 2002. It also owes much to research and thinking done for a lunchtime lecture organised by Digital Repository of Ireland in Dublin in September 2013 and a keynote talk I gave at the UKSG conference in Harrogate, April 16 2014 which was subsequently published in the UKSG journal. It’s also indebted to long conversations and arguments and debate with Wendy Hall, Nigel Shadbolt and other luminaries of Web Science.
In the age of electronics an open society, one in which questions can be asked, where critical thinking is not just permitted but encouraged and where investigation rather than ideology is used to seek out the truth about the world — the open society according to Karl Popper — has also to be an open data society because reusable, structured data has become the main machine for doing the heavy lifting of moving knowledge around, just as books move ideas around.
The open Web is the most visible expression of that open data society, but it is increasingly undermined by the efforts of government on one side and commercial interests on the other, squeezing the public space occupied by civil society. Web science, grounded in the study of the open network, offers an opportunity both to study the impact of this shift and to propose countermeasures. In his talk Bill Thompson will argue that we can use the tools of Web science to design and build a better and more resilient Web — but that we must move quickly or there will be nothing left to save.
In Search of The Open Web
The open data movement is predicated on the view that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other legal or technological mechanisms of control. It is most succinctly expressed in the Open Definition, which states that ‘a piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike’.
This is commonly interpreted around three separate axes. The first concerns availability and access: the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the Internet. The data must also be available in a convenient and modifiable form. Second, it covers reuse and redistribution: the data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets. The data must be machine-readable.
Finally, the Open Definition mandates universal participation: everyone must be able to use, reuse and redistribute — there should be no discrimination against fields of endeavour or against persons or groups. This means that ‘non-commercial’ restrictions that would prevent ‘commercial’ use, or restrictions of use for certain purposes (e.g. only in education), are not allowed.
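For readers who think in code, the three axes can be written down as a simple checklist. What follows is a minimal sketch, not anything taken from the Open Definition itself: the `DatasetTerms` class, its field names and the `failed_axes` function are all hypothetical, invented here purely to illustrate how the tests described above fit together.

```python
from dataclasses import dataclass


@dataclass
class DatasetTerms:
    """Hypothetical summary of how a dataset is published (illustrative only)."""
    available_as_whole: bool       # the whole dataset can be obtained
    reasonable_cost: bool          # no more than a reasonable reproduction cost
    modifiable_form: bool          # convenient and modifiable form
    machine_readable: bool         # can be processed by a program
    allows_reuse: bool             # reuse and redistribution are permitted
    allows_mixing: bool            # may be intermixed with other datasets
    non_commercial_only: bool = False   # a 'non-commercial' restriction applies
    restricted_purposes: tuple = ()     # e.g. ("education",)


def failed_axes(terms: DatasetTerms) -> list:
    """Return the names of the axes on which the terms fall short of openness."""
    failed = []
    if not (terms.available_as_whole and terms.reasonable_cost
            and terms.modifiable_form):
        failed.append("availability and access")
    if not (terms.allows_reuse and terms.allows_mixing
            and terms.machine_readable):
        failed.append("reuse and redistribution")
    # Any limit on who may use the data, or for what, breaks the third axis.
    if terms.non_commercial_only or terms.restricted_purposes:
        failed.append("universal participation")
    return failed


# Two made-up examples: one fully open, one with a non-commercial clause.
open_terms = DatasetTerms(True, True, True, True, True, True)
nc_terms = DatasetTerms(True, True, True, True, True, True,
                        non_commercial_only=True)
```

Under these assumptions `failed_axes(open_terms)` returns an empty list, while `failed_axes(nc_terms)` reports only the universal participation axis, which is exactly the case the ‘non-commercial’ exclusion is meant to catch.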
The Web is a social machine, and can be the topic of scientific study to the point where we can formulate hypotheses, make and test predictions and refine theories which in turn allow us to explain, predict and influence the system. Like cognitive computing, the Web is a hybrid system with human and technical components, and the nature of their interaction changes over time — behaviours adapt, and technologies are developed, mature and are replaced. The ethnography of the Web, and a clear explanation of the emergent behaviours and properties that we observe, has become vitally important to any effort to understand society as a whole, since the Web is now a fundamental part of societal infrastructure.
The area of the Web that interests me most is the ‘open’ Web, the collection of resources accessible through URIs that are fully conformant with Web standards and W3C recommendations. An open Web is one that embodies the Open Definition in its working practices, one that is available and accessible, open to all participants, and which offers services that can be freely shared. It has become one of the facilitators of an open society and like other open institutions it is under threat as a bulwark of liberal humanism in a world increasingly shaped by theocracy and market fundamentalism.
Open data sits at the core of the open data society, and an open Web is the communications medium that is optimal for the creation and continued survival of such a society.
The term ‘open data society’ is my play on the formulation that philosopher Karl Popper originated in his book The Open Society and Its Enemies, written during the Second World War and first published in 1945. For Popper, an open society was not a description of a political system but rather an approach to what a society considered possible — an epistemological rather than political question.
In Popper’s view an open society is one that is open to challenge and open to different points of view instead of being grounded in unchallengeable authority, whether religiously derived or imposed by a political ideology. This view comes from his philosophical work and his own theory of knowledge, since if knowledge is provisional and fallible this implies that society must be open to alternative points of view because the ‘facts’ on which it appears to be based may themselves be found to be false. An open society allows cultural and religious pluralism; by contrast closed societies are grounded in claims to certainty and an imposition of a particular version of reality, where freedom of thought is dangerous and must be suppressed and only certain forms of intellectual exploration are permissible.
When he wrote The Open Society and its Enemies Popper believed that the social sciences had failed to grasp the significance and the nature of fascism and communism because they could not understand how those types of societies understood the world. He argued that totalitarianism forced knowledge to become political and that this made critical thinking impossible and led directly to the destruction of knowledge in totalitarian countries. In the book he criticised philosophers like Plato, Hegel and Marx who he thought had laid the framework for totalitarianism through their inability to understand the real significance of the freedom to challenge any proposition.
The Age of Electronics
We live in an age of electronics, where many aspects of daily life are shaped — for good or ill — by the capabilities of machines that rely on the flow and detection of tiny electric currents and the opening and closing of silicon-based switches. The things these technologies can do are truly astonishing, and their application has transformed the lives of us all — not just those who have easy access to the latest shiny toys but even those who live in poverty and may never themselves hold a mobile phone or computer or share information over the Internet.
As a result conversations around openness are closely linked to conversations about the Internet, not merely because the net has over the last thirty years been one of the principal channels through which ideas of openness have permeated the technology world and influenced politics and popular culture, but also because it is hard to imagine open data thinking having the impact it has had without a channel that provides easy access to that data and the results of its use, and the net is that channel.
The net is a channel, but the World Wide Web is the social machine that uses that channel to transform the forms of society that are available to us. Taken together, open data, the Internet and the Web have the potential to change the world. This happens because the types of knowledge that open data makes possible, which an open Internet makes shareable and which an open Web makes accessible, support the sorts of open society that Popper was concerned with, so that any society that fully embraces the open data and open knowledge manifestoes would find it difficult to be closed in the Popperian sense.
That doesn’t mean it would be a good society, or nice to live there, or that it would not be evil. It just would be hard for it to be closed and remain closed.
We Can’t Rely on the Internet
An open data society is what happens when Karl Popper’s vision of the open society meets the Internet and the Web, although its emergence and success are far from guaranteed, not least because many players have an investment in closed data, closed networks and closed thinking.
The experiment that is the open data society started because of the largely unanticipated consequences of the global adoption of a set of technologies that were built around an assumption of openness without any real concern for the broader impact. Those technologies are the ones that have given us today’s network, and continue to develop.
Today’s Web is a vast, unregulated, worldwide experiment in openness, but the experiment does not come without risk. The push towards open data and the desire to build structures of scholarship, regulation and governance on top of the assumption that data will be open is one aspect, but it may well prove to be the most significant since it creates a real possibility that we will refactor modern society and find a way to build social structures on a new set of assumptions, just as the Enlightenment replaced religious catechism with the results of scientific investigation in large parts of the world five hundred years ago.
That does not mean we can necessarily predict how the technology will develop. Popper argued against ‘historicism’, the idea that there was a flow to history and that there were core beliefs that could not be challenged. It is just as important to avoid technological essentialism, and accept that a programme or a dataset has no essential values and no essential qualities.
Asking ‘what is this data for?’ is, pace Wittgenstein, as useful as asking ‘what is a table?’ Open data is a tool through which political power can be exercised in various traditional and non-traditional ways, and at the same time it defines a contested zone where politics is done; the open Web is a space for engagement, a new venue within which civil society can congregate, and like Tiananmen Square in 1989 or Tahrir Square in 2011, it can permit forms of action which challenge authority.
It is clear that we cannot simply pull down the walls to the unimpeded flow of information and expect no consequences. No technology exists in a vacuum, and the growing use of powerful digital computers connected by an ever-faster and ever more pervasive network offering gateways to vast amounts of structured data requires us to ask hard questions about the ways they will be used to shape society.
The legal, regulatory, political and financial frameworks that define modern society are not necessarily amenable to the emergence of a working open data society. Openness is fragile, open data doubly so, and the open society always subject to challenge from those who would lock down APIs or impose rigid ideologies.
Those whose businesses rely on limiting people’s ability to copy and modify songs or images or video — the ‘content industries’ — find it hard to cope with openness, but so do those who want to manage the free flow of information for reasons that are not simply commercial, such as the doctors who keep our medical records or the companies storing personal emails, or those who make money by marshalling academic papers and selling subscriptions.
This is the conflict that lies at the heart of the open data society and creates enemies of the open Web. It is not a technological issue and will not be solved by technology. It is at its core an issue of epistemology, a question of how we know the world, a question that comes before we ask how that knowledge can be applied and used.
If we want to live in an open data society then we have to build it, and if we want an open Web then we have to build it, too. Which means that those advocating openness will inevitably end up taking sides in the conflict between those who believe the first part of Stewart Brand’s famous epithet
Information wants to be free
And those who prefer the less well-known second half
Information wants to be expensive
If we think about the open Web as an extension of liberal society to an IP-based world, we can begin to see why many actors are fundamentally suspicious of or opposed to the open Web. They include Government, commercial actors, rights holders, theocrats (of all faiths), censors and fascists (of all political persuasions).
All have reasons to limit online freedoms and to restrict creative and challenging uses of the Web. All seek to monitor and thereby control the flow of ideas, topics of debate, creative expression and political activity — all things which the open Web supports.
What do We Need from Web Science?
Science and politics are inextricably linked, but in general only the politicians are comfortable with this, using science to justify policy (and ignoring it when it is expedient to do so) without considering it.
But Web science can be applied to the creation and sustenance of an open Web, if those involved in the work choose to engage.
First, however, you have to take sides against the dangerous trend of turning the Internet into a zone of military engagement, where it has been weaponised, compromised and subverted by the governments, secret services, police forces and military agencies of all countries.
One tool to use here is of course the identification and labelling of the weapons, and here there may be scope for the Web equivalent of a ‘Department of War Studies’, considering how to observe and make public the military uses of the web.
On a more positive note, work needs to continue to define ‘goodness’ — to determine metrics that define a ‘healthy’ Web, and use them to monitor its health over time. At the same time we need to build defences, and explore the characteristics of the web that help preserve openness, determining which attributes are core and must be defended.
Underpinning this must be a willingness to shape policy and shape technology: if we are going to programme the social machine from within we need to know which levers to pull, but we also need a political programme to drive the technical programming.
Let’s start on the political theory of the open data society and the ways the open web can sustain it, and see how many social problems we can solve from that perspective — and try to do it without descending into the solutionism decried by Evgeny Morozov.
We will only have an open Web if we build and defend it against those who see in it challenges to their authority. They are right to fear it. We are right to build it.
 The Open Definition notes that ‘A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike’. See http://opendefinition.org/
 See for example the analysis in Hafner & Lyon (1988)
 An API is an ‘application programming interface’, the way a computer program accesses a data set either locally or over the network.
 The earliest recorded occurrence of the expression was at the first Hackers Conference in 1984. Brand told Apple Computer co-founder Steve Wozniak:
On the one hand information wants to be expensive, because it’s so valuable. The right information in the right place just changes your life. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time. So you have these two fighting against each other.
 Or ‘controlled’. The two are equivalent in many online settings.