Today the ideas we care about are increasingly pieced together on the web – bound not in the pages of books, but in the links between essays, articles, blog posts, tweets, and clips that we create every day. Links play a central role in how we understand information on the web, but surprisingly there is no central, public repository of them. Instead, a small group of companies – who profit from what we click – are responsible for finding links on the web, storing them privately, and deciding what to surface when we search and read.
These private companies now control how information is accessed on the web, giving them the power to influence what we see, how we feel, and even who we vote for. And this dangerous reality gets worse every day as the web explodes in size, and we become increasingly dependent on these companies to tell us what matters. It’s time to rethink how the web is connected, and to create a new kind of public library for linked knowledge.
One month before World War II ended, Vannevar Bush penned an essay in The Atlantic titled “As We May Think,” arguing that scientists should now turn their attention toward making all of our accumulated knowledge more accessible. He imagined a machine called the memex, which would allow us to connect pieces of information into knowledge trails that could be annotated and shared for deeper understanding.
“Wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails running through them, ready to be dropped into the memex and there amplified. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. The patent attorney has on call the millions of issued patents, with familiar trails to every point of his client’s interest.”
Today we rely on this idea of linked knowledge for almost every piece of information we consume on the web. We rely on links to instantly give us context on what we read, to help us understand what we see, and to tell us what’s important. Links are a critical part of the web, not just for us, but for the apps – the search and recommendation engines – that we’ve come to depend on to surface information and guide us through the noise.
When we search Google, we’re not actually searching the web – we’re searching the part of the web that Google has been able to find and index. There is no central repository of webpages on the Internet, so Google and others have created massive armies of crawlers, computer programs that hop from link to link, scanning for new webpages and storing them in Google’s private database of the web.
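To make the crawling idea concrete, here is a minimal sketch of a breadth-first crawler. It is an illustration only, not how Google's crawlers actually work: the toy web, the example URLs, and the names `crawl` and `fetch_links` are all assumptions, and a Python dictionary stands in for real network requests.

```python
from collections import deque

# Hypothetical toy web: each URL maps to the URLs it links to.
TOY_WEB = {
    "a.example/home": ["a.example/essay", "b.example/post"],
    "a.example/essay": ["b.example/post", "c.example/clip"],
    "b.example/post": ["a.example/home"],
    "c.example/clip": [],
    "d.example/orphan": ["a.example/home"],  # no page links to this one
}


def crawl(seed, fetch_links):
    """Follow links breadth-first from `seed`, recording each page once."""
    index = {}              # the crawler's private "database": url -> outgoing links
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url in index:
            continue        # already visited
        links = fetch_links(url)
        index[url] = links
        queue.extend(link for link in links if link not in index)
    return index


# In this sketch, "fetching" a page is just a dictionary lookup.
index = crawl("a.example/home", lambda url: TOY_WEB.get(url, []))
```

Because a crawler can only reach pages that something already links to, `d.example/orphan` never makes it into `index`, even though it exists on the toy web. That is the sense in which we search only the part of the web a crawler has been able to find.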
As we type words into Google’s search box, Google looks in its private database and uses proprietary algorithms like PageRank to prioritize related webpages, returning the results that it believes are most important to us, or that we are most likely to click. Similarly, when we read an article on the web, we’ll frequently notice a “more on this topic” or “recommended for you” widget on the side of the page. These widgets are often powered by companies like Taboola and Outbrain, who also crawl, scrape, and analyze links on the web, storing them in private databases, and surfacing them using proprietary algorithms. Like Google, these companies are paid by publishers every time we click on a headline they display.
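The core intuition behind link-based ranking can be sketched in a few lines. What follows is a simplified version of PageRank as it was publicly described, computed by repeated iteration over a toy link graph. It is not what Google runs in production; the graph, the page names, and the parameter values are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank; every link target must also be a key in `links`."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}    # start with equal rank
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "post" is linked to by three other pages.
toy_links = {
    "home": ["essay", "post"],
    "essay": ["post"],
    "post": ["home"],
    "clip": ["post"],
}
ranks = pagerank(toy_links)
top_page = max(ranks, key=ranks.get)
```

In this toy graph, `post` comes out on top because the most pages link to it. The ranking is derived entirely from links that people chose to create, which is why control over the link data matters so much.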
Google and others depend on us to link content in articles, photo captions, tweets, and navigation bars to power their search and recommendation engines. But all of that accumulated information that we create and that Google captures is kept private, only accessible via Google’s search box, and restricted to what Google decides we should see. Our dependence on the web grows every day, and this vicious cycle of information control has massive implications for how we see and understand the world around us.
We need to ensure that our information is freely accessible, and that the links between this information – this new format of linked knowledge – remain accessible, searchable, and explorable. Traditionally, public libraries have filled this role in our society, acting as guardians who provide unbiased and unfettered access to knowledge. But the libraries of today were not built to house this new kind of live, linked knowledge, nor were they intended to be accessed by both humans and machines.
We need a new kind of public library – where linked knowledge is easily accessible not just to us, but to the next generation of knowledge-based applications.
Better links help everyone, and ideally such a library or the features that enable it would be supported by standards bodies like the W3C, built into modern web browsers like Mozilla’s Firefox, and added to search engines like Google. But there’s no indication that will happen anytime soon. And we’re not waiting around.
At Wayfinder we’ve spent the past few years thinking about how linking and knowledge exploration on the web could work. We’ve studied the original proposals for linking on the web, and built new tools that enable anyone to publicly link content they find in a simple way. What’s special about this is that now curious readers – not just authors – can easily map knowledge and have a say in how the web is connected, without the burden of writing yet another piece of content. And as these tools lower the barrier to connect information, they’re being used to voice opinions, share inspiration, and explain ideas in a handful of languages around the world.
But it’s not enough. We can’t fix linking just by giving more people the ability to create public links. We need to create a better infrastructure for housing, accessing, and exploring those links. An infrastructure that’s built not just for people, but for the next generation of education, news, and recommendation apps.
To do so we’re building a new open database and API – a public library – that developers can access to create, augment, and query for webpages and their related links. Imagine a world where a link placed in this essay referencing an Atlantic article would be immediately visible to Atlantic readers, and instantly added to a public database that the Atlantic and other apps could easily access. Such a design would reduce the ability of any one company to control what we see, and increase our ability to freely and deeply explore the things we most want to know about.
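To give developers something concrete to react to, here is a minimal in-memory sketch of what creating and querying links might look like. It is not the API we are building: the names `Link`, `LinkLibrary`, `add_link`, `links_from`, and `links_to`, along with the example URLs, are all hypothetical, and a real service would need persistence, identity, and moderation that this sketch ignores.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str      # page the link was placed on
    target: str      # page the link points to
    note: str = ""   # optional annotation explaining the connection


class LinkLibrary:
    """Hypothetical public store of links, queryable in both directions."""

    def __init__(self):
        self._outbound = defaultdict(list)   # source url -> links
        self._inbound = defaultdict(list)    # target url -> links

    def add_link(self, source, target, note=""):
        link = Link(source, target, note)
        if link not in self._outbound[source]:   # skip exact duplicates
            self._outbound[source].append(link)
            self._inbound[target].append(link)
        return link

    def links_from(self, url):
        """Links placed on `url`."""
        return list(self._outbound.get(url, []))

    def links_to(self, url):
        """Links pointing at `url`, wherever they were created."""
        return list(self._inbound.get(url, []))


library = LinkLibrary()
library.add_link(
    "essay.example/linked-knowledge",       # placeholder URL for this essay
    "magazine.example/as-we-may-think",     # placeholder URL for the article
    note="Origin of the memex idea",
)
backlinks = library.links_to("magazine.example/as-we-may-think")
```

The important design choice is the second index: because links are stored by target as well as by source, the page being linked to can discover who is linking to it. That is the query a reader of the Atlantic article would need, and the one a privately held index does not expose today.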
These ideas around connected knowledge, imagined for centuries, have never before been so possible or so crucial to build. If you’re a developer, we’d love your feedback on how this database and API should work.