The Web was Never Decentralized

𝖉𝖜𝖍
Published in Design Warp
Jul 31, 2020

There are so many people today focused on “re-decentralizing the web.” They have a popular belief that when the web was invented it was a wonderfully optimistic vision of decentralization, governed by democratic principles and full of free information available through open access that all of humanity benefited from. They assume that originally all users of the web were well behaved and companies only wanted to help make the world better. Also, they think that somewhere along the line the web fell under the control of irresponsible corporations and governments and was contorted into the “broken and centralized” system that it is today.

The only problem with this narrative is that none of it is true. Zero. Zilch. Goose-egg. Nichts. Nada, なし。The web was never decentralized and the for-profit centralization and surveillance of the web started almost immediately after it was created because it was baked into the design. The web we have today is the logical manifestation of continually improving the original design. With almost no decentralized solutions for any of the nine problems of distributed systems, the global surveillance capitalism system was absolutely inevitable from the very first day that Tim Berners-Lee started writing code.

The web cannot be fixed. It cannot be “re-decentralized”. No amount of tweaking will make it decentralized. The entire stack needs to be reinvented using fully decentralized solutions designed around the principles of user sovereignty to have any hope of making things better. And if we did reinvent the stack, what would we have? I don’t think it would look like the web we know, but it would be better in every way.

If you want to refresh your memory before the analysis begins, here is the original proposal for the world wide web.

Diagram from the original World Wide Web proposal

Getting the Story Straight

I spent the day today grep’ing the internet for stories about the “re-decentralize the web” movement. There are many of them and nearly all of them repeat the popular lie about the original nature of the web. I found two exemplary articles to illustrate my point.

The first article is Brewster Kahle’s Decentralization: the next big step for the world wide web. Right away the author gets the history of the web wrong when he says:

In the early days of the world wide web, which came into existence in 1989, you connected directly with your friends through desktop computers that talked to each other. But from the early 2000s, with the advent of Web 2.0, we began to communicate with each other and share information through centralized services provided by big companies such as Google, Facebook, Microsoft and Amazon.

So, web centralization started in the 2000s and it was the fault of Google, Facebook, Microsoft and Amazon? Really? This is not true. The first wave of centralization of the web was promulgated by America Online (AOL) in the early 1990s. By the mid 90s, more than half of all internet users in America were AOL subscribers. AOL had an all encompassing “walled garden” that wasn’t separated from the dial-up and connect functionality. To connect to the internet, users were forced to run AOL’s version of it and for many users AOL was the internet.

The gate-keeping done by AOL drastically reduced the sovereignty of its users and it didn’t go unnoticed. The Wikipedia article on AOL mentions an article from the 1990s that describes how customers were mad even back then:

There have been many complaints over rules that govern an AOL user’s conduct. Some users disagree with the TOS, citing the guidelines are too strict to follow coupled with the fact the TOS may change without users being made aware. A considerable cause for this was likely due to alleged censorship of user-generated content during the earlier years of growth for AOL.

Why do the complaints from the 1990s about “…the TOS may change without users being made aware.” and “…censorship of user-generated content…” sound so familiar? Oooh, that’s right, because the community guidelines and censorship rules keep changing on Twitter, Twitter again, Facebook, Facebook again, YouTube, and YouTube again!

Nothing has changed. AOL did all of the same things back in the 1990s that Facebook, Twitter, and YouTube do today. The web was never decentralized. Web users have always been subject to gate-keeping, censorship, and monetization by large tech companies.

Lana Kane

The second article that I found is Ruben Verborgh’s Re-decentralizing the Web, for good this time. Nice title. Reading it made me think of only one thing: Lana. Here’s why. The author doesn’t hesitate at all. The very first line is the lie: “Originally designed as a decentralized network, the Web has undergone a significant centralization in recent years.” Again, the web has always been centralized. It only felt decentralized in the early days because it was inefficient.

Further down the article the author sort of goes in the right direction when he states:

The concept of centralization does not pose a problem in and of itself: there are good reasons for bringing people and things together. The situation becomes problematic when we are robbed of our choice, deceived into thinking there is only one access gate to a space that, in reality, we collectively own.

On second thought, I disagree with a lot of this. I would argue that any distributed system without solid decentralized solutions opens itself up to corporate capture and centralization. The profit to be earned from that centralization is a very strong motivator, which makes the eventual capture, the resulting robbery of our choice, and our subjugation to gate-keeping inevitable.

Centralization will always put profits over user sovereignty. It’s built into the venture capital deals and corporate fiduciary responsibility. Profit will always trump users’ interests.

What’s Not Helping

Tim Berners-Lee (TBL), the inventor of the web, wrote a post a few years ago outlining what he thought were three major challenges to an open and free web:

  • We’ve lost control of our personal data.
  • It’s too easy for misinformation to spread on the web.
  • Political advertising online needs transparency and understanding.

I think the first point is an actual challenge; the other two are reflections of his own personal politics. What is important to point out is that all three are symptoms, not the problem. The problem lies with the fundamental design of the web and the web browser and the only way to fix it is to re-invent it from the ground up following the principles of user sovereignty to get a fully decentralized system for exchanging data on the internet.

What Went Wrong?

Pretty much everything went wrong from the start. To refresh everybody’s memory here are the nine problems of distributed systems:

  • Discovery
  • Introduction
  • Coherence
  • Public Services
  • Trust
  • Privacy
  • Coordination
  • Membership
  • Persistent State

Of these nine problems, only three — coherence, coordination and persistent state — were solved in an almost decentralized way. The other six weren’t even addressed. I’m not casting aspersions on TBL. It was impossible for him to be malicious when he and everybody else were just ignorant. Back then nobody had a coherent model and set of values for distributed system design. He was shooting from the hip and can be forgiven for the shortcomings in the proposal. The point is, the lack of decentralized solutions back in 1989 guaranteed the creation of the centralized surveillance capitalism web that we have today.

Let’s go through all of the problems and talk about how the web did or did not address them.

Discovery

Discovery is the process by which new users/nodes find and connect with other nodes to form a network or to join an existing one. In the case of the web, you had to know the domain name of the web server ahead of time. Back in the summer of 1992, I personally had a sheet of 8.5" x 11" lined paper with the domain names of every web server in the world. If I recall correctly, it was around thirty or forty servers. Shortly after that, the publications that tracked the BBS world began publishing directories of web server domain names and it all culminated in the publishing of annual “internet guides” containing reviews and highlights of places to go on the web. This is a fun read if you want to know how us old-timers did it.

The discovery problem was quickly solved by corporations creating centralized services. Back in 1994 we saw the rise of the first big search engines: WebCrawler, Lycos and Infoseek. In 1995 came Yahoo!, AltaVista, and Excite. Google didn’t start until 1998. It was Infoseek, one of the very first search engines, that invented the business model around selling ad impressions on the web and mining the data of the searches. From its infancy, the web was not safe if you wanted to avoid surveillance.

Of course we all know the impact Google has on the web today. They weaponized what Infoseek invented and it was all possible due to the lack of a decentralized discovery solution from the start.

Introduction, Trust and Privacy

Introduction, Trust and Privacy weren’t addressed at all in the early web. It wasn’t until the Secure Sockets Layer (SSL) protocol in 1995 and later the Transport Layer Security (TLS) protocol in 1999 that the web began using cryptography to solve these problems. The only issue with the solution is the top-down, centralized nature of the Certificate Authority (CA) system. Web site operators had to pay lots of money and undergo scrutiny to be issued an official certificate that web browsers recognized. This centralized rent-seeking and gate-keeping slowed the adoption of SSL/TLS and as a result the web wasn’t mostly encrypted until 2020. Why did it take so long and what changed to make it happen? The Let’s Encrypt Project happened. Started in 2012, the Let’s Encrypt Project began giving away TLS certificates for free. By 2020 they had given away more than one billion certificates and more than 90% of all web page loads used encryption.

However, all is not puppy dogs and rainbows. The centralized CA system has been abused by governments around the world to surreptitiously spy on web traffic that users thought was encrypted. By forcing state telecoms and other root certificate authorities to issue false certificates for domain names such as google.com or facebook.com, government surveillance authorities can trick browsers into thinking a connection to a web server is encrypted and private when there really is a government spy in the middle. I can’t imagine how many people have been fined or imprisoned in corrupt countries using this spying technique. The web’s lack of a decentralized solution for these problems puts some web users at risk, even today.
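To make the trust model concrete, here is a minimal sketch using Python's standard `ssl` module. It only inspects the default client settings and makes no network connection. The function name `describe_default_trust` is my own, and the number of trusted roots depends entirely on the machine it runs on. The point is that the client accepts any certificate chain that ends in one of the pre-installed roots, so every one of those roots is in a position to vouch for any domain name.

```python
import ssl


def describe_default_trust():
    """Summarize what a default TLS client context trusts on this machine."""
    ctx = ssl.create_default_context()  # client-side defaults for verifying servers
    stats = ctx.cert_store_stats()      # counts of certificates currently loaded
    return {
        # The server certificate must match the host name being contacted.
        "hostname_checked": ctx.check_hostname,
        # The connection fails unless a trusted chain is presented.
        "cert_required": ctx.verify_mode == ssl.CERT_REQUIRED,
        # Number of CA certificates loaded so far; varies by operating system.
        "trusted_ca_count": stats["x509_ca"],
    }


if __name__ == "__main__":
    print(describe_default_trust())
```

Nothing in these defaults lets a user say "only this CA may speak for this site". That decision is made for them by whoever ships the root store.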

Coherence, Coordination and Persistent State

To its credit, the web does completely solve the coordination problem. HTTP provides all of the capabilities needed for communication within the system. The other two problems of coherence and persistent state are the ones I identified as being half-solved in the original web proposal. Clients that wish to reconnect to a web server (i.e. coherence) only need to know the domain name of the server and their computer is able to get the updated IP address of the server. The persistent state solution is the storage of content in HTML files on those servers. The web’s storage model is decentralized. However, both solutions are only half-solutions because they both rely heavily on the centralized Domain Name System (DNS). DNS has been used by governments and abused by hackers to deny access to web servers and/or hijack them and take them over.
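The dependency on name resolution can be sketched with a toy model. This is not real DNS: `CentralResolver`, `connect`, and `demo` are hypothetical names, and the addresses come from the reserved documentation ranges. It only illustrates the reasoning above, which is that a client that reaches servers by name will go wherever the name authority points it, or nowhere at all.

```python
class CentralResolver:
    """Toy stand-in for DNS: a single authority maps names to addresses."""

    def __init__(self):
        self._records = {}

    def register(self, name, address):
        self._records[name] = address

    def remove(self, name):
        self._records.pop(name, None)

    def resolve(self, name):
        return self._records.get(name)


def connect(resolver, name, servers):
    """Return the content served for a name, or None if the name is denied."""
    address = resolver.resolve(name)
    if address is None:
        return None  # the server still exists, but nobody can find it by name
    return servers.get(address)


def demo():
    # Hypothetical servers, keyed by documentation-range IP addresses.
    servers = {"192.0.2.10": "real site", "198.51.100.66": "impostor site"}
    dns = CentralResolver()

    dns.register("example.org", "192.0.2.10")
    normal = connect(dns, "example.org", servers)

    dns.register("example.org", "198.51.100.66")  # hijack: the record is rewritten
    hijacked = connect(dns, "example.org", servers)

    dns.remove("example.org")  # denial: the record is deleted
    denied = connect(dns, "example.org", servers)

    return normal, hijacked, denied
```

In all three cases the web server and its HTML files never changed. Only the entry held by the name authority did.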

Public Services

In the case of the web, public services are web sites that offer publishing of content. The web design had no easy way for new users to publish content and in the beginning it was a daunting task. First you had to buy a domain name that wasn’t cheap. Then you had to set up a computer that would never crash and would stay running even when the power went out. Next you had to get a web server set up and then after all of that you had to hand write an HTML file for your web server to serve to web browsers.

The original web proposal assumed that the web would only ever be used by academics and business people with access to mainframes that were cared for like prized ponies. It never occurred to TBL that his vision of a “world-wide web” necessarily included all of the people in the world having access. Did he expect that everybody would be tied to a university or a business to be able to publish and contribute? No. He thought the web was mostly read only. He has even said as much on multiple occasions. His new work is around building a read/write web. Even he knows the lack of a decentralized public service for publishing content is a major design flaw.

This non-solution left a gaping hole so large that the economic opportunity for centralization by corporations created more billion dollar companies than anything else in history. The first company to tackle it was GeoCities. Back in 1995 they began offering a free space for users to create their own web page. By 1997, GeoCities began placing advertisements on users’ pages and despite the negative reaction, the site continued to grow. In 1999, GeoCities was acquired by Yahoo for $3.57 billion in stock and it was the third most visited site on the web.

Many other companies followed to fill in finer niches of the web publishing market. Blogging took off with the launch of LiveJournal and blogger.com in 1999. YouTube for videos in 2005. Twitter, in 2006, for voyeuristically watching celebrities and crushes and arguing with political enemies. SoundCloud for music in 2007. GitHub for open source software in 2008. The list goes on. All of these companies are, or have been, valued at more than a billion dollars.

Those companies are now centralized and entrenched castles of surveillance capitalism. They use the features of web servers and web browsers to track users everywhere they go and to monetize that data. In the case of GitHub being purchased by Microsoft, the centralization threatens the independence of large swaths of the open source software community.

Imagine if the web had been designed differently. What if the web browser had also been a web server? Just downloading it would make it simple to browse and publish. Would we have seen the rise of all of those companies? Probably. Why? Because the layers below the web — the domain name system, HTTP, etc — also had to be in place. But assuming there was a ubiquitous fabric for storing the published data and keeping it available, it would have only taken a web browser and server combination — the read/write web as TBL puts it — and all of those companies may have been entirely unnecessary.
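As a rough illustration of how little code the "browser that is also a server" idea needs, here is a sketch using only Python's standard library. One process publishes a page from the local machine and then reads it back over HTTP. The names `PAGE`, `PublishHandler`, and `publish_and_browse` are made up for this example, and it deliberately ignores the hard parts mentioned above: naming, reachability from outside the machine, and keeping the data available when the machine is off.

```python
import http.client
import http.server
import threading

PAGE = b"<html><body><h1>Published from my own machine</h1></body></html>"


class PublishHandler(http.server.BaseHTTPRequestHandler):
    """Serve the same page for every GET request (the 'write' half)."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def publish_and_browse():
    """Start a local server on a free port, fetch the page, then shut down."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), PublishHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", "/")  # the 'read' half
        response = conn.getresponse()
        body = response.read()
        conn.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()
```

Publishing and browsing are symmetric at the protocol level. What the original design lacked was everything around this loop that would let an ordinary user's machine play the server role reliably.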

Membership

The last major non-solution in the original design of the web deals with authentication and authorization. The membership problem is how users of a distributed system form relationships with other users, granting them a higher level of privileges. This can be an employee or admin accessing private company documents. It can be friends sharing photos privately. The membership problem assumes graduated privileges based on the authentication and authorization mechanism a system uses.

The web didn’t get authentication until 1997. It operated for nearly a decade without any way to differentiate users and delegate privileges. When authentication finally came, it was bound to single domain names and it was based on things users know (e.g. username and password). Because it was so intimately tied to the centralized domain name system, there was no way a user could be issued a credential that was portable across different web sites.
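For a sense of what "things users know" looks like on the wire, here is a small sketch of the HTTP Basic scheme, assuming that is representative of the early mechanisms described above. The helper names `basic_auth_header` and `parse_basic_auth_header` are my own. The credential is just the username and password, base64-encoded and sent to one server, which is why it means nothing to any other site.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an Authorization header for HTTP Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


def parse_basic_auth_header(value):
    """Recover (username, password) the way a server would."""
    scheme, _, token = value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password
```

Base64 is an encoding, not encryption, so anyone who sees the header can read the password. Each site also keeps its own list of accounts, so there is nothing here a user could carry from one domain to another.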

The solution didn’t really come until 2005 with the invention of the OpenID authentication protocol and later in 2009 with OAuth. Together they allow for cross-domain authentication, letting users log into sites using their Twitter, Google, or Facebook credentials. This too has many serious problems because of the foundation it is built upon. It is tied to the domain name system and to large social media platforms because it doesn’t use portable authentication credentials.

As a consequence, if you use your Twitter credentials to log into many different sites and then you say the wrong thing on Twitter and get banned, you lose access not only to the data you uploaded to Twitter but also to the data you uploaded to the websites you use your Twitter credentials with. The same goes for logging in with Facebook and Google credentials. The best solution for authentication and authorization on the web today is a centralized one that magnifies the censorship power of the large social media platforms. What could possibly go wrong?
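The cascade can be shown with a toy model. It is not an implementation of OpenID or OAuth, and `IdentityProvider`, `Site`, `demo`, and the site names are all hypothetical. It only captures the dependency: every site asks the same provider whether the user may come in, so one ban at the provider locks the user out everywhere.

```python
class IdentityProvider:
    """Toy login provider that every relying site defers to."""

    def __init__(self):
        self.banned = set()

    def ban(self, user):
        self.banned.add(user)

    def authenticate(self, user):
        return user not in self.banned


class Site:
    """Toy web site that stores user data but outsources login."""

    def __init__(self, name, provider):
        self.name = name
        self.provider = provider
        self.data = {}

    def upload(self, user, item):
        if not self.provider.authenticate(user):
            raise PermissionError(f"{user} cannot log in to {self.name}")
        self.data.setdefault(user, []).append(item)

    def read(self, user):
        if not self.provider.authenticate(user):
            raise PermissionError(f"{user} cannot log in to {self.name}")
        return list(self.data.get(user, []))


def demo():
    provider = IdentityProvider()
    sites = [Site("photos.example", provider), Site("blog.example", provider)]
    for site in sites:
        site.upload("alice", f"something stored at {site.name}")

    provider.ban("alice")  # one decision, made in one place

    locked_out = []
    for site in sites:
        try:
            site.read("alice")
        except PermissionError:
            locked_out.append(site.name)
    return locked_out
```

The data is still sitting on both sites after the ban. The user simply has no credential left that either site will accept.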

It didn’t have to be like this. The decentralized identity community is building the new standards around portable, verifiable credentials. They have shown us that we can have credential “wallets” and use them across sites and remain fully in control of those credentials. With this new system for creating trusted connections and verifiable credentials users maintain their sovereignty.

Conclusion

In the light of experience hard won over three decades, it is now possible to see exactly what went wrong with the web and when. It was not centralization by big tech companies like Facebook and Google. It did not happen in the 2000s. The design flaws were baked into the cake from day one. It didn’t become worse; the flaws became more obvious.

So?! How Do We Fix It? Where Do We Go From Here?

I know I like to joke that it’s time to burn down the social media platforms and break up the giant tech companies but that’s just cathartic ranting. The best news I have to give you is that we only have to ignore them and build the internet we want from the ground up. I reject the status quo. It is time to build a ubiquitous mix net that is pseudonymous and private and encrypted by default. It is time to build new user sovereign services that run on top of it. I no longer hate the social media platforms; I don’t care about them anymore. The opposite of love isn’t hate, it is indifference.

Do you still use Usenet groups? How about Gopher? When was the last time you chatted with a friend using ICQ or AIM? How about the last time you played a Flash game on the web? Technology evolves and obsolete technologies fade away. The interesting thing about Usenet and Gopher and ICQ and Flash is that their users liked those tools even when they switched away. They switched away because there were better alternatives, not because they hated them.

Tons of Facebook, Twitter, and YouTube users actually dislike them, and many hate them, for how they treat their users. The arbitrary censorship and demonetization coupled with the creepy surveillance and emotional manipulation has created a very large and vocal group of anti-users. Many anti-users have moved on to alternative services such as Gab, Mastodon and Bitchute but that won’t bring down the system. Why? Because the alternatives are just copycats. They aren’t convincingly better. They are constructed within the same constraints and bad architecture of the web that Facebook and Twitter and YouTube operate in. Users don’t believe Gab when they claim they won’t censor posts and spy on their users. Censorship and surveillance are built into the fabric of the web and therefore any alternative built using the web will also enable censorship and surveillance.

The only answer is to build a completely new stack from the ground up that is fully user sovereign and decentralized. Then, and only then, can we build information sharing and social applications that cannot be censored and cannot be used for surveillance capitalism and societal manipulation. Users will have the choice of what to see, hear, and read and what not to see, hear, and read. Freedom is messy. If you want to be free you will have to build your own fence, sweep your own sidewalk and work with your friends and neighbors to pick up the trash in your part of the internet.
