The lost infrastructure of social media.

More than a decade ago, the earliest era of blogging provided a set of separate but related technologies that helped the nascent form thrive. Today, most have faded away and been forgotten, but new incarnations of these features could still be valuable.

Anil Dash
13 min read · Aug 10, 2016

As social networks grew in popularity and influence, the old decentralized blogosphere fell apart and those early services consolidated, leaving all the power in the hands of a few private companies. That’s left publishers and independent voices even more vulnerable to the control points of a few social networks and search engines.

Open Features

The core capabilities in the early era of blogging acted as open features for any site, and helped popularize social media itself, regardless of what site the content appeared on. But many of these open features have either disappeared or exist only in proprietary versions on closed platforms today, which means they only work between sites that use the same tools to publish.

Below is a table offering a quick survey of these features. Note: The companies and products listed in each column are illustrative and are not intended to be comprehensive; there were many competitive services in the early era of blogging, and the few services that still exist generally live on as zombie services with almost no users today.

These features are still valuable, which is why closed networks like Medium, Tumblr and WordPress.com offer analogous features such as the ability to follow other users. These sites also typically provide “reader” features for seeing updates from friends. In the table below, * indicates capabilities that are no longer available across sites but may be provided within closed networks.

What could these features do?

Publishing

The core capability of publishing articles to the web remains fairly mature and robust, and has seen new energy since the release of great writing tools like Medium. There’s not much to add here, but we still see the same general divergence between widely available, simple tools that focus on ease of use and high-end, powerful tools that offer serious publishers a lot of structure and features for publishing.

Search

As extraordinary as it seems now, there was a point when one could search most of the blogs in the world and get a reasonably complete and up-to-date set of results in return. Technorati was a pioneering service here, and started by actually attempting to crawl all of the blogs on the Internet each time they updated; later this architecture evolved to require a “ping” (see Updates, below) each time a site updated. On the current internet, we can see relatively complete search results for hashtags or terms within Twitter or some other closed networks, but the closure of Google Blog Search in 2011 marked the end of “blog search” as a discrete product separate from general web search or news search. It’s easy to imagine that modern search software and vastly cheaper hardware make it possible to recreate a search engine for frequently-updated sites like news sites and blogs, with domain-specific features that general tools like Google News don’t offer.

Comments

In the early days of blogging, not every publishing tool supported comments natively; as a result, third-party commenting services popped up to meet the need. As the major tools incorporated their own commenting features, comment services came to be used primarily by big publishers using unwieldy content management systems that didn’t natively support commenting features. In the earlier era, comment systems were built without anticipating the ways that online communities would grow, and these serious design flaws enabled the widespread abuse that we see online today. Newer tools seem to be trying to put the genie back in the bottle, but large publishers are increasingly shutting down comments entirely rather than investing in building a healthy community.

Responses

One category of interaction between sites that’s nearly disappeared is the idea of structured responses between different authors or even different sites. Though Medium supports a limited version of this feature today, early tools like Trackback and Pingback made it possible for almost any site to let another site know that their story or article had inspired a response. Typically, those responses were shown under an article, similar to comments, but once Google introduced its advertising platforms like AdSense, links between sites suddenly had monetary value and spam links soon followed. A modern reinvention of Trackback-style features could connect conversations on different websites in the same way that @ replies work on Twitter.
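To make the mechanics concrete, here is a minimal sketch of a Trackback-style notification in Python. It assumes the conventional form of the protocol, in which the responding site sends a form-encoded HTTP POST to a per-article Trackback URL and gets back a small XML reply. The function names and every URL are hypothetical, and the request is only built, never sent.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request
import xml.etree.ElementTree as ET


def build_trackback_ping(trackback_url, post_url, title, excerpt, blog_name):
    """Build (but do not send) the POST that tells a site it inspired a response."""
    body = urlencode({
        "url": post_url,        # the responding article
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }).encode("utf-8")
    return Request(
        trackback_url,          # hypothetical per-article endpoint on the original site
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
        method="POST",
    )


def trackback_succeeded(response_xml):
    """Interpret the XML reply: <error>0</error> signals that the ping was accepted."""
    root = ET.fromstring(response_xml)
    return root.findtext("error", default="1").strip() == "0"


request = build_trackback_ping(
    "https://original.example/trackback/42",
    "https://responder.example/my-reply",
    "My reply",
    "Some thoughts on the original article",
    "Responder Blog",
)
print(request.get_method(), request.full_url)
print(parse_qs(request.data.decode("utf-8"))["url"])
```

Because the receiving site has to accept posts from anyone, a sketch like this also shows why spam followed so quickly: nothing in the exchange proves that the responding article actually exists or links back.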

Likes / Favorites

Though we think of liking, hearting, starring and favoriting things as one of the main actions on social media, these behaviors are a relatively recent phenomenon, only having risen in prominence over the last decade. As a result, they had no real “open” implementation that worked across different sites, and have only ever been built as features within closed networks. Favoriting or liking content remains a really important online social signal, so it’s easy to imagine that likes between different services could be connected, so a creator would know when someone had liked their work on another network or site.

Updates

A relatively technical feature of the open blogosphere was the ability for one site to let others know when it had been updated. This was important at the time because technologies for easily and efficiently crawling large numbers of sites were still relatively expensive. While the tech has gotten cheaper since then, the number of sites has increased massively, so this remains a challenging area, and there may well still be value in a site being able to tell other services that it has new or updated content. Indeed, a huge number of automated updates on services like Twitter are doing little more than sending such notifications; making a structured and trusted way to do so would still be worthwhile. Many blockchain enthusiasts also like the idea of connecting such update notifications to the blockchain to act as a trusted record of content being published, a capability that is genuinely new to this era of the Internet.
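As an illustration of how small such a notification can be, the sketch below builds an update "ping" as an XML-RPC call named weblogUpdates.ping, the convention that early ping services commonly used, taking a site name and a site URL. The helper name and the example site are hypothetical, and the payload is only serialized and read back here rather than sent to any service.

```python
import xmlrpc.client


def build_update_ping(site_name, site_url):
    """Serialize a weblogUpdates.ping call as an XML-RPC request body."""
    return xmlrpc.client.dumps((site_name, site_url), methodname="weblogUpdates.ping")


payload = build_update_ping("Example Blog", "https://blog.example/")

# Decode the payload again to show what a receiving service would see.
params, method = xmlrpc.client.loads(payload)
print(method)
print(params)
```

Sending it for real would be a single call through xmlrpc.client.ServerProxy pointed at whichever ping endpoint a service publishes; no particular endpoint is assumed here.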

Identity

Many of the key features of interacting with social media sites, especially commenting, were originally built with no real support for an identity or login system, in keeping with the early web’s extreme preference for anonymity and privacy. Once it became clear that identity systems could allow for persistent identities, support some forms of trust systems, and in some cases help build more accountable communities, early login systems were created to let people sign in when commenting. Due to abstruse technical goals and a desire not to infringe on user privacy, the user experience of these early services was very poor, and eventually they were generally replaced by simple, centralized systems like Facebook and Twitter sign-in. The unfortunate side effect of the poor user experience of those early sign-in systems was the entrenchment of dominant corporate identity systems, which enabled the mass surveillance of user behaviors by both the giant companies and governmental agencies. Given the abuses of those identities, it’s easy to imagine a modern incarnation of these sites using pseudonymous identities that aren’t tied to one’s high-visibility social networking profiles.

Friend Lists

Friend lists! It’s hard to imagine how important these were in the pre-social networking era of the web, but a simple blogroll linking to favorite sites or friends was a very effective way of both driving traffic and providing a bit of an identity for a site’s author. Initially, services offered a simple list of links, though this eventually evolved into a list sorted by the most-recently-updated sites. A further evolution came when specialized services like MyBlogLog introduced a list of people who had visited a site, complete with their avatars. These features together formed the precedent for the common lists on many of today’s social networks that show who you follow, and who follows you.

Following

The appearance of friend lists quickly suggested the ability for users to follow other users for updates. Though this was present from the earliest days of blogging on services like LiveJournal, the feature never truly made its way to publishing-oriented tools, and subsequent efforts to introduce open ways of following users on different sites have not taken off, remaining largely the domain of a small community that tinkers with technical standards. Of course, the closed-network model that LiveJournal pioneered remains a staple of today’s social networks, with following being a core feature on nearly every social platform. Following of course always implied something more than just getting delivery of another user’s content, and various efforts were made to codify and more specifically define the relationships between users who had followed or friended one another.

Syndication

At the same time as early social tools were developing the ability for humans to follow other humans, users were recognizing that they wanted their software and tools to be able to subscribe to updates from their favorite sites. This could be for the purposes of reading sites more easily (see Aggregation and Time Shifting & Reading, below), or for use in automation or analysis tools. Though there were various attempts at syndication formats in the early days of the web, RSS took off as the signature format for blogging, and a protracted standards battle (akin to AMP vs. Instant Articles today, but with a lot more vitriol) produced the Atom format, which basically did the same thing. Syndication formats were useful not just for being machine-readable but because producing content in these formats implied a certain set of reuse and transformation permissions that was useful for software creators. Today, Google (along with Twitter and other partners) is pushing AMP, Facebook is pushing Instant Articles, and Apple is advocating its Apple News format (which is based on RSS). Though these formats are being motivated by a desire to improve the mobile reading experience, their goal of strict machine readability and their suitability to the tasks of aggregation, time shifting and easy reading make them a direct analog to RSS and other formats that came before.
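To show what "machine-readable" meant in practice, here is a small sketch that reads an RSS 2.0 document using only Python's standard library. The sample feed, its URLs, and the parse_rss helper are made up for illustration; real feeds vary widely and often need a more forgiving parser.

```python
import xml.etree.ElementTree as ET

# A made-up feed using the core RSS 2.0 elements: channel, item, title, link, pubDate.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://blog.example/</link>
    <item>
      <title>First post</title>
      <link>https://blog.example/first</link>
      <pubDate>Wed, 10 Aug 2016 12:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://blog.example/second</link>
      <pubDate>Thu, 11 Aug 2016 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Return the feed title and a list of items as plain dictionaries."""
    channel = ET.fromstring(xml_text).find("channel")
    return {
        "title": channel.findtext("title"),
        "items": [
            {
                "title": item.findtext("title"),
                "link": item.findtext("link"),
                "published": item.findtext("pubDate"),
            }
            for item in channel.findall("item")
        ],
    }


feed = parse_rss(SAMPLE_RSS)
for entry in feed["items"]:
    print(feed["title"], "|", entry["title"], "|", entry["link"])
```

A dozen lines of parsing is all it takes for any tool to subscribe to any site, which is much of why the format spread as widely as it did.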

API

While a lot of content for the web is, and has been, written in web browsers, the unreliability of creating content in a browser has always driven a desire to create writing tools that connect to publishing systems. This desire for blogging apps has only increased with the move to mobile, given the increased likelihood of connectivity problems and the finicky nature of mobile browsers. In the earliest days of blogging, publishing systems supported APIs like Metaweblog and (later) the Atom API to allow apps to connect to their services and post or update content. These APIs were often limited, though, not exposing all of the features of a publishing platform, and making it difficult or impossible to manage other aspects of a site like embedding content, modifying design, or managing comments. Today’s newer publishing platforms typically have proprietary APIs with custom mobile client apps that offer robust features but with only limited feature access for third parties which want to publish or update content.

Metadata

It’s probably not surprising given that the web itself was born out of communities of researchers who wanted to share their work, but the early blogosphere was full of people who obsessed over “correctly” labeling and organizing content. A tremendous amount of energy went into encouraging adoption of various standards for metadata and content description, under initiatives like RDF, Dublin Core, and a broad set of efforts referred to collectively as the Semantic Web. Amongst these efforts, the only one to ever really see significant traction was the Creative Commons project, which enabled the explicit tagging of rights permissions on a media object. As the early efforts faded away, a surprisingly effective and pragmatic wave of new metadata offerings took their place. First, during the peak of the SEO era, HTML5 and its related microdata tools enabled smart ways of marking up content on a page. Then, as social networks became more dominant, their formats like Open Graph (Facebook), Cards (Twitter), and Schema.org (Google+ and others) made it easy to add common metadata like author credits and publishing information. These metadata hints were used to make nice displays like the thumbnail images in a Twitter Card or a Pinterest Pin, and that attractive presentation drove widespread adoption of the formats. It’s striking that we likely have author information provided as metadata on the majority of articles published today, but almost none of our reading tools expose this information in useful ways, or let us search or explore using the metadata.
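Exposing that metadata takes very little work. The sketch below pulls Open Graph style properties out of a page's meta tags with Python's standard HTML parser. The sample page, the author URL, and the extract_open_graph helper are hypothetical, and a real reading tool would also need to handle Twitter Card and Schema.org markup.

```python
from html.parser import HTMLParser

# A made-up page head carrying Open Graph style metadata.
SAMPLE_HTML = """<html><head>
<meta property="og:title" content="An example article">
<meta property="og:type" content="article">
<meta property="article:author" content="https://example.com/authors/jane">
<meta name="viewport" content="width=device-width">
</head><body></body></html>"""


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> and <meta property="article:..."> values."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith(("og:", "article:")) and "content" in attributes:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, attributes["content"])


def extract_open_graph(html):
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


for name, value in extract_open_graph(SAMPLE_HTML).items():
    print(name, "=", value)
```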

Discovery & Tagging

The flip side to metadata being provided in articles and stories is being able to discover content using that metadata. When Technorati was at its most effective, the simple text tags used to label articles (precursors to today’s social media hashtags, and similar in function to the article categories used on news sites) could be explored across various sites, so all articles about a topic were discoverable regardless of where they were published. Today, that capability exists in limited form on sites like Medium, WordPress.com and Tumblr, but it’s so hidden as to be almost invisible. The odd result of this is that we have trending topics in networks like Twitter and Facebook, where the vast majority of updates are short and trivial, but we don’t have easily explorable tags, hashtags or trending topics for articles and stories that are longer and more substantive. It’s not hard to picture how different a trending topics list could look if connected to major media sites and blogs, instead of just social network updates.
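A cross-site tag index of the kind described here can be sketched in a few lines. The example below is only an illustration under simple assumptions: the articles, sites, and helper names are invented, tags are matched after trimming and lowercasing, and "trending" just means the tags attached to the most articles.

```python
from collections import defaultdict

# Invented articles from three different sites, each carrying free-text tags.
ARTICLES = [
    {"url": "https://site-one.example/a", "tags": ["Blogging", "RSS"]},
    {"url": "https://site-two.example/b", "tags": ["rss", "search"]},
    {"url": "https://site-three.example/c", "tags": ["RSS ", "blogging"]},
]


def build_tag_index(articles):
    """Map each normalized tag to the articles that use it, regardless of site."""
    index = defaultdict(list)
    for article in articles:
        for tag in article["tags"]:
            index[tag.strip().lower()].append(article["url"])
    return dict(index)


def trending(index, top_n=3):
    """Rank tags by how many articles carry them, breaking ties alphabetically."""
    return sorted(index, key=lambda tag: (-len(index[tag]), tag))[:top_n]


index = build_tag_index(ARTICLES)
print(trending(index, top_n=2))
print(index["rss"])
```

The hard parts in practice are collecting the articles in the first place and resisting spam, not the indexing itself.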

Analytics

Many of the key improvements that happened to today’s major analytics tools like Google Analytics or Chartbeat began with features designed for social media and blogging. Measure Map was a small but well-designed tool meant to provide analytics to bloggers; its team was acquired by Google and went on to remake much of Google’s Analytics service in the image of their product. Similarly, the once-independent FeedBurner service provided detailed readership analytics for people whose content was distributed via RSS, but it too was folded into Google’s infrastructure after an acquisition, and the only similar features today are ones like Medium’s stats dashboard, which only shows activity within Medium’s network. It may well make sense for analytics features to be deeply integrated within a publishing platform, but it’s striking that most of today’s tools (with the notable exception of closed publishing platforms like LinkedIn) still don’t offer basic social insights on readers like those available from MyBlogLog and others a dozen years ago.

Advertising

It’s hard to believe, but much of the fundamental infrastructure of today’s online advertising economy was first beta-tested on blogs. Many of the earliest and most influential adopters of Google AdSense were independent blogs looking to make money on the web after the crash of the first wave of Internet advertising networks. Blog-centric display ad networks also thrived, first with relatively simple services like BlogAds, vertical networks like The Deck and BlogHer, and premium publishing networks like Federated Media. Over time, the online advertising ecosystem got more and more automated, and then evolved to incorporate first display ads and then video ads, and all of these changes disadvantaged small players. Entirely gone were the advertising and discovery platforms which prioritized readership over monetization, like web rings and other peer-to-peer tools. Today, Google’s domination of conventional web advertising is enormous, and mobile ad dollars are shifting rapidly to Facebook, which is trying to move both content and advertising entirely within its walls through the advance of Instant Articles. The overreach of intrusive and interruptive online advertising has hastened this transition as users increasingly adopt ad blockers to stop sites from slowing down their reading or exposing their data. Ironically, the simple text ads that blogs pioneered may come back into vogue in the new era of lightweight, mobile-optimized pages.

Aggregation

Just as Syndication and Following became core behaviors for early bloggers, so too did the ability to read updates from many different sites all in one place. My Yahoo was a pioneer of placing lots of headlines on one page, but early RSS readers like Bloglines and Userland Radio and, later, Google Reader, made for a unified reading experience that was as straightforward as checking email. It was very hard to tell whether this behavior would become popular: there just aren’t that many diehard news junkies, and they were overrepresented in the early days of social media. But some parts of this habit have become commonplace as people discover the majority of the content they consume online through the update feeds of their social networks. Today, Facebook and others provide some of the functionality of aggregation, particularly when reading Instant Articles, but the behavior of subscribing to any site and reading it in an aggregator app has shrunk to a tiny audience of legacy users.
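At its core, an aggregator merges items from many subscriptions into one reverse-chronological list. The sketch below shows that step alone, assuming each item already has a title and an RSS-style publication date; the site names, items, and merge_feeds helper are invented for illustration.

```python
from email.utils import parsedate_to_datetime

# Invented subscriptions: site name -> items with RSS-style (RFC 822) dates.
FEEDS = {
    "Site A": [
        {"title": "A1", "published": "Wed, 10 Aug 2016 12:00:00 GMT"},
        {"title": "A2", "published": "Fri, 12 Aug 2016 08:00:00 GMT"},
    ],
    "Site B": [
        {"title": "B1", "published": "Thu, 11 Aug 2016 09:30:00 GMT"},
    ],
}


def merge_feeds(feeds):
    """Flatten every subscription into a single timeline, newest item first."""
    timeline = [
        {
            "site": site,
            "title": item["title"],
            "when": parsedate_to_datetime(item["published"]),
        }
        for site, items in feeds.items()
        for item in items
    ]
    return sorted(timeline, key=lambda entry: entry["when"], reverse=True)


for entry in merge_feeds(FEEDS):
    print(entry["when"].isoformat(), entry["site"], entry["title"])
```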

Time Shifting & Reading

Closely related to Aggregation is the user behavior of Time Shifting & Reading. Given that, at the time, many sites were still experimenting with basic design concepts, the early web wasn’t always easy to read. Tools focused on reformatting articles for maximum readability, including the conveniently-named Readability and category-defining Instapaper, became popular with power users. These were slightly different than aggregator apps, because users would generally invoke them on an article-by-article basis, rather than subscribing to entire sites within them. Along the way, the apps provided both ad-blocking and time-shifting (offline caching) features aimed at further enhancing readability. It’s easy to see the influence of these tools on AMP and Instant Articles today, and it’s straightforward to imagine how this capability will be reborn using these modern formats.

The Lessons

Above all, it’s important to remember that there’s (almost) nothing new under the sun. Smart people have been thinking about how to publish on the Internet for as long as there’s been an Internet. Ideas come and go, sometimes disappearing because they’re not in fashion, other times because of the impact of the market or of a dominant player in the industry.

Ultimately, though, I think most of these ideas were good ideas the first time around and will remain good ideas in whatever modern incarnation revives them for a new generation. I have no doubt there’s a billion-dollar company waiting to be founded based on revisiting one of the concepts outlined here.

My hope is that those who are building tools today will see what’s come before and use it as inspiration to help give voice to people on the web in ways that are a bit more open-ended and a little less corporate-controlled than the platforms we have today.

Anil Dash

I help make @Glitch so you can make the internet. Trying to make tech more ethical & humane. (Also an advisor to Medium.) More: http://anildash.com/