Rebooting Software




Complexity and cruft are the enemies of progress in software. As technologies and ecosystems mature, the original teams disperse to new projects or to found new companies. New people arrive as teams get larger, and products lose the focus that initially made them so successful. So bad ideas, ones that the developers intended to fix or change, or that no longer serve the needs of the current context, endure and thrive as “just the way things are”.

Paradigms freeze as new developers come to see the environment in which they learned as the “best”, “newest” or “latest”, and no longer subject to revision or improvement. Ideas which once made sense in the context of a former time no longer fit the present, yet few challenge the assumptions which now form the baseline for entire ecosystems and ways of thinking.

It takes more work for new entrants to meet the aggregate baseline of features that all of us come to see as the minimum acceptable. So new products take a long time just to reinvent and duplicate the features their developers know users will expect. Developers spend more time duplicating features from other apps and less time innovating. The relatively few success stories tend to be narrowly focused applications that do one new thing better than the alternatives. WhatsApp and Instagram are two of the better-known examples.

Global internet software, in particular, is built on a foundation of ideas and paradigms that limit our ability to move forward, because the architectural design decisions which gave rise to our current giants (Google, Apple, Facebook, Twitter, Linux et al.) are baked into the assumptions of almost everyone who codes anywhere. The success of a few large companies in building systems that scale is replicated across the industry, since “Google and Facebook have the best engineers”, so there’s supposedly no longer any point in lesser mortals looking for new approaches to old problems.

Further, in a networked world, the business scale advantages that accrue to the dominant players entrench those players in their markets in ways that make it very difficult for any new software to take root there. Competing with Facebook is about more than building better software; it becomes, of necessity, a battle to gain users at the expense of Facebook. It is the network of users itself that is the real value, not the particular application that Facebook uses to let us communicate with that network.

Without a critical mass of users you can’t compete, and the benefits that flow from a larger network are nonlinear. Software writers often refer to Metcalfe’s Law as a concrete expression of this benefit, but this only captures part of the problem. The bigger and more difficult problem is simply that without lots of users and lots of data, many potential applications are useless. No users, no benefits. No benefits, no users.

So many ideas which would be excellent at scale can’t ever reach scale because the starting hurdles are simply too high for small teams, even well funded ones, to jump.

New paradigms come when some small group of people seek to make easy that which was once hard, or even possible that which was once not. What is hard now? Getting to scale. Competing with the existing networks that have the Metcalfe advantage. Building robust software that fits together with other robust software that other teams wrote. What if we made all of that easy?

We could do this starting with the data. Imagine a shared infrastructure service that was owned by the people who create that data and that could be accessed at scale by any software developer. Imagine if the data in Facebook, and the hardware and software infrastructure on which it ran, were owned by Facebook users and open for access by any developers who wanted to build on that data. Imagine, too, that Facebook was no longer in a position to stand between users’ data and any uses of that data those users wished to permit.

This was the initial promise of the Facebook APIs, the initial promise of Twitter as infrastructure. A promise killed when it became obvious to the suits that more profit came from keeping the data private and from owning the entire “user experience”. The need for money drove decisions to make API access more and more restrictive, so third-party developers could no longer offer competing interfaces that rode atop the data that Facebook and Twitter came to see as their rightful property. This hamstrung the third-party apps business for Facebook and Twitter and it has never recovered.

Shared data infrastructure as a service would allow many small developers to join forces and build together a new open network which won’t ever shut out developers because some small group wants to monetize the users themselves. The very purpose of a shared data infrastructure service is to serve the users and therefore serve the developers who meet the needs of those users, no matter when or where they arrive. More developers, more options, more potential benefits.

Unlike proprietary software companies such as Facebook and Twitter, users won’t mind if thousands of new developers write software that can access their data, should they choose to use that software. In fact, that’s what they’d prefer.

For this to work, the privacy and access restrictions will need to be part of the shared data infrastructure, so that users will be able to control what other users can see and the kinds of queries that other applications can run against their data. And since the users will own the data, they’ll control the ways in which others pay to access their data.

Want to show me some ads? Pay me the money I require or go somewhere else. Monetization and privacy schemes will be implemented at the level of the shared large-scale data infrastructure, not the application. So I won’t have to worry about different monetization and privacy settings for different applications running on the same data.
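To make the idea a little more concrete, here is a minimal sketch of what it might look like for privacy and pricing rules to live in the shared store rather than in each application. Everything in it is hypothetical: the names `AccessPolicy`, `SharedStore`, `allowed_queries` and `price_per_query` are invented for illustration, and a real infrastructure would need authentication, auditing and much more. The only point is that every application goes through the same user-owned policy.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical per-user policy, held by the shared store and not by any app."""
    allowed_queries: set = field(default_factory=set)  # kinds of data the user exposes
    price_per_query: float = 0.0                       # what the user charges per query


class SharedStore:
    """Toy stand-in for the shared data infrastructure (illustrative only)."""

    def __init__(self):
        self._data = {}       # user -> {kind: value}
        self._policies = {}   # user -> AccessPolicy
        self.earnings = {}    # user -> total payments received

    def add_user(self, user, data, policy):
        self._data[user] = data
        self._policies[user] = policy
        self.earnings[user] = 0.0

    def query(self, app, user, kind, payment):
        """Run one query on behalf of `app`, enforcing the user's own rules."""
        policy = self._policies[user]
        if kind not in policy.allowed_queries:
            raise PermissionError(f"{user} does not expose '{kind}' to {app}")
        if payment < policy.price_per_query:
            raise PermissionError(f"{app} offered less than {user} requires")
        self.earnings[user] += payment
        return self._data[user].get(kind)


store = SharedStore()
store.add_user(
    "alice",
    {"posts": ["hello"], "location": "Oslo"},
    AccessPolicy(allowed_queries={"posts"}, price_per_query=0.05),
)

# Two different applications hit the same policy; neither has its own settings.
print(store.query("app_a", "alice", "posts", payment=0.05))
print(store.query("app_b", "alice", "posts", payment=0.10))
```

Because the check happens inside the store, a user sets the rules once and every application, present or future, is bound by them.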

When users own their data and the hardware and software on which it is stored and accessed, they will control their futures and innovation will flourish.


While a shared data infrastructure would solve many problems for us users once a sufficient number of users have moved their data onto the infrastructure, it doesn’t help until such time as there is sufficient data on the shared infrastructure. So any shared data infrastructure solution also has a bootstrapping problem: not enough data, no benefits.

Still, since a shared data infrastructure is shared, developers who already face the user bootstrapping problem will find it easier to gain users, since all the developers using the same infrastructure will be sharing the data and users. Each new social media application will bring new users to the collective data store and share the users that already exist in the store. By sharing the users and data, it will be easier for each individual developer to reach scale. For example, if 10 new social media applications might normally get 10,000 to 100,000 users in their first year, by combining forces, they might be able to get 300,000 to 1 million users since all the users will be able to piggyback on the combined total users brought by all of the developers in aggregate.
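A rough back-of-the-envelope sketch shows why pooling matters under Metcalfe’s Law, which values a network by the number of possible connections between its users. The figures are illustrative assumptions, not measurements: 10 applications with 10,000 users each, no user belonging to more than one application, and every possible connection counted as equally valuable.

```python
def metcalfe_value(users: int) -> int:
    """Possible pairwise connections among `users` (one common reading of Metcalfe's Law)."""
    return users * (users - 1) // 2


def isolated_value(apps: int, users_per_app: int) -> int:
    """Total value when each application keeps its own separate network."""
    return apps * metcalfe_value(users_per_app)


def pooled_value(apps: int, users_per_app: int) -> int:
    """Total value when all applications share one network (assumes no overlapping users)."""
    return metcalfe_value(apps * users_per_app)


apps, users_per_app = 10, 10_000
separate = isolated_value(apps, users_per_app)
pooled = pooled_value(apps, users_per_app)

print(f"ten separate networks: {separate:,} possible connections")
print(f"one pooled network:    {pooled:,} possible connections")
print(f"ratio: {pooled / separate:.1f}x")
```

Under these assumptions the same 100,000 users are worth roughly ten times as much in one shared network as in ten isolated ones, which is the advantage the incumbents enjoy today.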

Another advantage for developers is that they won’t have to worry about the hard problems of scaling data infrastructure in a secure and performant system. This will allow them to focus on building application functionality which is easier to scale in isolation. Developers won’t have to worry about data scaling, replication, backup and other difficult computer science problems that come with large-scale data systems.

For users, the benefits are myriad. They won’t be tied to particular application vendors. They will be able to share in the monetization schemes. They will have control over the way their data is used and who can see it. They won’t lose their data if the applications they initially use end up fizzling or fading away. And, they will be able to switch to other applications which access the shared data infrastructure at any time should the original application no longer meet their needs.

While these benefits won’t be obvious to many people, for a significant core of technology-savvy people, privacy, data security, and the ability to share in the revenue streams generated by their own data will be compelling enough to warrant a look at solutions that use any such shared data infrastructure.

What about you? Would you find such a solution compelling as a user? As an application developer?