This is what Google 2.0 might look like: anyone can be a part of it, without competition.

Jubin Jose
Jun 27, 2020 · 8 min read

“Some people say, “Give the customers what they want.” But that’s not my approach. Our job is to figure out what they’re going to want before they do. I think Henry Ford once said, “If I’d asked customers what they wanted, they would have told me, ‘A faster horse!’” People don’t know what they want until you show it to them. That’s why I never rely on market research. Our task is to read things that are not yet on the page.” — Steve Jobs

Google (now Alphabet), one of the giants of digital advertising, is well known for its search service. The company started with a mission to organize the world’s information in the best way possible. The web search engine was one of its first offerings and is still the most successful one. For the past 20 years, Google has played a major role in shaping the Internet. One change I admire is its continuous, sometimes monopolistic, effort to transform the unstructured web into a semi-structured form. The adoption into ‘Knowledge Graph’ and the introduction of was one of the smartest moves in this. These allowed near real-time, hassle-free information extraction from dynamic websites such as e-commerce sites or live events. In addition, indirectly forcing websites towards well-defined SEO rules, influencing web development tools (Angular, Flutter) and practices, and defining order through a client-side monopoly (Chrome, Android) are some of the other genius steps it took. Even though these measures brought some level of structure to the Internet, something went wrong. Individuals and organizations started asking: “Who owns my data? What do I get in return if it’s not me?” People and organizations have started building walls. Sadly, the Internet is now getting fragmented as it gets structured.

Competition versus Monopoly

When it comes to Google, everyone agrees that it is a monopoly in the search industry, even though the company is not positioned in a way to avoid it. It delivers the best search service in the industry for free*. It is one of the entities pushing AI innovation aggressively. It disrupted the telecommunication industry, empowering people in the lowest strata of society. With YouTube, the whole media and publishing industry became massively diversified. So, contrary to many voices out there, “Google is not evil”. But I would say it is time to retire or rethink some of its practices, painfully including some of the most profitable ones, for the greater good of the Internet and technology. The sky is not the limit for Google, so why not?

Public, Closed and Private data

Google’s intelligent search mechanism, powered by a well-organized knowledge graph (built by closing a once open but unsustainable system), solved this pain point and has been delivering a quality service for the past decade. This unlocked a wide and highly rewarding ecosystem under the growing monopoly of Google. Even though this system caused no intentional harm, a side effect of the centralized, tight hold on data and related technologies is that a large amount of useful information that is not entertained within the Google ecosystem never makes its way into it. Let’s call it (partially or fully) closed data.

The generators of closed data are mainly organizations. Nowadays we call this big data: data that stays frozen inside organizations. If you are from the IT industry, you know that almost every product or service now has an offering to address this closed nature of organizational data. Even though this is common practice and a business opportunity for product developers, organizations mostly don’t get to taste the best technologies on offer, because in most cases these come at the price of data privacy. It’s worth noting that companies including Google are investing in homomorphic encryption to address this problem and provide a unified service across all customers. Still, the centralized nature of such services and their gatekeeping limit the opportunity, which is a big problem for the accessibility of data across multiple vendors. Even with these encrypted computations, vendor lock-in will persist by moving to higher levels, and a boundary will be redrawn around organizations. Many organizations would probably release data with far more relaxed restrictions for the public good if it were free of these boundaries.

Private data should stay private. The user (a person or an organization) should have control over it whatever the conditions. This works best when the computation is moved to the user. As of now, that is not fully practical and remains an open problem. One thing is for sure: this requires methods to establish a controlled flow of data and computation beyond central boundaries.

Open-source software and Non-profit organizations

The same thing can happen to data if we build a sustainable system. As we saw with open-source software, data producers and consumers can come together to build a self-sustainable marketplace. Open-source software gave birth to cloud computing services, and open data could give birth to storage and analytics services that anyone can offer. Everyone adds more value to the system: data generators, data keepers, and data analysts. Data can be forked by generating derivatives (insights) that bring more value.

Non-profit organizations like Wikipedia are already working on this. And as we have seen, its derivative, Wikidata, has added more value to the overall system. Developers and companies also use Wikidata to generate more information and value, but they keep it closed. In most cases, the reason is that there is no mechanism that gives them an incentive to open it up. Because the value gets locked up in the pipeline, non-profit organizations like Wikipedia are always under existential threat. And it is well known that the companies that generate more value from it (Google, Amazon Alexa) give very little in return to Wikipedia, which is mainly funded by generous contributions from the general public. This model is neither sustainable nor progressive. What we need is a self-sustainable data market.

Google 2.0

Data as a foundation layer

  • uniform across networks (ontology should be the same)
  • structured
  • both machine and human-readable
  • distributed (edge storage and processing)

Numerous efforts are already under way in this domain. IPFS is setting up decentralized data storage with incentives for participants. Underlay is working on a distributed knowledge graph over the IPFS network, BigchainDB is working on a BFT NoSQL database over IPFS, Aquila Network is working on semantic search indexing to be integrated with IPFS, and so on.
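To make the properties listed above a little more concrete, here is a minimal Python sketch of a record that is structured, readable by both humans and machines, and addressed by its content rather than by its location. The field names and the `content_address` helper are hypothetical and chosen only for illustration. IPFS uses its own content identifier (CID) format, so the plain SHA-256 hex digest below is a simplification and not the IPFS API.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize a record deterministically so equal content gives equal bytes."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_address(record: dict) -> str:
    """Hypothetical helper: address a record by the SHA-256 hash of its content."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# A structured record using a shared vocabulary (the keys are illustrative only).
record = {
    "@type": "Event",
    "name": "Example live event",
    "startDate": "2020-06-27",
}

# The same content with keys in a different order gets the same address.
same_record_reordered = {
    "startDate": "2020-06-27",
    "name": "Example live event",
    "@type": "Event",
}

# Any change to the content produces a different address.
changed_record = dict(record, name="Another event")

address = content_address(record)
```

Because the address depends only on the content, any node can store or serve the record, and anyone who receives it can recompute the hash to check that it was not altered.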

Data marketplace

Data ownership is another important feature of data markets. When data is introduced to the market, the owner who created it is credited throughout its lifetime. Data protection makes sure that only authorized parties can access it, even though storage and networking are provided by trustless nodes.
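One way to picture lifetime credit is to let every derived piece of data point back to the item it was derived from, so that ownership can be traced along the chain. The sketch below is only an illustration under that assumption: the `DataItem` class, the in-memory `registry`, and the `owners_to_credit` function are hypothetical names, not part of any existing data market.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataItem:
    item_id: str
    owner: str
    parent_id: Optional[str] = None  # None means this is original data


def owners_to_credit(item_id: str, registry: Dict[str, DataItem]) -> List[str]:
    """Walk from an item back to the original data, collecting each owner once."""
    owners: List[str] = []
    seen_ids = set()  # guards against accidental cycles in the registry
    current = registry.get(item_id)
    while current is not None and current.item_id not in seen_ids:
        seen_ids.add(current.item_id)
        if current.owner not in owners:
            owners.append(current.owner)
        current = registry.get(current.parent_id) if current.parent_id else None
    return owners


# Raw data, a cleaned copy, and an insight derived from the cleaned copy.
registry = {
    "raw": DataItem("raw", owner="generator"),
    "cleaned": DataItem("cleaned", owner="keeper", parent_id="raw"),
    "insight": DataItem("insight", owner="analyst", parent_id="cleaned"),
}

credited = owners_to_credit("insight", registry)
```

Here anyone who uses the insight can see that the analyst, the keeper, and the original generator all have a claim to credit. How that credit turns into an actual reward is left open.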

As discussed previously, innovations in homomorphic encryption will multiply the possibilities for privacy-first computation and distribution of data without trusting anyone.
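To give a feel for what computing on data without seeing it means, here is a toy Python sketch of additively homomorphic encryption using the Paillier scheme. Paillier is my choice for illustration and is not necessarily what Google or any project named in this article uses. The primes are tiny and fixed, so this is completely insecure and only shows the idea: an untrusted node multiplies two ciphertexts, and the key owner decrypts the sum of the two hidden numbers.

```python
import math
import secrets

# Toy key material with tiny, INSECURE primes (assumption: illustration only).
P, Q = 293, 433
N = P * Q
N_SQ = N * N
G = N + 1  # common simple choice of generator for Paillier
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # modular inverse of LAM mod N (needs Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer in [0, N) with fresh randomness."""
    if not 0 <= message < N:
        raise ValueError("message out of range")
    while True:
        r = secrets.randbelow(N - 1) + 1  # random value in [1, N - 1]
        if math.gcd(r, N) == 1:
            break
    return (pow(G, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(ciphertext: int) -> int:
    """Recover the plaintext using the private values LAM and MU."""
    x = pow(ciphertext, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts (mod N)."""
    return (c1 * c2) % N_SQ


# An untrusted node can combine the two ciphertexts without any key.
encrypted_total = add_encrypted(encrypt(20), encrypt(22))
total = decrypt(encrypted_total)  # only the key owner can do this step
```

The node doing the addition never learns 20, 22, or the total. Real systems use much larger keys and vetted libraries, and fully homomorphic schemes go beyond addition.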

Decentralized collective intelligence

a-mma (a_മ്മ) is a non-profit organization focused on the long-term development of swarm intelligence and related technologies. a-mma gives incubation and community support to commercial and non-commercial projects in this field and doesn’t own them.

Jubin Jose is one of the early members of a-mma, still helping it to reach a sustainable point of independent decentralized operation.

