Blockchain, Meet the Semantic Web

Published in Fluree PBC, Jul 6, 2020

What if machines could talk to one another?

In 1999, Sir Tim Berners-Lee, inventor of the World Wide Web, expressed a vision for an intelligent, connected, and data-driven internet:

“I have a dream for the Web in which computers become capable of analyzing all the data on the Web — the content, links, and transactions between people and computers.”
Tim Berners-Lee

Sir Tim Berners-Lee’s vision became popular among emerging technologists who understood the value of universal data: interconnected, versatile, and directly available as a globally decentralized asset.

Core Philosophies of Web3 (The Semantic Web)

Data-first, data-driven: A data-first architectural approach to building applications and workflows. In a semantic ecosystem of multiple data sources, developers can build thin, lightweight application layers that consume and manipulate openly available data.
Standards: Familiar languages, structures, and general frameworks are vital to connecting the world’s data. Web ontologies help describe categories (collections) universally, RDF is an atomic format that can underlie and describe any data type, and SPARQL is a query language for RDF that was built to query multiple data sources natively.
Open but Secure: At Fluree, we call this “data defending itself.” Opening up repositories of data must be paired with a plan to secure information at the data source, rather than deferring security to APIs or rebuilding it at various data lakes. If we can secure data at the source, we can more freely open our databases to data consumers.
Machine Readable for Autonomous Machine-to-Machine Communication: Imagine if the 500+ apps your business currently deploys could share data for better collective decision-making. By expressing information in a globally recognizable format (see: FAIR data principles), applications can contribute to and leverage trusted sets of shared data without relying on integration or massive data lakes.
Elimination of data silos: Universal data brings valuable insights, streamlined interoperability, and a broader lens of information for your apps to consume. Essentially, the semantic web will connect all “things”: data sources, people, and computers.
Trusted Data Sharing Across Borders: Applications should have permissioned read/write access to data sources shared across business units, companies, and industries.
Decentralization: As a rejection of the centralized internet, the semantic web champions democratic participation in the web (Dapps, cryptocurrencies, and empowered data ownership).
Empowered (Self-Sovereign) Data Ownership: Users will have clear visibility, ownership, and stake in the data and identity they produce and use, not some opaque version provided by a centralized authority.
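
To make the "Standards" point a little more concrete, here is a minimal sketch of the idea behind RDF: every fact is a (subject, predicate, object) triple, so facts from separate sources can be merged and queried together without a custom integration. It uses plain Python tuples, not an RDF library or a SPARQL engine, and every identifier in it (the `ex:` names, the two sources, the `match` and `city_of_order` helpers) is made up for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Two independent, hypothetical data sources that mention the same store.
inventory_source: List[Triple] = [
    ("ex:order-17", "rdf:type", "ex:PurchaseOrder"),
    ("ex:order-17", "ex:shippedTo", "ex:store-42"),
]
retail_source: List[Triple] = [
    ("ex:store-42", "rdf:type", "ex:Store"),
    ("ex:store-42", "ex:city", "Raleigh"),
]


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def city_of_order(order: str, *sources: List[Triple]) -> Optional[str]:
    """Follow order -> store -> city across all the sources given."""
    merged = [t for src in sources for t in src]
    for _, _, store in match(merged, s=order, p="ex:shippedTo"):
        for _, _, city in match(merged, s=store, p="ex:city"):
            return city
    return None


print(city_of_order("ex:order-17", inventory_source, retail_source))  # Raleigh
```

Because both sources share one atomic shape and one identifier for the store, joining them is a concatenation followed by pattern matching. A real SPARQL query expresses the same two-hop pattern declaratively.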

What does Web3 look like?

Web3 aligns with a data-first approach to building the future.

Secure Data: We’ll be able to build access controls as co-existent rules directly alongside our data.

Democratic Data: Consumers will be able to have greater control over data and identity ownership.

Trusted Data: The information that we (and autonomous technologies) use to make decisions will possess unprecedented tamper-resistance and verifiable provenance.

Contextual Data: Humans and machines will be able to access a broader set of information to make well-informed decisions.

Collaborative Data: Organizations will form data networks in consortium efforts towards greater industry interoperability.
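
As a rough illustration of access controls living "directly alongside our data," the sketch below stores a read rule next to each field of a record and applies it when the record is read, so no separate API layer has to re-implement the policy. The `Record` class, the role names, and the sample purchase order are hypothetical and are not meant to represent how Fluree expresses permissions.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Record:
    data: Dict[str, str]
    # The rule travels with the data: field name -> roles allowed to read it.
    readable_by: Dict[str, Set[str]] = field(default_factory=dict)

    def view(self, role: str) -> Dict[str, str]:
        """Return only the fields this role is allowed to read."""
        return {
            key: value
            for key, value in self.data.items()
            if role in self.readable_by.get(key, set())
        }


# Hypothetical purchase order shared between a supplier and a logistics partner.
order = Record(
    data={"order_id": "PO-17", "unit_price": "12.50"},
    readable_by={
        "order_id": {"supplier", "logistics"},
        "unit_price": {"supplier"},
    },
)

print(order.view("supplier"))   # {'order_id': 'PO-17', 'unit_price': '12.50'}
print(order.view("logistics"))  # {'order_id': 'PO-17'}
print(order.view("stranger"))   # {}
```

Fields without a rule are hidden by default. That default is the safer choice when the same record is exposed to many consumers.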

Baby Steps to Adoption

The Web3 vision never fully manifested, as internet giants took control of and centralized data during the gold rush of the 2000s. It did, however, spark a fervent following around standardization (RDF, SPARQL, OWL) with the goal of making all data on the web ‘machine-readable.’

This led to linked enterprise data initiatives built on knowledge graph technology, and to connected data research in healthcare, industry, and information science. Standardization also affects the internet more generally, since many metadata-related activities on the web use RDF (serialized in XML) to describe data to machines such as search engines.

But the true semantic web — a universally decentralized platform in which all data were to be shared and reused across application, enterprise, and community boundaries — hasn’t taken complete form. And with emerging applications in machine learning and artificial intelligence, a semantic web of information readable by machines is the obvious next step.

So why haven’t we moved completely to a Web 3.0 framework?

Answer: Trust, at scale

The Semantic vision in many ways mirrors the rhetoric of today’s conversation around decentralization (most commonly as a defining mechanism and philosophy in crypto-economics). It lays out a powerful vision for cross-boundary collaboration, third-party transactional environments with no middlemen, and a ‘democratization’ of power. A true open-world concept.

However, in order to truly facilitate secure data collaboration across entities, the fundamental issue of trust became a massive hurdle. How are we expected to openly expose information to the world if it could easily be manipulated?

In the Tim Berners-Lee “Layercake” image above, the large box dedicated to “crypto” was never truly filled in, so the “proof” and “trust” components that would bring the semantic dream to fruition were never accomplished.

Set against massive companies that were centralizing data and beginning to use it as their own means of revenue generation, the semantic web remained largely a vision with a very dedicated niche following.

Enter: Cryptography and Trust

In early 2009, Bitcoin introduced us to a very powerful concept: In Code We Trust. Via the combination of ordered cryptography and computational decentralization, Bitcoin showed the world that we could in fact inject trust into exposed information in an open transactional environment.

Immutability and tamper-resistance, provided by advanced cryptography, became the centerpiece of discussion around blockchain’s applications in various industries. In many ways, blockchain had the power to close the trust gap that stood between Web3 and mass adoption. A technology for securing, storing, and proving the provenance and integrity of information, combined with data standardization and semantic queries, would mark an incredible step towards a more intelligent web framework.
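
The tamper-resistance described here can be sketched with nothing more than a hash function: each block's hash covers its own payload plus the previous block's hash, so changing any earlier record invalidates everything after it. This is a simplified illustration, not Bitcoin's actual block format, and it leaves out consensus and proof of work entirely. The function names and sample transfers are assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash the previous block's hash together with this block's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


def build_chain(payloads: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Link payloads into an ordered chain of hashed blocks."""
    chain: List[Dict[str, Any]] = []
    prev = GENESIS
    for payload in payloads:
        digest = block_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": digest})
        prev = digest
    return chain


def verify(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited payload or broken link fails."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(prev, block["payload"]):
            return False
        prev = block["hash"]
    return True


chain = build_chain([
    {"from": "alice", "to": "bob", "amount": 5},
    {"from": "bob", "to": "carol", "amount": 2},
])
print(verify(chain))  # True

chain[0]["payload"]["amount"] = 500  # tamper with history
print(verify(chain))  # False
```

Anyone holding a copy of the chain can run `verify` independently. That is the sense in which trust moves from an intermediary into the data itself.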

Additional reading: Why Blockchain Immutability Matters

Still, in the early days of blockchain technology, the application focus tended to land heavily on recording transactions. Specifically, public chains such as Ethereum and Bitcoin were excellent means of accomplishing asset movement between parties.

The machine-readable web requires more: to query and leverage data as a readable and malleable asset that powers applications, blockchain has to manage all data in a format applications can use. At Fluree, this is what we call “Blockchain’s Data Problem.”

Blockchain’s Data Problem

“We looked at blockchain a few years ago, but ultimately found it too complex for us to work into our business architecture.”

Blockchain was originally designed to facilitate peer-to-peer banking at scale, which required little or no data management capability. But when enterprises began their blockchain journeys, they found it difficult to retrofit the same first-generation blockchain technology into their existing technology stacks, primarily because most enterprise applications require sophisticated data storage and retrieval. For example, a supply chain application produces and pulls data in a variety of ways: purchase orders, store IDs, RFID inputs, and more.

Building blockchain applications that handle metadata at this level of sophistication is challenging from both a development perspective and an ongoing integration management standpoint, and the overhead required to align this information so that it is operational downstream for applications is nearly impossible to justify.

So, most folks just stick their data and metadata in a regular old database and use blockchain as yet another data silo. The lack of holistic data management defeats the original purpose by adding cost and complexity to the systems that this new technology was meant to simplify.

Enter: Fluree

Fluree solves this data integration problem as a ground-up data management technology, allowing developers to build a unified and immutable data set to power and scale their applications. No sticky integrations or extra layers, just one queryable data store optimized for enterprise applications.

Fluree focuses on a blockchain-backed data management solution that brings cryptography and trust to every piece of data in your system.

By embracing semantic standards as a core component of storage (RDF) and retrieval (SPARQL, GraphQL), Fluree brings trust to the semantic web all under one scalable data management platform.
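
One way to picture a single store that is both queryable and cryptographically linked is the toy ledger below: facts are kept as triples, committed in hash-linked blocks, and every query result comes back with the hash of the block that recorded it. This is a conceptual sketch only. The `Ledger` class, its methods, and the `ex:` identifiers are hypothetical and do not reflect Fluree's actual API or storage format.

```python
import hashlib
import json
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class Ledger:
    """Append-only blocks of triples; each block commits to its predecessor."""

    def __init__(self) -> None:
        self.blocks: List[dict] = []

    def commit(self, triples: List[Triple]) -> str:
        """Append a block of triples and return its hash."""
        prev = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        body = json.dumps([prev, triples]).encode("utf-8")
        digest = hashlib.sha256(body).hexdigest()
        self.blocks.append({"prev": prev, "triples": list(triples), "hash": digest})
        return digest

    def query(self,
              s: Optional[str] = None,
              p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Tuple[Triple, str]]:
        """Yield (triple, block_hash) pairs; None acts as a wildcard."""
        for block in self.blocks:
            for triple in block["triples"]:
                if all(want is None or want == got
                       for want, got in zip((s, p, o), triple)):
                    yield tuple(triple), block["hash"]


ledger = Ledger()
h1 = ledger.commit([("ex:order-17", "ex:status", "shipped")])
h2 = ledger.commit([("ex:order-17", "ex:receivedBy", "ex:store-42")])

# Each answer carries the hash of the block that asserted it.
for triple, proof in ledger.query(s="ex:order-17"):
    print(triple, proof[:12])
```

The design point is that the application reads from the same structure that provides the integrity guarantee. There is no separate database to keep in sync with a chain.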

Fluree’s architecture is composed of semantic standards, enterprise database capabilities, and blockchain characteristics that bring interoperability, trust, leverage, and freedom to data. It is truly the Web3 data stack.

Learn more: http://flur.ee/why-fluree