The New Data

Mikeal Rogers
6 min read · Nov 18, 2020

IPLD: The data layer of a decentralized web.

What is data? The question is more philosophical than practical, but the definition we seem to be able to agree on is that data is a medium for expression.

You can express a lot with data, almost anything, but like a painting, the meaning of that expression is subjective and depends on the context you have around it. A pollster publishes data they believe accurately captures the state of mind of a people, but to Nate Silver this data is only one point in a more complex answer to the same question.

We add meaning to data by altering its context. We link to and from pieces of data to accumulate greater context and therefore greater meaning. We have many means of linking data. A social network captures the expressions of many individuals and connects those expressions with others in a large relational database. The Web connects pages by way of URL links, either within the same site or between any two sites on the Web. Nate Silver collects data and links it into a complex model that weights different data points into probabilistic metrics.

Content Addressing

For decades now, researchers and engineers have been building on top of hash-linked data structures. These are structures in which the link from one piece of data to the next is a cryptographic hash of the content being linked to. The first widely adopted use of these data structures was git, and in the years since we’ve seen an explosion in blockchains and applications built on blockchains, all relying on these same hash-linked structures.
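As a rough illustration of the idea, here is a minimal Python sketch of a hash-linked chain. The block layout (a JSON object with `payload` and `parent` fields), the choice of SHA-256, and the helper names `hash_link`, `make_block`, and `walk` are all assumptions made for this example. Git, blockchains, and IPLD each define their own encodings and link formats.

```python
import hashlib
import json
from typing import Dict, List, Optional


def hash_link(block: bytes) -> str:
    """Return the link to a block: the SHA-256 hash of its bytes (assumed hash choice)."""
    return hashlib.sha256(block).hexdigest()


def make_block(payload: str, parent: Optional[str]) -> bytes:
    """Encode a block that points at its parent by hash (hypothetical layout)."""
    # sort_keys keeps the encoding deterministic, so equal content gives an equal hash.
    return json.dumps({"payload": payload, "parent": parent}, sort_keys=True).encode("utf-8")


def walk(store: Dict[str, bytes], head: Optional[str]) -> List[str]:
    """Follow parent links from the head back to the first block."""
    payloads = []
    link = head
    while link is not None:
        block = json.loads(store[link])
        payloads.append(block["payload"])
        link = block["parent"]
    return payloads


# Build three blocks, each linking to the previous one by its hash.
first = make_block("first entry", None)
second = make_block("second entry", hash_link(first))
third = make_block("third entry", hash_link(second))

# A store keyed by hash: where a block lives does not matter, only its content.
store = {hash_link(b): b for b in (first, second, third)}

history = walk(store, hash_link(third))
print(history)
```

Because each link is derived from the content it points to, changing any earlier block changes its hash, and every block that links to it (directly or indirectly) would need a new link as well.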

Hash linking has an interesting property: it allows you to create links between pieces of data without actually keeping all of that data in one consistent location. Databases typically can’t do this, because they capture the relationships between pieces of data using their locations on disk. URLs on the Web can link between sites, but those links contain the location of the site being linked to. With hash linking you can define a relationship to data you don’t even store, and anyone in the world can provide that data securely.
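A small Python sketch may make this concrete. It assumes SHA-256 links and models "providers" as plain functions that return bytes for a link. The names `link_to`, `fetch_verified`, `honest_provider`, and `dishonest_provider` are hypothetical and stand in for whatever network or storage layer a real system would use.

```python
import hashlib
from typing import Callable, Iterable, Optional

Provider = Callable[[str], Optional[bytes]]


def link_to(data: bytes) -> str:
    """Return a content link for the data: its SHA-256 hash (assumed hash choice)."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(link: str, providers: Iterable[Provider]) -> bytes:
    """Ask each provider for the data and accept only bytes that hash to the link."""
    for provider in providers:
        candidate = provider(link)
        if candidate is not None and link_to(candidate) == link:
            return candidate
    raise LookupError("no provider returned data matching " + link)


original = b"poll results, November 2020"
link = link_to(original)  # the only thing we need to keep locally


def dishonest_provider(requested: str) -> Optional[bytes]:
    """Hypothetical provider that returns altered data."""
    return b"tampered poll results"


def honest_provider(requested: str) -> Optional[bytes]:
    """Hypothetical provider that returns the real data when asked for its link."""
    return original if requested == link else None


data = fetch_verified(link, [dishonest_provider, honest_provider])
print(data)
```

The caller holds only the link, not the data, and does not need to trust either provider: a response is accepted solely because it hashes to the link that was requested.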

Since the link to a piece of data is the hash itself, you can always compute the hash again to confirm that it matches. This is what allows trustless p2p networks to exchange hash-linked data, which is how…
