Building the Verifiable Web

Eric Elliott
Published in Po.et Blog
8 min read · Sep 20, 2018

The media business has a serious problem with trust. In 2017, only 24% of survey respondents showed a strong degree of trust in top-ranking news media companies. (Source: Statista)

But the issue of trust goes much deeper than your confidence in the news you’re exposed to. The need for intermediaries to limit a creator’s exposure to counterparty risk also creates a serious drain on our ability to produce and distribute great content. There are too many middlemen sitting between parties in content transactions and each of them is taking a cut. Relying on trusted brands instead of total strangers helps reduce counterparty risk, but also introduces unnecessary and sometimes unjustifiable costs and time delays.

Take the US copyright registration process, for example. Registering a single creative work costs $35-$55, and takes about 3 months. Imagine if you had to do that for every tweet. For many kinds of otherwise valuable content, the process cost outweighs the benefit.

And once you’ve registered with the US Copyright Office, you’re only registered in the US. If you go to court in a WIPO treaty country, they’ll probably recognize your registration, but the process is still cumbersome and expensive, and there isn’t enough cross-border information sharing.

What we need is a fast, globally-recognized, decentralized ledger that can minimize counterparty risk without the added cost and overhead of middlemen. That’s exactly the service that Bitcoin provides for financial transactions.

You can use the Bitcoin blockchain for a lot more than exchanging money. You could also use it to prove that you had access to some content before anybody else — which is essentially what the copyright registration system does.

The power of blockchains is that they’re decentralized. That same proof should work anywhere in the world because blockchain proofs are based on the laws of mathematics, rather than the laws of any individual country.

This solution alone could lead to breakthroughs in content value because content authors will feel more confident bringing their creations to market without fear of attribution theft.

Many emerging writers want to break into book publishing or film script writing, but they hesitate to shop their manuscripts around because they’re afraid their work will be stolen. In other words, the lack of decentralized trust in the creative industry creates a chilling effect that often prevents great content from ever seeing the light of day.

Similarly, journalists could use the same system to indelibly scribe source recordings to protect themselves against defamation claims, denials, and other roguish behavior.

You could also use indelibly scribed claims to build up a wealth of metadata about content: a nutrition label for the web.

Trust and the Verifiable Web

The Bitcoin blockchain works by collecting transactions into blocks, and then recording a fingerprint of those transactions in the block header. That fingerprint is called a hash.

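In cryptography, a hash is a fixed-size digital fingerprint of a specific piece of content, computed so that finding two different pieces of content with the same fingerprint is practically infeasible. It can be used by two parties to ensure the identity or integrity of the content.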

Let’s explore hashes in more depth. If you’re willing to install and use Node.js, you can follow along and actually create some hashes on your computer.

Step 1: Install Node.js using the instructions on the Node website.

Step 2: Inside your terminal, install hasha:

npm install -g hasha-cli

Step 3: Create your first hash:

echo "Hello" | hasha -a sha256

This will create a cryptographic fingerprint for the text “Hello” (plus the newline character that echo appends) without revealing the contents of the text itself.
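If you’d rather stay inside Node.js, roughly the same fingerprint can be computed with Node’s built-in crypto module. This is a minimal sketch rather than what hasha does internally; the helper name sha256Hex is my own, and the trailing "\n" mimics the newline that echo appends to its output.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA-256 digest of a string, as lowercase hex.
function sha256Hex(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// echo "Hello" sends "Hello\n" down the pipe, so include the newline here too.
const fingerprint = sha256Hex('Hello\n');
console.log(fingerprint);

// Hashing the same input again yields the same fingerprint.
console.log(fingerprint === sha256Hex('Hello\n'));
```

Dropping the "\n" gives a completely different fingerprint, which is worth remembering whenever two tools seem to disagree about a hash.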

That will dump out the hash, but I find it hard to read because it doesn’t give us any spacing before or after the hash to make it more readable. Try inserting some line spacing with the bash “echo” command:

echo && echo "Hello" | hasha -a sha256 && echo -e "\n"

That should look something like this in your terminal:

Satoshis-Macbook-Pro:~ satoshi$ echo && echo "Hello" | hasha -a sha256 && echo -e "\n"

66a045b452102c59d840ec097d59d9467e13a3f34f6494e539ffd32c1bb35f18

Satoshis-Macbook-Pro:~ satoshi$

If you do it again, you’ll get the same hash:

Satoshis-Macbook-Pro:~ satoshi$ echo && echo "Hello" | hasha -a sha256 && echo -e "\n"

66a045b452102c59d840ec097d59d9467e13a3f34f6494e539ffd32c1bb35f18

Satoshis-Macbook-Pro:~ satoshi$

And the same again:

Satoshis-Macbook-Pro:~ satoshi$ echo && echo "Hello" | hasha -a sha256 && echo -e "\n"

66a045b452102c59d840ec097d59d9467e13a3f34f6494e539ffd32c1bb35f18

Satoshis-Macbook-Pro:~ satoshi$

In fact, no matter how many times you repeat the exercise, you’ll always get the same result. That property is called determinism.

A hashing algorithm is a pure function mapping from some specific set of inputs to a corresponding set of outputs. Given the same input, you’ll always get the same hash out as a result, but if we change the input even by a tiny bit, the whole hash changes:

Satoshis-Macbook-Pro:~ satoshi$ echo && echo "Hello." | hasha -a sha256 && echo -e "\n"

a2c064616af4c66c576821616646bdfad5556a263b4b007847605118971f4389

$ echo && echo "Hello?" | hasha -a sha256 && echo -e "\n"

6450e89db2e4a624aa3886dd2f46c68f45abd9a28aab8f9dc6a5a5ff45252684
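To put a number on how much of the hash changes, here is a small sketch using Node’s built-in crypto module that counts how many output bits differ between those two inputs. The helper names are my own, and the "\n" stands in for the newline that echo appends. For a well-designed hash function you’d expect roughly half of the 256 bits to flip, though the exact count varies from input to input.

```javascript
const crypto = require('crypto');

// SHA-256 digest of a string, returned as a 32-byte Buffer.
function sha256(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest();
}

// Count how many bits differ between two equal-length digests.
function differingBits(a, b) {
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i]; // bits set here are the ones that differ
    while (x) {
      count += x & 1;
      x >>= 1;
    }
  }
  return count;
}

const bits = differingBits(sha256('Hello.\n'), sha256('Hello?\n'));
console.log(`${bits} of 256 bits differ`);
```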

For all practical purposes, every chunk of data has its own unique fingerprint. You can even hash an existing hash. This is called a hash chain, and we’ll look at why it’s important in a moment.

$ echo && echo "I'm a hash chain" | hasha -a sha256 | hasha -a sha256 | hasha -a sha256 && echo -e "\n"

d608458a8174b9f3955491c17e7d81ba562580173249ddbd42db34e1c1a529bf

Note that we’ve piped the text through sha256 three times in the last example, creating a hash of a hash of a hash of the original text. Here’s what we get if we only hash it twice:

$ echo && echo "I'm a hash chain" | hasha -a sha256 | hasha -a sha256 && echo -e "\n"

d73acf308539aed47458b75709f397439453351ce7589fdc51d64892ae1f5088
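The same idea can be sketched in Node.js with a loop instead of a pipeline. The names sha256Hex and hashChain are my own. One caveat: this version feeds each hex digest straight back in as text with nothing appended between rounds, so its output may not match the shell pipeline byte for byte if the CLI adds a trailing newline to what it prints.

```javascript
const crypto = require('crypto');

const sha256Hex = (text) =>
  crypto.createHash('sha256').update(text, 'utf8').digest('hex');

// Apply sha256Hex `rounds` times, feeding each hex digest back in as text.
function hashChain(text, rounds) {
  let current = text;
  for (let i = 0; i < rounds; i++) {
    current = sha256Hex(current);
  }
  return current;
}

const twice = hashChain("I'm a hash chain\n", 2);
const thrice = hashChain("I'm a hash chain\n", 3);

console.log(twice);
console.log(thrice);

// The third link is just the hash of the second link.
console.log(sha256Hex(twice) === thrice);
```

Anyone holding the second link can recompute the third, but nobody can work backwards from the third link to the second. That one-way property is what makes the chain useful.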

Hurray! You are Satoshi Nakamoto!

Well, not quite yet. Hashing a single input only gives us a single fingerprint: a single point of reference. This is not a web yet. Webs need links between data points. Bitcoin wouldn’t be spendable if it weren’t for the hash chain created between blocks. That’s where the word “blockchain” comes from. It refers to the hash chain that links each block to the one that came before in an indelibly deterministic sequence.
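Here is a toy illustration of that linking, assuming nothing more than Node’s built-in crypto module. It is deliberately simplified: real Bitcoin block headers carry more fields and are hashed differently, and makeBlock and isValidChain are names I made up for this sketch. The point is only that each block stores the hash of the one before it, so rewriting an early block breaks verification.

```javascript
const crypto = require('crypto');

const sha256Hex = (text) =>
  crypto.createHash('sha256').update(text, 'utf8').digest('hex');

// Toy block: its hash covers both its own data and the previous block's hash.
function makeBlock(prevHash, data) {
  return { prevHash, data, hash: sha256Hex(prevHash + data) };
}

// Recompute every hash and check that every link points at the prior block.
function isValidChain(blocks) {
  return blocks.every((block, i) => {
    const expectedPrev = i === 0 ? '' : blocks[i - 1].hash;
    return (
      block.prevHash === expectedPrev &&
      block.hash === sha256Hex(block.prevHash + block.data)
    );
  });
}

const genesis = makeBlock('', 'first batch of transactions');
const second = makeBlock(genesis.hash, 'second batch of transactions');
const third = makeBlock(second.hash, 'third batch of transactions');
const chain = [genesis, second, third];

console.log(isValidChain(chain)); // true

// Copy the chain, then quietly rewrite the first block's data.
const tampered = chain.map((block) => ({ ...block }));
tampered[0].data = 'rewritten history';

console.log(isValidChain(tampered)); // false
```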

Po.et is building the verifiable web: the decentralized protocol suite for content attribution, discovery, monetization and reputation.

Po.et is a system of verifiable claims, which are JSON-LD documents that make a claim about a subject. The subjects and creators of claims each have their own digital identity in the Po.et ecosystem. Those identities are just hashes themselves — similar to the ones we’ve just created.

When you hash a creative work with Po.et, you’re creating its digital identity: the fingerprint that other claims about that work will refer to.

Later you’ll be able to add contributors to the work, add details about who owns it, license it for a distributor or another creator to use, and so on…

Each claim about a creative work adds to our shared knowledge about that work, and begins to form a web of information. Adding a contributor will link that contributor to the work, and you could follow that link to discover what else they’ve contributed to, similar to hyperlinks on the web.

The key difference is that on Po.et, links between claims are verified. On the web, if you link to the homepage of Bob, and later the domain name is sold to a different Bob, whatever you said about the original Bob in your link text may not be true of the new Bob.

On the verifiable web, subjects of claims are cryptographically linked forever, beginning with the creative work itself.

The first claim anybody can make about a creative work is the creative work claim itself, which embeds the hash of the work in the claim. That fingerprint ensures that our claims are immutably, permanently linked to a bit-for-bit perfect copy of the work we’re making the claim about.

In order to make a claim about a work, you must include the work’s hash. The hash is automatically compared to the original creative work file, and if the hashes don’t match, you can’t make the claim.
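A rough sketch of that check, using Node’s built-in crypto module, might look like the following. The claim shape here (type, author, workHash) is hypothetical and is not Po.et’s actual claim schema; it only illustrates the idea of comparing the hash embedded in a claim against the hash of the file itself.

```javascript
const crypto = require('crypto');

// SHA-256 of raw bytes (a Buffer), as lowercase hex.
const sha256Hex = (bytes) =>
  crypto.createHash('sha256').update(bytes).digest('hex');

// Hypothetical claim shape, for illustration only.
function makeWorkClaim(workBytes, author) {
  return { type: 'Work', author, workHash: sha256Hex(workBytes) };
}

// A claim is only accepted if its embedded hash matches the file's hash.
function claimMatchesWork(claim, workBytes) {
  return claim.workHash === sha256Hex(workBytes);
}

const original = Buffer.from('It was a dark and stormy night.', 'utf8');
const swapped = Buffer.from('It was a bright and sunny day.', 'utf8');
const claim = makeWorkClaim(original, 'alice');

console.log(claimMatchesWork(claim, original)); // true
console.log(claimMatchesWork(claim, swapped)); // false
```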

That prevents mischief like phishing attacks, where an attacker might ask you to write a review about some content, but then attach that review to a completely different piece of content in order to steal your influence and reputation to prop up their junk content. As the author of a claim, you’ll be able to verify with third-party software that you made the claim about the content you intended.

On the verifiable web, you can also make claims about claims. Imagine you’re a journalist and you write a report on a prominent public figure. They later deny the report by refuting your article with their own claim. You can refute their counter claim with a counter claim of your own, citing further evidence, and so on.

At each step along the way, the fingerprint of your original work is checked against the hash recorded in the creative work content claim. The hash of that claim is checked against the hash of the subject’s counter claim. The hash of the subject’s claim is checked against your counter claim, and so on. If any of those hash comparisons fail, the claim being made would fail to verify and be rejected by the Po.et Network.

That system creates a verifiable claim hash chain, which is a sequence of hashes similar to the blockchain.

If the file hash fails to verify, the hash chain is broken, and all subsequent claims are rejected.

Of course, any number of claims can be made about the same creative work file, so the hash chain is actually a tree of hash chains that will each fail if the original file is not available to the network:

The creative work file is the trunk of the hash tree. Without it, all branches fail to verify.
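To make the chain of claims concrete, here is a minimal sketch in Node.js. Everything about the claim format is an assumption made for illustration (the fields id, subject and statement, and the helpers makeClaim and verifies are hypothetical, not the Po.et Network’s implementation). Each claim names the hash of its subject, which is either the work itself or another claim, and verification walks back to the work, rechecking every hash along the way.

```javascript
const crypto = require('crypto');

const sha256Hex = (text) =>
  crypto.createHash('sha256').update(text, 'utf8').digest('hex');

// Hypothetical claim: `subject` is the hash of the work or of another claim.
function makeClaim(subject, statement) {
  const id = sha256Hex(JSON.stringify({ subject, statement }));
  return { id, subject, statement };
}

// Walk from a claim back toward the work, rechecking each hash on the way.
function verifies(claim, claimsById, workText) {
  let current = claim;
  while (current) {
    const { id, subject, statement } = current;
    if (id !== sha256Hex(JSON.stringify({ subject, statement }))) return false;
    if (!claimsById.has(subject)) {
      // Reached the trunk: the subject must be the hash of the work itself.
      return workText !== undefined && subject === sha256Hex(workText);
    }
    current = claimsById.get(subject);
  }
  return false;
}

const work = 'Full text of the original report';
const report = makeClaim(sha256Hex(work), 'I wrote this report');
const denial = makeClaim(report.id, 'The report is false');
const rebuttal = makeClaim(denial.id, 'Here is further evidence');
const claimsById = new Map([report, denial, rebuttal].map((c) => [c.id, c]));

console.log(verifies(rebuttal, claimsById, work)); // true
console.log(verifies(rebuttal, claimsById, 'altered text')); // false
console.log(verifies(rebuttal, claimsById, undefined)); // false
```

The last call shows the trunk problem: if the work file is missing, no claim further along the branch can be verified, no matter how well-formed it is.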

This is why files must always be accessible on the Po.et Network. They need to be available 24/7, forever, in order to ensure that the claims we make about creative work files are actually referring to the content we think we’re referring to. We accomplish that by employing a decentralized file system (IPFS).

The verifiable web eliminates the need for counterparty trust because we verify every claim and anchor every chain permanently to the Bitcoin blockchain. Nobody can make a claim and then later deny making the claim. Nobody can alter somebody else’s claims. Nobody can claim they authored a movie in 2020 if the script was already scribed on the Po.et Network by somebody else in 2018, and nobody can fool you into making claims about the wrong content.

If you’d like to get started writing content or software for the verifiable web, check out the existing integrations or the Frost API, which makes it easy to write your own custom integrations.
