Tredd: Trustless Escrow for Digital Data

So you have some content that you want to trade for payment on the Internet. What could go wrong?

Plenty. But in this article we’ll discuss technological fixes for many of those things and present Tredd, a working prototype of a system that implements them.¹

Atomic swaps

Here’s a thing that can go wrong: If you send the content first, the buyer might keep it without ever sending your payment. Or they might send the wrong amount, or pay in the wrong currency, or just take forever to pay.

On the other hand, if you require payment first, you may have difficulty convincing buyers that you can be trusted to send the content, or to send the right content, or to send it in a timely fashion. It might not be that you’re a crook. You may simply have technical problems delivering what you promised, and then there goes your reputation.

You might choose to employ an escrow agent: a trusted intermediary who receives your content and the buyer’s payment, releasing each to the other when satisfied that both parties have met their obligations. But perhaps you don’t want to place your trust in a third party, or pay their fee or deal with their latency (the time it takes them to complete a transaction).

It would be nice if an escrowed exchange could be trustless and automated. You would somehow send your content, and the buyer send their payment, at the same moment, so that you can’t fail to deliver the content if you receive the payment, and they can’t fail to pay if they receive the content. And this is possible — using a blockchain. We’ll get into details later about exactly how, but for now it’s enough to say that blockchains allow two parties to make a simultaneous exchange, also called an atomic swap.

Encryption

Using a blockchain doesn’t solve every problem, and it introduces a couple of new ones. It doesn’t guarantee that the payment you receive will be the right payment, and it doesn’t guarantee that the content the buyer receives will be the right content. It also requires that you place your content in the blockchain transaction where the atomic swap happens, which means not only the buyer but everyone observing the blockchain will get a copy of it! Furthermore, blockchain transactions must generally be short. They are not the place for transmitting documents of more than a few hundred bytes.

That’s OK — we can fix those problems. Let’s say that you send your content to the buyer first, without using a blockchain — but you send it encrypted. After they receive the encrypted content, you and the buyer use a blockchain to atomically swap payment for the decryption key. The key is short, so you’re not overloading a blockchain transaction with too much data. And blockchain observers who see the key in your atomic-swap transaction learn nothing about the content you’ve sent to the buyer separately.

Refunds²

This leaves the problem of both parties being sure of what they’ll receive when the atomic swap happens. You want to be sure not to reveal the decryption key if the payment is wrong. The buyer wants to be sure not to send payment if the content is wrong.

How can the buyer know if the content is wrong before getting the decryption key and deciphering the content? In a nutshell, they can’t. The next best thing for the buyer is to have some way to get a refund if discovering, after decryption, that the content is wrong.

This means that the atomic swap transaction needs to include a built-in delay before you (the seller) can collect your payment. The delay gives the buyer a chance to prove that you sent the wrong content, which in turn should allow them to collect a refund. If the delay passes without such proof, the payment is yours to keep.

Hashing

How might the buyer prove that the content they received (or possibly the decryption key) is “wrong”? You and the buyer would first have to agree on what constitutes “right.” This is where hashing comes in handy.

Hashing is the process of condensing some data of arbitrary length down to a single number in a given range. If the range is large enough, and the hashing algorithm is good enough (there are several possible ones to choose from), then the likelihood of a hash collision — two different inputs producing the same hash value — is so low that the hash value for any given input can be considered totally unique. Importantly, a good hashing algorithm is a one-way function — given some data, it’s easy to compute its hash, but given only a hash it’s effectively impossible to figure out what data it’s the hash of.

When you send the encrypted copy of your data to the buyer, you can also send along the hash that the decrypted copy of the data should have. Better still, the buyer can request the data from you by its hash value: “Give me the file whose hash is H.”³ Your response, containing the encrypted copy of the data, is an implicit promise that when decrypted, the data will have the proper hash.

Smart contracts

When it’s time for the atomic swap, the buyer can use a smart contract to attach a condition to their payment:

If I can prove before time T that the decrypted data does not produce hash H, I get a refund.

They construct a partial transaction and send it to you. You review it, making sure it includes the right payment, the right H, and an acceptable value for T. If it all looks good, you add your decryption key K and publish the transaction to the blockchain.

The partial transaction can be designed so that it’s invalid without your addition of key K, making it safe for the buyer to share it with you. You can’t run off with their payment. It also allows you to avoid revealing the decryption key if the payment is wrong — just throw away the partial transaction instead of completing and publishing it.

So you publish the transaction and the buyer gets the decryption key (by observing the blockchain and seeing the transaction appear there). In the happy case, they decrypt the content and verify that it produces hash H, and you keep their payment. But suppose it doesn’t produce hash H — how do they prove it?

They can’t just present the decrypted data and say, “Look, it has the wrong hash.” That could be any data with the wrong hash, not necessarily what you sent. They might just be scamming, trying to get their payment back. Similarly, they can’t just present the encrypted data.

From the point of view of the smart contract (which decides whether a refund is warranted) there has to be a way to connect the data you sent with the data they claim is wrong. Fortunately this is easy to do. The buyer just changes the condition in the smart contract from:

If I can prove before time T that the decrypted data does not produce hash H, I get a refund.

to:

If I can prove before time T that the encrypted data whose hash is E produces a hash other than H when decrypted with key K, I get a refund.

In other words, they include a hash of the encrypted data in the partial atomic-swap transaction they send you. Before you reveal K (by completing and publishing the transaction), you check that E matches the encrypted data you sent, in addition to checking the payment and the values T and H.

So you publish the transaction, revealing K and implicitly agreeing to E, T, and H. The buyer discovers the decrypted data does not produce hash H. To claim their refund they would now have to present the data you sent; show that its hash is E; and show that the result of decrypting it with K gives data whose hash is not H.

But this returns us to the problem of blockchain transactions being unsuitable for the inclusion of large data files. If the buyer needs to create a refund-claiming transaction showing a two-megabyte file has hash E, they simply can’t.

We need a way to limit the amount of data required by a refund proof. How can we do that?

Let’s start by dividing your data file into chunks of no more than, say, a kilobyte each. You send the data file to the buyer as a series of N encrypted chunks, C₁ through Cɴ. To claim a refund, the buyer needs only to show one chunk that decrypts wrong, not the entire file.

Now, however, each individual chunk, both encrypted and decrypted, needs its own hash if it’s going to be possible to show that:

  • it’s one of the chunks you sent, and
  • its hash isn’t the one that’s expected.

That means that instead of including a single E and a single H in the smart contract of the atomic-swap transaction, the buyer has to include E₁, E₂, …, Eɴ (one for each chunk) and H₁, H₂, …, Hɴ. The refund-claiming transaction could then present some encrypted chunk Cᵢ, show that its hash is Eᵢ, and show that decrypting it and then hashing does not produce Hᵢ.

Unfortunately we are once again confronted by the severe limit on blockchain transaction sizes. Even though hashes are smaller than chunks — typically 32 bytes each — this is still prohibitively many bytes for even modest-sized files.

Merkle hash trees

Luckily we still have a couple of tricks up our sleeves, and the first one is called a Merkle hash tree. This is a data structure with two useful properties:

  1. It condenses a sequence (of any length) of strings (of any length) down to a single hash, called the root hash; and
  2. It allows the creation of a compact proof that a given string is part of the sequence.

How compact is a Merkle hash tree proof? Its size grows logarithmically with the number of items in the tree. This is quite good. The proof — a sequence of entries — grows by just one entry every time the number of items in the tree doubles. If you multiply the number of items in the tree by a thousand, you add only ten entries to the proof.

Now it’s possible to change the smart-contract condition from:

If I can prove before time T that the encrypted data whose hash is E produces a hash other than H when decrypted with key K, I get a refund.

to:

If I can prove before time T that:
• the encrypted chunk Cᵢ has a hash that is part of a Merkle tree with root hash E;
• when Cᵢ is decrypted with K it produces a hash other than Hᵢ
then I get a refund.

With this formulation it’s no longer necessary to list E₁, E₂, …, Eɴ in the smart contract. But it is still necessary to list H₁, H₂, …, Hɴ, because a Merkle proof can only show that an item is in the tree, not that an item is not in the tree. In other words, there’s no way to say “when Cᵢ is decrypted with K it produces a hash that is not in the Merkle tree with root hash H.”

Unencrypted prefixes

Our final trick is to prefix each clear-chunk hash Hᵢ with its number, from 1 through N, before putting it in the Merkle hash tree and computing H. We also prefix each encrypted chunk Cᵢ with the same number, unencrypted, and use these prefixed chunks to build the Merkle hash tree with root hash E. Now when the buyer finds that some chunk Cᵢ, when decrypted with key K, produces a chunk whose hash is not Hᵢ, they can claim a refund by showing:

  1. the prefix for both Cᵢ and Hᵢ (i.e., the number i)
  2. the chunk Cᵢ
  3. the hash Hᵢ
  4. a Merkle proof showing that prefix+Cᵢ is in E
  5. a Merkle proof showing that prefix+Hᵢ is in H

The smart contract verifies the proofs, which show that Cᵢ is indeed one of the encrypted chunks you sent, and Hᵢ is indeed the hash it should have when decrypted, because it has the same unique prefix in H that Cᵢ has in E. It only remains for the contract to decrypt Cᵢ itself (using K, which was revealed to the contract earlier) and hash it. If the result isn’t Hᵢ, the refund is authorized.

TxVM

We now present an implementation of these concepts using the TxVM blockchain design from Chain (now Interstellar). TxVM is optimized for the transfer of user-defined tokens secured by compact and efficient smart contracts.

The TxVM smart contract language includes native operations for hashing strings, needed for verifying Merkle proofs. It does not support encryption or decryption, but it does permit bitwise XOR of strings. Since the smart contract is required to perform decryption (when the buyer claims a refund), we must define our cipher in terms that TxVM can work with.

As discussed above, the content being transferred is already divided into N chunks. The implementation sets a chunk size of 8 kilobytes and numbers them from 0 through N-1. Each chunk is further divided into 256 “subchunks” numbered 0 through 255. Each subchunk is 32 bytes long. We encrypt or decrypt a subchunk by XORing it with the 32-byte hash of K||c||s,⁴ where K is the random key selected by the server, c is the chunk number, and s is the subchunk number, both numbers encoded in TxVM format.

The Go server and client implements this cipher here. The smart contract decryption subroutine in TxVM appears here.

A transfer begins when a buyer requests some content from a seller by its clear-chunk Merkle root hash. The request includes the amount and asset type of the proposed payment, along with proposed deadlines for the seller to reveal its decryption key and for the buyer to claim a refund if warranted. These proposal parameters allow the seller to reject the request as early as possible if necessary.

If the seller accepts the request, it responds by choosing a random cipher key and sending a stream of clear-chunk-hash/cipher-chunk pairs to the buyer, one pair for each chunk. It also computes the cipher-chunk Merkle root hash for the chunks it sends.

As discussed above, both the clear-chunk Merkle root hash and the cipher-chunk Merkle root hash are computed from strings that are prefixed with the (unencrypted) chunk number, encoded with LEB128 (which TxVM can interpret with the int instruction).

The buyer receives and stores the clear-chunk-hash/cipher-chunk pairs. It independently computes the cipher-chunk Merkle root hash. It also verifies that the clear-chunk Merkle root hash (computed from the received clear-chunk hashes) has the expected value. It then constructs a partial TxVM “payment proposal” transaction and sends that partial transaction to the seller.

The payment proposal transaction begins by unlocking enough of the buyer’s utxos to cover the proposed payment. They are merged together and placed on the argument stack to be consumed (along with other parameters) by the Tredd contract. Excess “change” is paid back to the buyer. [Code.]

The utxos are presumed to be locked with the standard “pay to signature program” contract. Unlocking them means signing a program that verifies they’re being used in the right transaction. Normally that program is one that checks the transaction ID, but this partial transaction doesn’t have a transaction ID yet. So instead we construct a signature program that inspects the transaction log to ensure it includes a call to the Tredd contract. The call must spend the merged utxos and have the right other parameters. If there’s change, the signature program also ensures it’s sent to the buyer. All of this is to ensure that the unlocked utxos cannot be used in any unintended way.

Once the payment and other parameters are marshaled onto the TxVM argument stack, the Tredd contract is inlined and called.

The partial transaction is sent to the seller. The seller executes it (encountering an expected “transaction not finalized” error) and inspects the transaction log to verify the Tredd contract, the payment, and the other parameters. The seller can reject the proposal if its expectations aren’t met.

The partial transaction leaves the TxVM virtual machine with phase 2 of the Tredd contract deferred (with yield) and on the top-level contract stack. To complete the transaction, the seller must call it with additional parameters: the random cipher key chosen earlier (making this the “reveal-key” transaction), the seller’s own public key, and a collateral payment.

We haven’t yet discussed the need for a collateral payment from the seller. It’s there to protect the buyer from a “griefing attack” by the seller, where the buyer commits funds to a Tredd transfer and the seller fails to publish the reveal-key transaction, or supplies the wrong key or data. In such cases the buyer recovers the funds they committed, of course, but additionally the seller forfeits their collateral payment to compensate the buyer for the resources dedicated to storing cipher chunks and for the time value of the locked-up funds. This implementation requires a collateral payment equal to the buyer’s payment.

So the seller marshals utxos, merging them together for use in the Tredd contract, splitting off any change and paying it back to itself. The seller also splits off a zero value for use by the finalize instruction. It adds other parameters, calls phase 2 of the Tredd contract, adds a finalize, and finally adds signatures for the new utxos (using the normal transaction-ID-verifying signature program, since the transaction is now finalized).

The seller publishes this transaction to the blockchain. This will fail if any of the utxos involved have already been spent elsewhere or if the reveal deadline has passed.

The buyer has been observing the blockchain. Once the seller publishes the completed transaction, the buyer recognizes it and parses its transaction log to discover the cipher key. It uses this to decrypt the cipher chunks it has stored. If every decrypted chunk’s hash matches the expected hash received earlier from the seller, then the buyer is done: it has made its payment and has received the promised content. The seller waits for the refund deadline to pass, then publishes a simple transaction claiming its payment from the buyer (and reclaiming the collateral).

On the other hand, if any decrypted chunk has the wrong hash, the buyer constructs a transaction presenting the complete faulty cipher chunk, the expected clear-chunk hash, and Merkle proofs that both of those values belong to the agreed-on Merkle root hashes. The Tredd contract verifies the claims (checking also that the refund deadline has not passed) and returns the buyer’s payment together with the seller’s collateral.

Open issues, out of scope, and future work

Before Tredd can be truly useful, a few other technological developments are needed, some more directly related to Tredd itself and some less so.

The TxVM blockchain. TxVM is a compact, efficient, and safe framework for value transfer and smart contracts, but it has three serious drawbacks:

  1. The programming language is very low-level;
  2. There is no distributed consensus model for TxVM;
  3. There is no privacy — all details of all transactions are public.

Before TxVM, Chain’s blockchain model used a different low-level language. A separate language, Ivy, allowed the creation of sophisticated smart contracts by layering high-level syntax and semantics on top of that. TxVM needs its own version of Ivy to simplify the creation, debugging, and maintenance of contracts and subroutines like the ones in Tredd.

The TxVM blockchain presently is used only in Sequence, a hosted-ledger product that has no need for a decentralized, trust-minimized network. To use TxVM for a real-world deployment of Tredd it would be necessary to distribute a shared ledger, which means using some consensus algorithm to adopt new blocks for the TxVM blockchain. An experiment in combining TxVM with the Stellar Consensus Protocol is in early stages.

Finally, a successor to TxVM is in development, called ZkVM, that includes strong privacy guarantees.

The content-addressable web. Tredd is an illustration of how the payment layer of a content-delivery system can work. It is not the full system itself. It relies on the buyer being able to discover and trust the hash of the content he or she wishes to buy. Without that trust, a buyer may request the content with hash H from a seller, only to discover — after payment, delivery, and decryption — that H was the hash of the string “Haha sucker.”

Projects like IPFS and IPLD (and to a lesser extent Perkeep) aim to update the data model of the existing World Wide Web to make it content-addressable. They are creating the infrastructure by which data is linked to by its own hash. Trusting a hash in such a system is akin to trusting a link on a conventional web page: it might be a link to malware or a rickroll but if it’s coming from a place you already trust then it’s probably going to be OK. Of course “probably” only goes so far when making irrevocable online payments, but on the other hand, Tredd is best used for small payments: think nickels for newspaper articles, not dollars for Hollywood blockbusters. (The existing credit-card-based system for buying those seems to suffice, and would be hard to displace in any event.) If a site fools you into paying a nickel for the wrong content, you’ve lost that nickel but gained information about that site that’s at least as valuable.

Piracy. What’s to stop a buyer who pays for some data from turning around and reselling it to others?

Technologically, nothing. But that’s a feature, not a bug. It incentivizes decentralized distribution and should tend to keep the price of content low, as buyers will have a choice of competing sellers for a given piece of content.

Content creators hoping to earn more should be able to charge a premium for seeding the network with the earliest copies of it. The buyers willing to pay that premium know that they can earn it back by reselling at lower prices but in larger volume.

Of course there are also non-technological means for preventing buyers from reselling digital data they buy, including intellectual property laws. Enforcement of those laws could conceivably be aided by the public and immutable history of Tredd transactions on the blockchain.

Acknowledgments

I am grateful to Dan Robinson, Oleg Andreev, Keith Rarick, and Nathaniel Borenstein for their feedback during the development of Tredd. Thanks also to the folks at Chain (now Interstellar) for supporting work on predecessors of Tredd, for developing and open-sourcing TxVM, and for educating me about blockchains and cryptography in the first place.

Endnotes

  1. Tredd is an example of a “zero-knowledge contingent payment” system. The first such payment was demonstrated with Bitcoin in 2016.
  2. https://youtu.be/fQaavQNGsMY
  3. Using a value derived from the content of a document as the identifier for that document is called content-addressability and has some useful properties. See also the section on Future work.
  4. || denotes concatenation.