Using the Truebit Filesystem

Harley Swick
Published in Truebit
Jan 16, 2019 (7 min read)

Overview

If you are following the Truebit project, you have hopefully heard about truebit-os. If you haven’t, you should read this post before continuing: https://medium.com/truebit/developing-with-truebit-an-overview-86a2e3565e22. In summary, truebit-os is the client that network participants (miners) run to earn TRU tokens. The client handles the interactions between Ethereum and off-chain functionality such as the WebAssembly interpreter and IPFS. The Truebit filesystem is a component of the truebit-os system. Funnily enough, the inclusion of the filesystem is why it is called “truebit-os” and not “truebit-client”. Like a normal operating system, truebit-os uses the filesystem to work with data the way one would expect from a UNIX system. What is novel is that this filesystem is built on top of Ethereum and IPFS. The complexity involved is hidden by the abstractions inside truebit-os.

The part we can’t fully abstract away is how Task Givers make their data accessible to the rest of the Truebit network. When a Task Giver submits a task, they also submit a piece of data that allows the filesystem to retrieve the program and its input data. Before we go into how files are represented and interacted with, we should take some time to go over the storage types Truebit currently offers. Each storage type has its own trade-offs, so it is important to discuss how each one works along with its pros and cons.

The first type of storage is simple Ethereum bytes. Note that the data is persisted in the Truebit filesystem’s own contract. This is useful for small pieces of data that later need to be used on-chain, and you don’t have to worry about data availability. On the other hand, it can be expensive depending on the current gas price, and the amount of data is limited by the gas limit. The filesystem is not meant to work with storage data on arbitrary contracts: we do not support retrieving data from any random smart contract. Access to particular smart contract methods as data sources can be added in the future. For now, the only methods the client supports are the ones provided by the Truebit filesystem.

The second type of storage is encoding the data inside a smart contract and deploying it. This is not a typical functioning contract with methods written in Solidity. It is the literal binary data, which is stored as contract code using low-level calls to the EVM. Such an approach is a more unusual use of a deployed contract, but it is functional. Be advised that the trade-offs are similar to those of the storage type above (filesystem bytes). Users might like this type when they want to reuse the data and be absolutely certain that the data is accessible in the future.
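To make the idea of "data as contract code" more concrete, here is a minimal sketch of how the code of a deployed contract can be read back on-chain as raw bytes with the EVM's EXTCODESIZE and EXTCODECOPY opcodes. The library name CodeAsData and its read function are hypothetical and not part of truebit-os, and whether the Truebit filesystem reads data in exactly this way is an assumption.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical helper for illustration; not part of truebit-os.
library CodeAsData {
    /// Copies the runtime bytecode of `dataContract` into memory,
    /// treating the code itself as the stored payload.
    function read(address dataContract) internal view returns (bytes memory data) {
        uint256 size;
        assembly {
            // Number of bytes of code deployed at the address.
            size := extcodesize(dataContract)
        }
        data = new bytes(size);
        assembly {
            // Skip the 32-byte length word at the start of `data`,
            // then copy the whole code starting at code offset 0.
            extcodecopy(dataContract, add(data, 0x20), 0, size)
        }
    }
}
```

Because the payload lives in contract code, it stays available for as long as the contract exists, which matches the availability argument above.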

The third type of storage, and probably the most popular, is IPFS. This storage type certainly scales much better than the previous types, and you won’t have to worry about the gas limit. The gas limit is not very large, so most useful files will probably be stored here. Beware: IPFS does not have the same data availability guarantees as Ethereum. There is no guarantee the data is replicated. Jason Teutsch wrote an article outlining this problem: https://medium.com/truebit/a-file-system-dilemma-2bd81a2cba25. Luckily, there are some tricks to get around this that should work for most applications. The easiest trick is to take advantage of the built-in incentives a particular Dapp provides. Dapp developers (Task Givers) have a vested interest in solving this problem, so they are incentivized to run IPFS nodes. They get around the ephemeral data problem by running their own IPFS nodes that replicate their task data, thus providing a much stronger data availability guarantee for their tasks. Alternatively, the protocol itself can incentivize participants to run the nodes (this would be more decentralized, but slightly more complex).

Using Truebit Filesystem IRL

Now that we have covered the different types of storage provided by the Truebit filesystem, we will go into detail on how to use it. Truebit tasks take input by reading from a file and output data by writing to a file. These are normal concepts for regular programs, but might seem out of place for a blockchain application. The functionality is rather straightforward, though.

The Truebit filesystem smart contract makes use of two basic entities: bundles and files. Bundles are like directories in that they are just a collection of files. When we submit a task, we must submit a bundleID so the filesystem knows how to retrieve the WASM code and any input data. Task Givers can mix and match the types of files they want to use for a particular task as they see fit. The bundle contains the list of these files.

What constitutes a “file” in the Truebit filesystem depends on its file type. Earlier, I outlined the different file types supported: bytes, contracts, and IPFS. If the data is stored in bytes, then all that is needed is the fileID. If it is a contract, then the contract address is used to retrieve the data. IPFS data is retrieved using the IPFS hash.
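As a mental model of bundles and files, the relationships described above could be sketched with the following declarations. Every name and field here is hypothetical. This is not the storage layout of the deployed Truebit filesystem contract, only one way to picture it.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Mental model only: names and fields are hypothetical and do not
/// mirror the storage layout of the deployed Truebit filesystem contract.
contract FilesystemModelSketch {
    // The three storage types discussed in this post.
    enum StorageType { Bytes, Contract, IPFS }

    struct File {
        string name;
        StorageType kind;
        uint256 size;
        bytes32 root;            // Merkle root, supplied for contract and IPFS files
        bytes32[] data;          // used when kind == Bytes
        address contractAddress; // used when kind == Contract
        string ipfsHash;         // used when kind == IPFS
    }

    struct Bundle {
        bytes32[] fileIDs;  // the files collected in this bundle
        bytes32 codeFileID; // the WASM code file for the task
    }

    // Files and bundles are looked up by their bytes32 identifiers.
    mapping(bytes32 => File) internal files;
    mapping(bytes32 => Bundle) internal bundles;
}
```

The point of the sketch is that a bundle only stores identifiers, so files of different storage types can sit side by side in the same bundle.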

Hopefully, at this point you have at least a rough idea of how the filesystem works. Now we will walk through how to do this in Solidity code. Before a Task Giver can call createTask, they need to make sure their task’s code file and input data are accessible. The filesystem contract provides a variety of methods. For more details, you can check out an example here: https://github.com/TrueBitFoundation/truebit-os/blob/master/scrypt-data/contract.sol

The rest of the post will assume you are using an interface around the filesystem smart contract. The first method we will need to use is to create an empty bundle.

bytes32 bundleID = filesystem.makeBundle(nonce);

The nonce is a unique number used to generate filesystem identifiers. We use the nonce for both bundleIDs and fileIDs. Uniqueness is the responsibility of the transaction sender. The easiest solution is to use a random number generator to make an integer with several digits. The bundleID is simply a hash of the nonce and the transaction sender’s address (the msg.sender field provided by the EVM).
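To illustrate why uniqueness falls on the sender, here is a small sketch of how an identifier could be derived from a sender address and a nonce. The hash function (keccak256), the packed encoding, and the argument order are all assumptions, and IdSketch and deriveId are hypothetical names. The deployed contract may compute its IDs differently.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical illustration; the deployed contract may encode or order
/// the inputs differently.
library IdSketch {
    /// Derives an identifier from the sender and a caller-chosen nonce.
    function deriveId(address sender, uint256 nonce) internal pure returns (bytes32) {
        return keccak256(abi.encodePacked(sender, nonce));
    }
}
```

Under this scheme, the same sender reusing the same nonce would get the same ID back, while two different senders can safely pick the same nonce because their addresses differ.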

Now that we have an empty bundle we will need to start creating some files. Here is how to create a file for the three different types of storage provided.

For persisting a file as bytes in the contract storage we do this:

bytes32 fileID = filesystem.createFileWithContents("input.data", nonce, data, data.length);

We named the file “input.data”. We need a nonce provided by the user for the unique ID. We need the data, which must be of type bytes32[], and the last argument is the length of data.
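Since the data has to be passed as bytes32[], raw bytes need to be split into 32-byte words first. The following is a minimal sketch of one way to do that. WordPacker and toWords are hypothetical names, and the layout (left-aligned bytes, zero padding in the last word) is an assumption about what the filesystem expects. In practice this packing would more likely be done off-chain before sending the transaction.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical helper; the word layout expected by the filesystem
/// contract is an assumption here.
library WordPacker {
    /// Packs `raw` into 32-byte words, left-aligned, zero-padding the last word.
    function toWords(bytes memory raw) internal pure returns (bytes32[] memory words) {
        // Round up so a partial trailing chunk still gets its own word.
        words = new bytes32[]((raw.length + 31) / 32);
        for (uint256 i = 0; i < raw.length; i++) {
            // Place byte i at position (i % 32) of word (i / 32), counting from the left.
            words[i / 32] |= bytes32(raw[i]) >> ((i % 32) * 8);
        }
    }
}
```

For example, 33 bytes of input produce two words, and the second word holds one data byte followed by 31 zero bytes. The byte-by-byte loop favors clarity over gas efficiency.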

If we want to add a deployed smart contract to the Truebit filesystem we do it like this:

bytes32 fileID = filesystem.addContractFile("input.data", nonce, contractAddress, root, size);

We named the file “input.data” like earlier, and again we need a nonce. The key piece to note is the contractAddress. This is the ordinary Ethereum address (20 bytes) of the deployed contract that holds the data. We will also need the Merkle root of the bytes in data. There are a number of ways to generate this root. truebit-os contains a node module to do this: https://github.com/TrueBitFoundation/truebit-os/blob/master/wasm-client/merkle-computer/merkleRoot.js. Finally, we need to provide the length in bytes as the size argument. Please note that putting any random smart contract in here will not work the way you expect.

At this point some of the patterns should start to be apparent, and registering IPFS data is not very different.

bytes32 fileID = filesystem.addIPFSFile("input.data", size, ipfsHash, merkleRoot, nonce);

size is the byte length of the IPFS file. ipfsHash is the IPFS file identifier. merkleRoot is the hash created by merklizing the IPFS file being used as input. The nonce is obtained like we did earlier.

We’ve created our file, but it isn’t inside our bundle yet. We can fill up the bundle with our files like this:

filesystem.addToBundle(bundleID, fileID);

After the files are associated with the bundleID, we have one more step. A bundle is not complete without the code file. In terms of storage, code files are indistinguishable from input or output files. However, the Truebit system does treat them somewhat differently, so it is helpful to distinguish them. Also, for the most part, a code file will be reused for many tasks, whereas the input data and output data will probably be different per task. So we include the code file in the bundle when we finalize it. The process of finalization involves generating a Merkle root of the bundle and storing it for use in our Truebit tasks. This is done by calling this method:

filesystem.finalizeBundleIPFS(bundleID, codeFileID);

All we need is the bundleID and the fileID of the code file. And that’s it! We can now use this bundle with our Truebit task.
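Putting the steps together, a Task Giver contract might look roughly like the sketch below. The function names and argument order in IFilesystem follow the calls shown in this post, but the parameter and return types are guesses, and BundleBuilderSketch and buildBundle are hypothetical names. Check the deployed contract's ABI before relying on any of this.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Assumed interface: function names and argument order follow the calls
/// in this post; parameter and return types are guesses.
interface IFilesystem {
    function makeBundle(uint256 nonce) external returns (bytes32);
    function createFileWithContents(
        string calldata name,
        uint256 nonce,
        bytes32[] calldata data,
        uint256 size
    ) external returns (bytes32);
    function addToBundle(bytes32 bundleID, bytes32 fileID) external;
    function finalizeBundleIPFS(bytes32 bundleID, bytes32 codeFileID) external;
}

/// Hypothetical Task Giver helper, for illustration only.
contract BundleBuilderSketch {
    IFilesystem public filesystem;

    constructor(address filesystemAddress) {
        filesystem = IFilesystem(filesystemAddress);
    }

    /// Builds a bundle holding one bytes-backed input file, then finalizes
    /// it with a code file that was registered earlier.
    function buildBundle(
        uint256 bundleNonce,
        uint256 fileNonce,
        bytes32[] calldata data,
        bytes32 codeFileID
    ) external returns (bytes32 bundleID) {
        // 1. Create an empty bundle.
        bundleID = filesystem.makeBundle(bundleNonce);
        // 2. Store the input data as bytes in the filesystem contract.
        bytes32 fileID = filesystem.createFileWithContents(
            "input.data", fileNonce, data, data.length
        );
        // 3. Put the file into the bundle.
        filesystem.addToBundle(bundleID, fileID);
        // 4. Finalize the bundle together with the code file.
        filesystem.finalizeBundleIPFS(bundleID, codeFileID);
    }
}
```

One detail worth keeping in mind: when a contract makes these calls, the msg.sender seen by the filesystem is the address of that contract, not the externally owned account that started the transaction, so nonces only need to be unique per calling contract.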

Conclusion

Storing data in a trustless way is not exactly trivial. It is my hope that this article and the Truebit filesystem itself provide some insight into how to solve such problems. Please keep in mind that a lot of this tech is very primitive. Personally maintaining an IPFS server and relying on Ethereum storage are not exactly ideal, but they work today. As new tech arises in this domain, we plan to extend the filesystem using the protocol outlined above. We are very excited to see what is created by the greater decentralized community.
