Take a moment to contemplate the complexity of the software infrastructure that increasingly permeates all aspects of our lives. It involves operating systems, software libraries, tools, protocols, programming languages, compilers, and other technologies. This environment is the result of a continuous global effort during the past 50 years and has consumed billions of hours of work. It is this network of interdependent components that allowed software to become so pervasive. Without it, none of the services and applications we have come to rely on would have been possible, or even conceivable.
Blockchains are a comparatively recent breakthrough. The key innovation of Bitcoin was a mechanism through which a network of agents can maintain decentralized consensus over a shared state. In this context, decentralization means that the state is maintained by the participants themselves in a way that does not require mutual trust. This state contains, among other data, a payment system. The stake that each participant holds in the resulting economy works as their incentive for making this state widely available and for rejecting invalid changes to the state. This virtuous cycle makes the system self-sustaining: The payment system is built on top of the decentralized consensus, which only functions because of the incentives created by the payment system itself.
Ethereum significantly generalized this idea by allowing transactions to be expressed as arbitrary programs, or smart contracts. It soon became apparent that programmable transactions over a decentralized state could be used for much more than simply maintaining a payment system. Blockchains essentially became equitable, decentralized ‘’world computers’’ that cannot be turned off and cannot be controlled by individuals or institutions. A new kind of (decentralized) application emerged, called DApp, to run on these world computers.
When compared with traditional computer programs, DApps face two major limitations. The first is one is scalability. Because smart contracts must be executed by all participants in a blockchain, DApps are crippled by stringent limits on computation, slow transaction rates, tight storage space, and high price. This issue has been widely recognized as one of the key factors limiting the growth of blockchain adoption. As a result, several current projects focus on developing scalability solutions for blockchains (e.g. Plasma, sharding, side chains, state channels, etc).
The second limitation has received much less attention: primitive software infrastructure. Here, the culprit is the lack of an operating system. Operating systems are the basis upon which all of the past 50 years of software development is built. Cut off from all this previous work, DApp developers struggle to accomplish tasks that are trivial for conventional developers, like transferring files, compressing data, finding the color of a pixel in an image, or querying a database for a record. Even the concept of a file itself is missing!
There is a chasm between today’s immature blockchain software and the vast software infrastructure that is readily available and running on our laptops and smartphones.
DApps should be capable of performing computations like the ones “centralized” applications regularly do. DApp development should have access to the languages, libraries, and tools that are standard in modern operating systems like Linux. Going even further, we should strive for a higher-level infrastructure that is generic and scalable while abstracting away the nuts and bolts of individual blockchains and consensus layers. Developers should be able to focus on their specific DApp logic instead of dealing with the intricacies of state channels, side-chains, and the blockchain itself. DApps should be easily portable across different blockchains.
This kind of infrastructure would blur the lines between the skills required to develop centralized and decentralized applications. It would pave the way for a new generation of DApps that are as inconceivable today as the modern Internet was 50 years ago. We at Cartesi have been pursuing this inevitable future. To that end, we specified and implemented a decentralized Linux infrastructure for scalable blockchain applications.
Before getting into the weeds of our project, let us take a short detour into the basics of layer-2 systems for scalable DApps. Systems like Plasma, state channels, TrueBit, and Cartesi differ from each other in many respects, but they are based on the same fundamental concepts.
Blockchains achieve consensus by redundancy, replicating computation and data across the entire network. Because of that, DApps cannot rely on them for high-frequency transactions, for complex processing, or for storing large amounts of data. This would be expensive, wasteful, and inefficient. The range of practical DApps is severely constrained by these limitations. In contrast, centralized applications running on servers on the cloud (or even on our personal devices) do not suffer from these limitations. What we need is to combine the performance and generality of conventional software systems with the consensus and security guarantees of blockchains.
Although global consensus is an expensive resource, most of the data and processing associated with a DApp (i.e., apart from financial aspects) are of no interest to the entire network. They only matter to the few parties involved in that specific DApp. Because of this, it is almost always sufficient for the involved parties to maintain their own local consensus. The strong guarantees offered by the global consensus of the entire blockchain network can be used sparingly to resolve eventual disputes between these parties.
The general idea is as follows. All parties involved with a given DApp are bound together by smart contracts with well defined crypto-economic incentives that reward honest behavior and punish dishonesty. A common design is to have smart contracts require all involved parties to place a deposit as collateral before engaging with the DApp. When all participants agree with a change to the local state of the DApp, they communicate their agreement and set of changes to the blockchain. The blockchain then reacts by modifying the relevant global state. When any involved party has proof of misbehavior, they can appeal to the blockchain before the changes become permanent.
The blockchain becomes a referee that arbitrates in favor of the honest party, rewarding him with the collateral seized from the bad actor.
This simple yet powerful idea minimizes blockchain resource consumption while preserving the strong security guarantees of the entire network. This arbitration mechanism is inspired by age-old interactions between humans. We don’t engage the supreme court before each human interaction. The vast majority of transactions, even those involving money, are settled locally. We don’t call a lawyer before we go out and buy fruits in a street market. That would be extremely inefficient. Instead, we have a system in place that is used only when necessary: The judicial system is only engaged in case of legal disputes.
Applying this metaphor to well-designed blockchain systems involving crypto economic incentives is the key to efficient, scalable DApps.
Cartesi: Linux on the blockchain
We have just seen how a layer-2 system can decouple local interactions from the blockchain while preserving its security guarantees. We can now use this decoupling to significantly increase the expressiveness and power of the infrastructure. In particular, we create a paradigm shift in DApp development by transforming the platform from what feels like a toy spreadsheet into something that behaves like a real computer running Linux.
The Cartesi Core (see figure 1) is a set of basic technologies that constitute the first step towards the vision we have been discussing. It includes on-chain and off-chain components. The off-chain component is what we call the Cartesi Node. In a nutshell, each participant that wants to interact with a Cartesi DApp does so through a Cartesi Node. They can either run their own Cartesi Node or hire a third party that does to represent their interests.
The essential module in the Cartesi Node is the Cartesi Machine, a custom virtual machine that emulates a RISC-V microprocessor and runs an embedded Linux distribution. When developing a DApp with Cartesi, most of the decentralized logic will be coded to run under Linux off-chain, instead of coded as smart contracts and deployed on-chain.
Cartesi Machines are special in many ways. First, they are completely self-contained. This means that there are no external sources of entropy that can alter the machine state in a non-deterministic way (such as a keyboard, or mouse, or a wall-time clock). Second, they are reproducible. In other words, when two Cartesi Machines start from the same initial state and run for the same number of processor cycles, they reach exactly the same final state. This is true regardless of the architecture and operating system hosting the Cartesi Node. Finally, Cartesi Machines offer full transparency. The state is specified and exposed in such a way that any external observer with access to the entire state at a given processor cycle can predict the entire state at the next processor cycle.
The self-containment and reproducibility properties ensure that any two parties that have access to the initial state of a Cartesi Machine will obtain the same results when they run the machine in their Cartesi Nodes. Naturally, this is vital for consensus. Unfortunately, it is not, in general, guaranteed by hardware architectures or virtual machines. The Cartesi Machine was designed specifically to satisfy these requirements. The transparency property means the entire state of a Cartesi Machine can be Merklized. Among other things, this enables the blockchain to represent a machine with a single hash. As we will see, these properties allow the blockchain to efficiently settle any disputes over the results of executing Cartesi Machines off-chain.
The Machine Representation
Off-chain, the Cartesi Machine is represented explicitly (see figure 2, left). Its main components are its memory and an assortment of drives. In a typical configuration, the Linux kernel is loaded into memory, and drive 1 contains a filesystem with an embedded Linux distribution. The distribution can be configured to boot and execute any program in this filesystem. A hypothetical program could process the input found in drive 2 and write the results to drive 3. From the outside, drives are simply files residing in the Cartesi Node. The Cartesi Machine exposes them to the Linux kernel as devices. From the point of view of programs running under Linux, they can be mounted as regular file systems and can contain multiple directories and files. Running the Cartesi Machine will produce an output file in the Cartersi Node that contains another filesystem, and inside this filesystem are the desired results of the computation.
On-chain, the Cartesi Machine is represented by the Merkle tree root hash of its entire state (see figure 2, right). This state includes the memory and drives, but also everything else needed by the Cartesi Machine’s operation (such as the values of all registers and devices). Merkle tree root hashes are special in that they support two important operations. First, it is possible to verify or reject any claim over the value of any piece of the state. Second, it is possible to obtain the new Merkle tree root hash corresponding to any modification to the state. These operations can be performed very efficiently within the blockchain. Smart contracts can, for example, replace an entire drive in a Cartesi Machine and obtain the corresponding new root hash. Conversely, given the final root hash for a Cartesi Machine, and the hash of an output drive, smart contracts can verify that the drive really belongs to that machine.
Consider two actors, Alice and Bob, both involved in a transaction controlled by a smart contract. Let us assume that the transaction involves some complex computation that is impractical to execute on-chain. Now assume further that the smart contract knows the initial root hash for a Cartesi Machine that, when booted, runs a program in drive 1 that performs the required computation using the input from drive 2 and writes the output to drive 3. Using efficient hash operations, the smart contract can replace drive 2 with its desired input. It can then demand from Bob the root hash that results from executing the corresponding Cartesi Machine off-chain.
There are now two possibilities. If Alice agrees with Bob, the smart contract can proceed by requesting from Bob details about drive 3, satisfied that both Alice and Bob will agree with the consequences. If Alice disagrees, she starts a dispute by depositing a collateral. If Bobs wants to defend against the dispute, he answers by depositing the same collateral. (Otherwise, after a timeout, he is considered to have forfeited.) What follows is an interactive protocol, involving Alice and Bob and mediated by the blockchain, that guarantees that whoever is being honest will win the dispute.
The verification game [Feige and Kilian 1997] is an algorithm that allows an arbiter with limited computational resources to mediate a game between two computationally unlimited players. In our case, the limited mediator is a set of smart contracts running on the blockchain. The unlimited players are the Cartesi Machines running on Alice’s and Bob’s Cartesi Nodes. This procedure is also used by TrueBit, and a detailed explanation can be found in this great article by Sina Habibian.
In a nutshell, using an n-ary search, a smart contract identifies the last processor cycle for which Alice and Bob agree on the root hash for the entire state of the Cartesi Machine. Then, the blockchain executes a single RISC-V instruction to obtain the root hash for the state that follows. Alice and Bob are known to disagree on this hash. By identifying who posted the correct results, the smart contract can arbitrate in favor of the honest party and reward him or her with both collaterals.
Running a single RISC-V instruction is painless and very cost-effective for the blockchain. Locating the instruction takes only logarithmic time and space. Moreover, this process is only engaged in case a dispute arises. Disputes are exceedingly rare, because cheating will always be caught, and cheating brings punishment.
A Complete Example: Turn-based Games
As a concrete example, we will show how the Cartesi Core can greatly simplify the work of a DApp developer implementing a turn-based game, such as Go or Chess. Alice and Bob will play for money, winner takes all. The players are represented in the game by the DApp components running inside their Cartesi Nodes. They start a new game by depositing the value of their bets into a smart contract. After this, the game continues almost entirely off-chain.
Alice and Bob keep the entire sequence of moves Merklized off-chain. Let’s call the root hash of this tree the sequence hash. When it is Alice’s turn to make a move, she appends it to the sequence, obtains a new sequence hash, and signs it. She then sends the move, sequence hash, and signature to Bob using a state channel. Bob validates Alice’s move, appends it to his copy of the move sequence, computes the sequence hash, confirms it matches Alice’s, and then verifies her signature. If everything checks out, it is now his turn. The game proceeds in this way until one of two things happen: they prematurely stop playing the game correctly, or one of them finally wins the game.
Let us assume Bob wants to claim victory. He posts Alice’s sequence hash, her signature, and his winning move to the blockchain. A smart contract first verifies Alice’s signature on the sequence hash. It then delegates the decision of who won to an “end-of-game” Cartesi Machine (running initially on Bob's node itself, as we will show shortly). This is a machine that, given a sequence of moves in drive 2 and a final move in drive 3, halts with a success exit code if the move wins the game, or with failure, if it doesn’t. The software that runs inside this Cartesi Machine was written by the developer in the programming language of his choice. It is free to take advantage of all the modern facilities and open-source software available within a Linux operating system running without any artificial resource limitations on a self-contained RISC-V machine.
To define the initial state of the end-of-game machine, the smart contract uses hash operations to replace drive 2 with the sequence of moves provided by Alice and to replace drive 3 with Bob’s final move. It demands that Bob run the corresponding machine off-chain and post the Merkle-tree root hash of its final state to the blockchain. If Bob is telling the truth, he will be able to defend against any potential challenges by Alice concerning the value of this hash. Bob will also be able to prove to the blockchain that the end-of-game machine halted in his favor by showing that the word designated as the exit code in its memory has the appropriate value. If Bob is lying, however, Alice will dispute his final hash and win the ensuing verification game.
An interesting problem happens when the players prematurely stop collaborating with each other. Perhaps it is Bob’s turn and, realizing his position is hopeless, he refuses to send his move to Alice. On its own, the blockchain has no way of knowing if this is really happening, or if Alice is the one ignoring Bob’s attempts at contact through the state channel. Fortunately, there are ways of solving this data availability problem. A simple solution is as follows. Frustrated with Bob’s behavior, Alice informs the blockchain of his refusal to cooperate. To do so, she submits Bob’s last sequence hash, and his signature. The blockchain offers Bob the chance of submitting an even more recent sequence hash signed by Alice. Both players are now required to submit future moves through the blockchain.
From now on, all the blockchain has to do is control whose turn it is, accept a new move from the correct player, and use hash operations to keep the sequence hash up to date. If one of the players stops posting moves, the other can claim a timeout and win the bet. If one of the players claims victory, the blockchain has all the information needed to build the end-of-game machine as before: problem solved.
Advantages of this Design
The advantages of this design are that all complexity, processing, and data storage associated with the game happens outside the blockchain. Part of it runs in the Cartesi Node, and part in the Cartesi Machine. Both systems run Linux, where developers are most productive. In the typical scenario where players cooperate, the blockchain simply accepts the bets when a game starts and transfers them to the winner when it ends. When players stop cooperating, the blockchain still has very little work to do: It keeps only the sequence hash up to date. It never stores the sequence of moves, never needs access to the state of the game, or even cares about its rules. Even when a dispute arises, it is automatically settled by the Cartesi Core.
Consider the productivity gains a design like this brings to blockchain development and the increased range of applications this added scalability enables. We can only begin to imagine.
Within the past 20 months, most of the Cartesi Core, as described in our white paper, has been implemented. In future articles, we will showcase its power and flexibility by building example DApps. Needless to say, these DApps will involve verifiable computations over large amounts of data, while leveraging the vast software infrastructure available to developers on Linux.
Cartesi’s Core is only the first step towards our vision. Cartesi’s mission is to close the gap between centralized and decentralized applications, both in terms of possibility and convenience.
We will continue to work towards eliminating all limitations from DApps powered by Cartesi. We will offer seamless state channels and a verifiable off-chain data-exchange protocol. We will create a new programming environment that integrates the development of all the components of a Cartesi DApp. Ultimately, we will make DApps easily portable across different blockchains.
Our hope is that, by empowering DApp developers, we will help them build products that are ever more compelling to users, and bring us all closer to a future in which decentralization becomes less burdensome.