Updates on Ethereum’s Moon Project

Formal proofs, parallel FVM and ideas for mass adoption

Hello /r/ethereum. You may (probably not) remember me from my short talk at Devcon3. There, I presented a working prototype of Moon-Browser. My goal was to build a place where people could not only use Ethereum DApps, but also:

  1. Inspect and modify their codes in a real-time editor;
  2. Verify formal proofs about back-end logic (smart-contracts);
  3. Fork entire apps (including the back-end!) in a single click.

The last one in particular being, I believe, the main benefit decentralization will bring to society. Imagine being able to fork YouTube in a single click. Not just the front-end, not just a video, but the entire company, operating and working, in a single click. That’s, in my opinion, what Ethereum has to offer, and, thus, what I wanted Moon to become. Since then, I’ve had a lot of exciting progress, which I’ll share on this long post. If you’re in a hurry, feel free to just read the summary below!

Update summary

  1. We’re developing a new smart-contract language, Formality, which features formal proofs. It is based on Aaron Stump’s adaptation of his minimalistic type theory to Ethereum, which will be openly specified on the Cedilleum repository, and is designed from ground up to be efficient in a decentralized VM setting.
  2. The FVM, a new virtual machine capable of evaluating Formality programs natively on the GPU. It is based on a new, massively parallel model of computation, the Abstract Calculus, which mixes the best of Turing Machines and the Lambda Calculus: high-order functions and algebraic datatypes, no garbage-collection, beta-optimality and a simple cost-model with only 2 compute opcodes.
  3. A native implementation of the Moon-Browser in Rust, running front-end DApps directly on the GPU (through the FVM!), which allows us to have blazingly-fast native 2D and 3D applications.

There are many parts of this developed and the current ETA for all to come together is late 2018, although we’re a small team and, thus, will make no promises. Now, on to the full story! But, before, here’s a pretty fan-art of the Zelda: Majora’s Mask’s Moon, as otherwise Medium would show my face on thumbnails, and hopefully nobody wants that.

Twilight Termina, by Spire-III

Update #1: Formality

So far, the main challenge (and most of this year’s effort) went to address the formal proof necessity. That term is a little bit mystified, so, let me explain in simple terms what it is, and why I thoroughly believe it to be a pre-requisite for what we want to achieve.

What are formal proofs?

(Please skip this section if you already know.)

Formal proofs can be described as non-probabilistic unit-tests. That is, when we’re interested in checking if a property holds, the software industry standard practice is to write tests, which check it for a huge set of concrete cases. For example, if we just implemented a subtract(x,y) function and wanted to be sure it is correct, we could write a bunch of tests:

assert(subtract(7, 3) + 3 == 7)
assert(subtract(2, 4) + 4 == 2)
assert(subtract(9, 9) + 9 == 9)
assert(subtract(6, 5) + 5 == 1)
assert(subtract(1, 9) + 9 == 9)

The more tests pass, the more confident we get that it is correct indeed. The problem with tests is that they’re probabilistic at best. No matter how many tests you write, you can never be 100% sure your code is correct. For example, this definition:

function subtract(a, b) {
for (var i = 0; i < b; ++i) {
a -= 1;
}
return a;
}

Passes all tests above, but is not incorrect (can you see why?). If we, instead, used a language featuring formal proofs, we could’ve written this:

subtract_is_correct
: ∀ n m . subtract(n,m) + m = n
= ... proof goes here ...

Here, the first line is the name of a property we want to prove, the second line is the property itself, and the third line is its proof (here omitted). This property says that, for every pair of numbers, n — m + m == n. This is exactly what we tested with specific cases, except, now, we have a symbolic proof that this property is true for all cases. As you can see, there’s nothing mystic; proofs are just tests on steroids. If the “formal proof check” passes, that’s as if an infinitely long test-suite, covering all cases, passed.

Why formal proofs are necessary?

(If you agree formal proofs are required for DApps, please skip this section.)

1. The DAO incident

Back in 2016, the Ethereum community gathered together to develop “The DAO”, a decentralized company (that is, a smart-contract) that received “donations” from the community and reinvested that money in projects aiming the growth of the ecosystem. All seemed amazing, until, suddenly, 3.6m Ether (today equivalent to about 1 billion USD) was stolen from the contract’s funds. That case almost killed Ethereum and is explained here.

The problem was a very complex bug that, under highly unexpected circumstances, allowed an user to withdrawal Ether without having the internal balance of the contract updated. Keep in mind that The DAO had thousands of unit tests and was audited by several professionals, and pretty much the entire community took a look at its code, including me and Vitalik, yet, this bug went unnoticed to the eyes of everyone, allowing for perhaps the largest digital theft in history to happen.

2. How formal proofs would’ve avoided it

What if Moon existed back in 2016? Then, DAO users would be able to access the “proof-panel”, where they would see, among other things, this definition:

the_daos_balance_invariant_holds
: ∀st. sum st.user_balances + st.total_invested == st.balance
= <Checks!_Click_to_expand_proof.>

This can be read as “the sum of the balances of every user, plus the amount of Ether invested, is equal to the balance of the contract”, which is exactly the property that was violated by The DAO theft! Your browser could be able to verify the validity of your claim offline, convincing your that the DAO’s fund would never, ever be stolen. If you don’t see why that is so important, let me explain. Before Moon, DApp users needed to either:

  1. Trust a bunch of unit tests;
  2. Trust the developers of a DApp;
  3. Trust the auditors;
  4. Read and analyze thousands of lines of code himself.

Now, if blockchains are all about the decentralization of trust, how come we’re so tolerant to trusting developers? That shouldn’t be that way, and that’s the very problem formal proofs address. With them, users can audit entire code bases by performing offline checks of small specifications. In other words, formal proofs reduce the human work required to audit a program of size N to O(1), allowing users to “audit it themselves” without needing to analyze every possible state of a huge software. This decentralizes the trust involved in verifying code correctness, which is necessary given the very premise of “decentralized” applications. If we want companies such as exchanges or lending platforms to operate by themselves, with no human intervention and no justice system to appeal to in case they do wrong, then those companies must be able to convince us they’re honest, and formal proofs is their way to do that.

What we have done?

Hopefully we now agree that formal proofs are a need for truly decentralized applications. Sadly, using them for smart contracts isn’t easy. The reason is stupid: languages that feature formal proofs are just too slow. For example, on this paper, authors have attempted to write Ethereum contracts in Idris, a functional language with theorem proving, and compile them to the EVM. Sadly, all the intermediate compilation steps caused the code to become way too bloated and heavy. Since, on the EVM, every operation costs real money, the whole idea was not too practical.

One might wonder if we could build a smarter compiler for Idris-EVM. The answer, I fear, is probably not. While Idris does a great job at erasing proofs from the compilation process, it (and most popular proof assistants such as Agda, Idris, Coq and so on) still rely heavily on high-order functions, which require dedicated runtimes, garbage collection and a bunch of other stuff that, simply, bloat the execution. Those things are acceptable for everyday apps, but, for smart contracts, where even the tiniest operation has a cost, that can be the difference between 1 cent and 1 dollar transactions. In practice, people would just abandon formally verified contracts for cheaper assembly-optimized ones.

As an alternative, there are some non-functional languages with formal proofs. Those are very fast, but I admit the syntax is scary to me and I’m not sure we could turn it into something as clean and novice-friendly as, for example, Elm.

Another alternative would be to, instead, develop smart-contracts as a DSL (which could be even the EVM itself) inside a host language that features formal proofs. That is, the DSL would be an “EVM-friendly” language, and you would prove things about programs in that DSL, not the host language. That is a good direction, and there are even attempts at formalizing the EVM, but, in my opinion, it is not satisfactory. For one, it introduces yet another language to the mix, making the browser much harder to learn. Proofs also become more complex, because instead of proving facts about native programs, you must prove facts about the execution of an interpreted language. Surely, that’s how people often do it today, but, then, most of those working with formal proofs are highly skilled specialists. This strategy doesn’t scale as I doubt the average developer would do it as part of a daily routine.

As such, I decided we would need to develop our own language, one that was built from scratch with smart contracts in mind, machine-friendly yet high-level enough to be able to be given a novice-friendly syntax. Unfortunately, as you may expect, creating a language with formal proofs is a dreading task that requires a lot of theoretical background and two PhDs. That’s why I reached Aaron Stump, a researcher for the University of Iowa who developed a very simple type theory; so simple that I could attempt at implementing it myself in an evening! Of course, that repository is missing some important details, but was meaningful. After exchanging some ideas with him, he agreed to develop a blockchain-friendly version of that language: Cedilleum. Now we have one of his PhD students developing that specification, and my team-mate, Leonardo Souza, working in a Rust implementation of it, which we’ll call Formality. The main reason Aaron’s work is so interesting for this project is that the simplicity of his type theory means anyone could implement it; i.e., an average dev could make his own proof checker. That removes the last layer of software-correctness trust, i.e., the trust on Moon-Browser itself! After all, if you don’t trust DApp developers, why would you trust me?

Update #2: Abstract Calculus and the parallel FVM

So, what exactly is my plan to make Formality lightweight enough to be compatible with smart-contracts? Let’s first remember the main problem that causes functional languages to be slow is high-order functions. Those require garbage collection, closures and other things that add a lot of weight to the execution. Not only that, the very task of applying a function is, in most languages, executed in a sub-optimal way (in the same sense that bubble-sort is a sub-optimal way to sort lists). That includes even functional languages like Haskell, with it’s STG runtime. So, what now?

1. A perfect model of computation for functional smart contracts

Fortunately, I’ve been researching the implementation of functional languages for a while, which allowed me to come up with the Abstract Calculus. It is a textual language based on a new computational model, interaction nets, which combine the best aspects of the Lambda Calculus and Turing Machines. The details are quite complex, but let me try to summarize.

Like the Lambda Calculus, it allows us to express high-order functions, maps, folds and all those things that are used by any sane language that features formal proofs. Like Turing Machines (and unlike the Lambda Calculus), it has a very clear cost model, being evaluated in a series of constant-time operations that translate well to hardware code, requires no closures, garbage collection and is is optimal. In other words, the Abstract Calculus is a language with many of the features we enjoy from functional languages, but with performance characteristics of imperative languages!

All we need to do is, thus, compile Formality to the Abstract Calculus, which can then be compiled efficiently to eWASM. That’s possible by giving Formality a slightly unusual type-checking procedure, which has some structural restrictions that languages such as Idris don’t have. Hopefully, those restrictions will be masked away in the user-facing syntax of Formality.

For more details on the Abstract Calculus, check its repository.

2. The FVM: a massively parallel evaluator for it

We won’t stop there, though. The Abstract Calculus has something new to offer that neither Turing Machines nor the Lambda Calculus have: parallelism. Except when there are direct data dependencies, interaction nets are able to execute every single part of our program at the same time; i.e., every a+b, every arr[i], every iteration of a non-sequential loop will run simultaneously. Interaction nets have been shown to present a near-ideal speedup when evaluated in multiple threads. This lead me to decide on creating the FVM, a new VM, like eWASM or the EVM, but on which the cost of running Formality programs will be much lower than it’d be via eWASM compilation, due to exploiting the parallelism possibilities.

For a quick test, I’ve implemented the Abstract Calculus in OpenCL (code available here) and managed to get 150m rewrites/s in a high-end GPU. This equates to about 75m function applications, or about 50m pattern-matches per second, which blows even Haskell’s GHC out of the water; and I consider myself a GPU noob, so, needless to say, I have high expectations for how much miners will be able to speed this thing up once it is in the wild.

Update #3: native Moon-Browser in Rust

Our current plan is to release a native version of the browser in Rust, which is why everything else is currently being written in it. The reason for that is Rust is, currently, clearly the fastest language with high-level features such as algebraic datatypes and a Haskell-like type-system. In fact, it is so fast that, according to the Compute Benchmark Game, it competes with (and sometimes beats) C, and beats Java and GO in every single benchmark. Moreover, Rust has a bunch of GPL-licensed Ethereum libraries written in it due to Parity, making the choice even easier.

The browser, as explained on my talk, has a decentralized URL bar, allowing you to load DApps from IPFS and Swarm. It is natively connected to Ethereum, allowing you to either set your own provider node, or to open a native Geth/Parity node. The editor, differently from what was presented, will display Formality code, not Moon-Lang. For smart contracts, that code will include a “formal proofs” section, where properties about the back-end logic are proven and verified offline in a trustless fashion. For front-end interfaces, that code will be compiled to OpenGL, allowing it to run native applications.

Conclusion

That’s what I have to say for now. The contents of this post will be presented on my talk at Devcon4. If you want to keep updated about this development, you can either follow me on Twitter, or send an email to victor.maia-at-ethereum-dot-org. Don’t forget to like and subscribe and thanks for watching!