Isekai LLVM: final update

Viktor Krapivenskiy
Sikoba Network
Sep 3, 2019

This is the third and last article in our series. Previous articles: update #1, update #2.

Authors: Ivan Glushenkov, Viktor Krapivenskiy

During the last three weeks, we have added some additional features to the LLVM isekai front-end, and fixed one curious corner-case issue in our decompilation algorithm. In this article, we discuss this problem and the solution we have found, and show some examples that isekai is now able to compile as a result of our work.

Short-circuit operators are control flow operators

You may know that C has the && and || operators. They have short-circuit evaluation semantics, meaning that the right operand is evaluated only if the left operand does not determine the result. For &&, the right operand is not evaluated if the left is false; for ||, it is not evaluated if the left is true.

What this means is that control flow can be performed using these operators; in fact, it is a common idiom in Perl to do something like

perform_action or die "Error: cannot perform action";

…where the die part is only executed if perform_action returns a false-y value.

The surprising thing is that even a small program combining these operators in a single condition, when compiled with clang -O0, generates a rather intricate control flow graph.

It does not fit into the patterns we described in the previous article. The article describing the Stackifier algorithm says that, when we cannot reduce the control flow graph into loops and conditionals, we either have to do code duplication or introduce an additional variable.

The Stackifier algorithm itself chooses the latter. Since isekai already performs code duplication when it unrolls loops, we chose code duplication instead: we came up with an algorithm that checks whether the control flow graph actually fits the patterns we described previously, and duplicates code if it does not.

The algorithm is the following:

1. Create a global multiset of basic blocks.

2. When traversing a block, check whether it is present in the multiset; if so, throw an exception of the InvalidGraph type.

3. If a block ends with a conditional branch and we think it is a conditional statement (not a loop), then:

3a. Insert its “sink” into the multiset.

3b. Traverse all the branches.

3c. If the traversal completes successfully (no exception was thrown), delete the current sink from the multiset.

3d. If an InvalidGraph exception was thrown during the traversal, try a different sink: delete the sink from the multiset, replace it with the least common ancestor (see the previous article) of all its successors, and traverse all the branches again. If the old sink had no successors, or if the new sink is the same as the old one, throw an InvalidGraph exception.

Of course, we do all this as a separate pre-processing step.

This algorithm does not worsen the complexity of the whole “decompilation” algorithm when the graph is reducible, and the worst-case complexity is proportional to the amount of code that would have to be generated.

As these test cases show, the algorithm works correctly: isekai generates a circuit that evaluates to the correct answer on all possible input values for 5 input variables.

Examples

List of features we’ve added:
1. Support for arbitrary pointers, including dynamic array indices.
2. Basic detection of undefined behavior: when compiling this example with the --progress option (which turns on verbose output), isekai produces the following warning: INFO -- : possible undefined behavior: load from uninitialized pointer.
3. Support for arrays and complex structures, both on the stack and in the input/output structures: basic example, CRC-32 calculation.
4. Basic support for C++: basic example, SHA-256 calculation.

We have written a lot of different tests for our front-end, which mostly focus on language features. They can be found here.

We have also written about 300 back-end tests.

We set up Travis CI and you can see here that all the tests pass.

Performance

We measured how isekai’s performance depends on the size of the input using a real-world example: CRC-32 calculation. As you can see, it is able to generate an 825 MB arithmetic circuit in 26 seconds.

Circuit generation time and output size plots for CRC-32 example

This is pretty fast, as on the same machine,

dd if=/dev/zero of=null.bin bs=8K count=104K

runs in 8.4 seconds (a block size of 8K is used to emulate Crystal’s File behavior, which is a wrapper over the stdio FILE*, which in glibc has an 8K buffer by default; 104K × 8K ≈ 825M).

Fun fact: bug found in the Crystal compiler

It was crazy to realize that we had crashed the compiler! Big thanks to the Crystal language developers for the quick response and fix.

Summary

We have created an LLVM front-end for isekai and have written a comprehensive test suite for it. On behalf of the Serotonin Syndrome team I thank Sikoba Research and everybody involved for the interesting tasks.
The whole implementation can be found here.

Future

We think there are many potential directions for future work.

A new programming language

The C standard was written with hardware in mind, and very, very old hardware at that. One artifact of this is that C allows an implementation to use any one of the following three representations for signed numbers: sign-and-magnitude, one’s complement, or two’s complement. Modern hardware supports nothing but two’s complement, but at the time the first C standard was written, there were machines that used the other two.

The problem with this, however, is that none of these representations is compatible with modulo-p arithmetic for some large prime p. Currently, we use the two’s complement representation, but this comes at a cost in terms of the number of field operations, especially when performing comparisons.

Moreover, C has no integer type for which unsigned overflow would be undefined: unsigned arithmetic is required to wrap modulo a power of two, and we only have undefined behavior on signed overflow, which is not especially useful in our case.

We, therefore, suggest that a new domain-specific programming language with semantics friendly to modulo-p arithmetic be developed.

This is quite a challenging task, but, done right, it may facilitate mass adoption of the verifiable computation technology — think the Solidity programming language and the smart contract technology.

Back-end optimization

Although quite smart, the back-end misses some optimization opportunities, mainly related to caching of split vs. joined results, and to bitwise operations (e.g. if we need to xor many 1-bit numbers, computing the result as the least significant bit of their sum is more efficient than what is done now).

More optimizations in the back-end means faster execution, proof generation and verification.

WebAssembly front-end

WebAssembly is an open standard for portable binary code. Just like C, it was designed with hardware in mind, but, unlike C, with modern hardware: it requires two’s complement integers, IEEE 754-2008 floating point, little-endian byte order, lock-free atomic memory operations, and so on.

By building a WebAssembly front-end, we would enable isekai to support more languages. However, because it requires two’s complement integers, WebAssembly does not seem to be a suitable base on which to build a new programming language for verifiable computations.

Support for more LLVM features

By adding support for non-inline functions and the bitcast instruction, we would extend the supported subset of C and C++ to quite a large one.

The next possible step is to add support for more languages, for example, Go, Rust and Solidity.

