Writing Smart Contracts in IULIA

People keep asking about the status of our new low-level language IULIA and are often surprised when I tell them that it is already used in the Solidity compiler. Perhaps you have even used IULIA in the past without noticing. The reason is that IULIA shares a lot of code and structure with what was previously called “inline assembly”. For the compiler, the distinction between the two is often not even noticeable.

Since people sometimes like writing smart contracts very close to the metal (or rather close to the ether), I would like to explain how you can use IULIA to write smart contracts directly. Or perhaps almost directly. One component is missing for writing smart contracts in IULIA and that is the ability to reference code from within code. This is needed for the deployment (“constructor”) part of a smart contract. We already have a specification about that, but it is not yet implemented in the Solidity compiler.

Because of that, I will use the inline assembly tools for Solidity so that the compiler will create the necessary wrappers for us. Using IULIA inside of Solidity instead of stand-alone also has the benefit that we can use existing Solidity tools like the remix IDE and debugger.

Since IULIA is designed to compile to different backends including the EVM but also Ethereum-flavoured WebAssembly, it comes with a certain set of built-in functions which are not exactly the EVM opcodes. While these built-in functions are not yet implemented, there is a flavour of IULIA which uses EVM opcodes instead of the built-in functions and is currently used as inline assembly for Solidity and also inside the compiler. Translating these opcodes to fully-fledged IULIA built-ins will be trivial, though.

So what we are doing here is using the syntactic elements of IULIA, removing types and replacing the built-in functions with the EVM opcodes (in functional notation).

As an example, we will implement a simple ERC20 token contract. Let us start with the surrounding Solidity code:

interface ERC20 {
    function totalSupply() constant returns (uint totalSupply);
    function balanceOf(address _owner) constant
        returns (uint balance);
    function mint(address _addr, uint _value);
    function transfer(address _to, uint _value);
}
contract MyToken {
    function MyToken() {
        assembly {
            // init code comes here
        }
    }
    function() payable {
        assembly {
            // runtime code comes here
        }
    }
}

This snippet defines the ERC20 token interface (we will use that later to interface with the contract) and a skeleton for our own token. This is the only code we will write in Solidity.

If you want to try it out, paste it into https://remix.ethereum.org, switch to the “run” tab, select MyToken and click “create”. This creates a contract without any interface. We will later implement the ERC20 interface as specified above, although that will not be visible to the compiler. Because of that, we have to nudge remix a little: Click the “clipboard” symbol next to the MyToken contract just created to copy its address into the clipboard. Then paste the address into the input field next to “At address”. Next, select the “ERC20” interface in the drop-down element just above it and click “At address”. Now you should have a new ERC20 instance with buttons for all the functions that are part of the interface. If you click one of these buttons now, the relevant function on MyToken will be called (just that it is not implemented yet).

OK, now on to implementing those functions. First, we have to come up with how we want to use storage.

  • Slot 0: owner / creator
  • Slot 1: total supply (we have to keep this up to date whenever tokens are minted)
  • Slot 0x1000 + address: account balances
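The claim that this layout has no overlap can be checked with a little arithmetic. The following Python sketch is only a model and not part of the contract; the constant and function names are made up for illustration. It relies on the fact that an address is at most 160 bits wide, so the balance slots start at 0x1000 and end far below 2**256, where slot numbers would wrap around.

```python
# Model of the storage layout; all names here are illustrative assumptions.
OWNER_SLOT = 0
TOTAL_SUPPLY_SLOT = 1
BALANCE_BASE = 0x1000
MAX_ADDRESS = 2**160 - 1   # addresses are 160-bit values
WORD = 2**256              # storage keys are 256-bit words


def balance_slot(account: int) -> int:
    """Slot holding the balance of `account` (0x1000 + address)."""
    assert 0 <= account <= MAX_ADDRESS
    return (BALANCE_BASE + account) % WORD


lowest = balance_slot(0)
highest = balance_slot(MAX_ADDRESS)

# Without wraparound the balance slots are exactly the range [lowest, highest],
# and distinct addresses map to distinct slots.
no_wrap = BALANCE_BASE + MAX_ADDRESS < WORD
no_overlap = no_wrap and lowest > TOTAL_SUPPLY_SLOT > OWNER_SLOT

print(hex(lowest), no_overlap)
```

Since the lowest balance slot lies above slot 1 and nothing wraps around, the owner and total supply slots can never be hit by a balance update.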

After having verified that there will be no overlap in storage, we can start with the creation code:

contract MyToken {
    function MyToken() {
        assembly {
            // Store the creator at storage slot zero
            sstore(0, caller())
        }
    }
    // ...
}

The IULIA snippet will result in the following stream of opcodes:

CALLER PUSH1 0 SSTORE

This stores the caller/sender/creator in the first storage slot. The functional notation is probably much easier to read, but essentially, all components in such an expression are turned into EVM opcodes by just reading from right to left, ignoring any parenthesis structure. The main thing the compiler does for you is checking that the number of arguments (and later their types) matches the parameters.
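The right-to-left rule can be illustrated with a few lines of Python. This is a toy model under stated assumptions, not the code generator of the Solidity compiler: expressions are written as nested tuples, `to_opcodes` is a made-up helper name, and literals are simply pushed with the smallest PUSH width that fits.

```python
def to_opcodes(expr):
    """Flatten a nested call such as ("sstore", 0, ("caller",)) into opcodes."""
    if isinstance(expr, int):
        # A literal becomes a PUSH of the smallest width that holds it.
        width = max(1, (expr.bit_length() + 7) // 8)
        return [f"PUSH{width} {expr}"]
    name, *args = expr
    ops = []
    # Arguments are emitted from right to left, so that the first
    # argument ends up on top of the stack when the opcode runs.
    for arg in reversed(args):
        ops.extend(to_opcodes(arg))
    ops.append(name.upper())
    return ops


# sstore(0, caller())
stream = " ".join(to_opcodes(("sstore", 0, ("caller",))))
print(stream)
```

For the creation code above, the model produces the same three opcodes in the same order as the stream shown earlier.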

The next thing we will do is implement protection against accidental Ether transfers to our token:

contract MyToken {
    function MyToken() {
        // ...
    }
    function() payable {
        assembly {
            // Protection against sending Ether
            if gt(callvalue(), 0) { revert(0, 0) }
            // ...
        }
    }
}

If-statements take a single expression as their condition. If it evaluates to non-zero (once types are introduced, it will be checked against true/false), the body is executed; otherwise it is skipped. This snippet reverts the call if it contains a nonzero Ether transfer.

After that, we have to read the calldata and decide which function is to be called. This is called a function dispatcher. IULIA has a convenient switch statement for that purpose (I will omit the Solidity code surrounding the IULIA code from now on):

// Protection against sending Ether
if gt(callvalue(), 0) { revert(0, 0) }
// Dispatcher
switch selector()
case 0x70a08231 /* "balanceOf(address)" */ {
    returnUint(balanceOf(decodeAsAddress(0)))
}
case 0x18160ddd /* "totalSupply()" */ {
    returnUint(totalSupply())
}
case 0xa9059cbb /* "transfer(address,uint256)" */ {
    transfer(decodeAsAddress(0), decodeAsUint(1))
}
case 0x40c10f19 /* "mint(address,uint256)" */ {
    mint(decodeAsAddress(0), decodeAsUint(1))
}
default {
    revert(0, 0)
}

So far, so boring. But actually not: you might have noticed that this code uses identifiers which are not EVM opcodes, yet their syntax is exactly the same. This is also why we can talk about IULIA and Solidity inline assembly more or less interchangeably: the core language allows the user to define functions (more on that below), but it does not contain a single built-in function, not even the EVM opcodes. On top of that, there is the actual IULIA flavour, which defines some built-in functions, and the Solidity inline assembly, which uses EVM opcodes as built-in functions. For the language, it does not matter where and how a function is defined; only its semantics matter. Because of that, the same code can compile to different backends and share the same optimizer routines and the same simulator; only the way the built-in functions are ultimately defined on the target machine differs.

The dispatcher also uses the functions decodeAsAddress, decodeAsUint and returnUint. These handle how calldata is decoded and how return data is encoded, and we will take a look at them next. By the way, it does not matter whether you first define a function and use it later or define it after it is used; it only needs to be in scope (i.e. not inside a pair of curly braces in a deeper nesting layer).

function selector() -> s {
    s := div(calldataload(0),
        0x100000000000000000000000000000000000000000000000000000000)
}
function decodeAsAddress(offset) -> v {
    v := decodeAsUint(offset)
    if iszero(iszero(and(v,
        not(0xffffffffffffffffffffffffffffffffffffffff)))) {
        revert(0, 0)
    }
}
function decodeAsUint(offset) -> v {
    v := calldataload(add(4, mul(offset, 0x20)))
}

This snippet shows the first real difference between LLL and IULIA: it is possible to define functions and (stack) variables. LLL only knows macros, although these macros can be used to define memory variables and things that look similar to functions. Another difference between LLL and IULIA is where the parentheses are placed, but that is just a matter of taste.

The function selector() returns the first four bytes of the calldata, properly right-aligned to ease comparisons. The other two functions decode values of type address and uint256, respectively, where the value range is checked for the former. In both cases, the argument is the index of the parameter in the ABI, starting with zero. In both decoding functions, the return variable is called v and is assigned in the first line.
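To make the decoding arithmetic concrete, here is a small Python model of the three helpers. It is a sketch under assumptions rather than IULIA code: calldata is passed around as a `bytes` value, `Revert` is a made-up exception standing in for revert(0, 0), and the example address is arbitrary.

```python
class Revert(Exception):
    """Stands in for revert(0, 0)."""


ADDRESS_MASK = 2**160 - 1
WORD_MASK = 2**256 - 1


def calldataload(calldata: bytes, offset: int) -> int:
    # Reads 32 bytes starting at offset; bytes past the end read as zero.
    chunk = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(chunk, "big")


def selector(calldata: bytes) -> int:
    # Dividing by 2**224 drops the lower 28 bytes of the first word,
    # leaving the first four bytes right-aligned.
    return calldataload(calldata, 0) // 2**224


def decode_as_uint(calldata: bytes, index: int) -> int:
    # Parameter `index` starts 4 bytes (selector) + index * 32 bytes in.
    return calldataload(calldata, 4 + index * 0x20)


def decode_as_address(calldata: bytes, index: int) -> int:
    v = decode_as_uint(calldata, index)
    # Any bit set above the lowest 160 bits means v is not a valid address.
    if v & (WORD_MASK ^ ADDRESS_MASK) != 0:
        raise Revert()
    return v


# Calldata for balanceOf(address) with an arbitrary example address.
account = 0xDEADBEEF
calldata = bytes.fromhex("70a08231") + account.to_bytes(32, "big")

print(hex(selector(calldata)), hex(decode_as_address(calldata, 0)))
```

Reading a parameter that is missing from the calldata yields zero in this model, because bytes past the end of the calldata read as zero.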

Function definitions and calls are an abstract thing in IULIA and they will be mapped to proper functions for WebAssembly. Since the EVM does not have functions, they have to be emulated with jumps, where the return PC and the arguments and return values are placed on the stack. But don’t worry, the compiler will do that behind the scenes and the translation is actually very simple.

Note that (internal) function calls in the EVM are rather cheap. The only reason to avoid them is that a call with specific arguments may allow further optimization for exactly those arguments. If you take a look at how decodeAsAddress is used in the balanceOf part of the dispatcher above, you notice an opportunity for optimisation:

case 0x70a08231 /* "balanceOf(address)" */ {
    returnUint(balanceOf(decodeAsAddress(0)))
}

Since it is called with 0 as argument, which is forwarded to decodeAsUint, which in turn performs

v := calldataload(add(4, mul(offset, 0x20)))

we see that a mere

calldataload(4)

would do the whole trick.

For exactly this reason, we are currently working on an optimizing compiler for IULIA that can reduce such cases to much better-performing code. Every single stage of the optimizer is easy to understand and can also emit intermediate IULIA code in a readable text representation. This makes it possible to verify the optimized code even after the compiler has done its job.
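The reduction from add(4, mul(0, 0x20)) to 4 is an instance of constant folding. The following Python sketch shows the idea on nested tuples; it is a toy written for this explanation and makes no claim about how the actual IULIA optimizer is structured. It assumes the call has already been inlined, so that offset has been replaced by the literal 0.

```python
WORD = 2**256  # EVM arithmetic wraps around at 2**256

# Pure opcodes this toy knows how to evaluate at compile time.
FOLDABLE = {
    "add": lambda a, b: (a + b) % WORD,
    "mul": lambda a, b: (a * b) % WORD,
}


def fold(expr):
    """Replace calls whose arguments are all literals by their value."""
    if isinstance(expr, int):
        return expr
    name, *args = expr
    args = [fold(arg) for arg in args]
    if name in FOLDABLE and all(isinstance(arg, int) for arg in args):
        return FOLDABLE[name](*args)
    # calldataload depends on the call's input, so it is left untouched.
    return (name, *args)


before = ("calldataload", ("add", 4, ("mul", 0, 0x20)))
after = fold(before)
print(after)
```

The same folding step applied to parameter index 1 yields the offset 36, which is the position one would otherwise have to compute by hand.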

The other benefit of such an optimizing compiler is that you can write modular code. We could have just used

returnUint(balanceOf(calldataload(4)))

but this would have removed the value range check and would also be much more error-prone and harder to maintain. What happens if we want to add another parameter, for example? We would have to re-calculate the proper position in the calldata ourselves all the time.

OK, let us continue with the other helper functions. We still need the value encoder:

function returnUint(v) {
    mstore(0, v)
    return(0, 0x20)
}

This is where we first use this magical thing called “memory”: It is a semi-volatile, byte-addressed storage. The IULIA compiler will not use it for internals, but knows how to handle it, meaning that concurrent reads and writes will work correctly even after the optimizer changed the code.
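What the caller receives from returnUint can be modelled in a few lines of Python. This is only a sketch: memory is represented by a `bytearray`, and the function name `return_uint` is chosen for this illustration.

```python
def return_uint(v: int) -> bytes:
    memory = bytearray(32)                            # fresh memory reads as zero
    memory[0:32] = (v % 2**256).to_bytes(32, "big")   # mstore(0, v)
    return bytes(memory[0:0x20])                      # return(0, 0x20)


data = return_uint(1000)
print(data.hex())
```

The return data is a single 32-byte big-endian word, which is how a uint256 return value is laid out in the ABI.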

Now let us add some functions to handle the internal logic of the token:

function mint(account, amount) {
    if iszero(calledByOwner()) { revert(0, 0) }
    mintTokens(amount)
    addToBalance(account, amount)
}
function transfer(to, amount) {
    deductFromBalance(caller(), amount)
    addToBalance(to, amount)
}

These two functions basically only call helper functions. The mint function reverts the call if it is not called by the owner, and the transfer function removes an amount from one account and adds it to another account.

function owner() -> o {
    o := sload(0)
}
function totalSupply() -> supply {
    supply := sload(1)
}

Remember that we said that we store the owner in the first slot and the total supply in the second? This is where this convention is used. These two functions are of course also candidates for inlining.

function mintTokens(amount) {
    sstore(1, safeAdd(totalSupply(), amount))
}
function accountToStorageOffset(account) -> offset {
    offset := add(0x1000, account)
}
function balanceOf(account) -> bal {
    bal := sload(accountToStorageOffset(account))
}
function addToBalance(account, amount) {
    let offset := accountToStorageOffset(account)
    sstore(offset, safeAdd(sload(offset), amount))
}
function deductFromBalance(account, amount) {
    let offset := accountToStorageOffset(account)
    let bal := sload(offset)
    if lt(bal, amount) { revert(0, 0) }
    sstore(offset, sub(bal, amount))
}

The only two functions that are missing now are safeAdd and calledByOwner which are defined as follows:

function safeAdd(a, b) -> r {
    r := add(a, b)
    if or(lt(r, a), lt(r, b)) { revert(0, 0) }
}
function calledByOwner() -> cbo {
    cbo := eq(owner(), caller())
}

The safeAdd function reverts in case of an overflow as a side-effect of its evaluation.
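To see how the helper functions fit together, the token logic can be modelled in Python. This is a sketch under assumptions, not a replacement for testing the contract itself: storage is a plain dictionary, `Revert` is a made-up exception standing in for revert(0, 0), the caller is passed explicitly, and a revert is modelled by restoring a snapshot of storage.

```python
WORD = 2**256


class Revert(Exception):
    """Stands in for revert(0, 0): all state changes of the call are undone."""


def safe_add(a: int, b: int) -> int:
    r = (a + b) % WORD            # add() wraps around at 2**256
    if r < a or r < b:            # a wrapped result is smaller than its inputs
        raise Revert()
    return r


class TokenModel:
    def __init__(self, creator: int):
        # Slot 0: owner, slot 1: total supply, slot 0x1000 + address: balances.
        self.storage = {0: creator, 1: 0}

    def _slot(self, account: int) -> int:
        return (0x1000 + account) % WORD

    def balance_of(self, account: int) -> int:
        return self.storage.get(self._slot(account), 0)

    def total_supply(self) -> int:
        return self.storage[1]

    def _add_to_balance(self, account: int, amount: int) -> None:
        slot = self._slot(account)
        self.storage[slot] = safe_add(self.storage.get(slot, 0), amount)

    def _deduct_from_balance(self, account: int, amount: int) -> None:
        slot = self._slot(account)
        bal = self.storage.get(slot, 0)
        if bal < amount:
            raise Revert()
        self.storage[slot] = bal - amount

    def _run(self, action) -> None:
        snapshot = dict(self.storage)
        try:
            action()
        except Revert:
            self.storage = snapshot   # a revert undoes every write of the call
            raise

    def mint(self, caller: int, account: int, amount: int) -> None:
        def action():
            if caller != self.storage[0]:
                raise Revert()
            self.storage[1] = safe_add(self.storage[1], amount)
            self._add_to_balance(account, amount)
        self._run(action)

    def transfer(self, caller: int, to: int, amount: int) -> None:
        def action():
            self._deduct_from_balance(caller, amount)
            self._add_to_balance(to, amount)
        self._run(action)


# Arbitrary example accounts.
OWNER, ALICE, BOB = 0xA, 0xB, 0xC
token = TokenModel(creator=OWNER)
token.mint(OWNER, ALICE, 100)
token.transfer(ALICE, BOB, 30)
print(token.balance_of(ALICE), token.balance_of(BOB), token.total_supply())
```

The model makes it easy to probe the edge cases: minting by a non-owner, transferring more than the balance, and an overflowing mint all raise Revert and leave the storage untouched.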

The full contract can be found below; please try it out in remix using the method explained at the beginning.

I hope you enjoyed this tour of IULIA and I also hope that you find it simple and flexible enough at the same time, perhaps not for implementing full smart contracts but at least for nice and readable helper functions. I am open to any suggestions, comments and ideas.

By the way, there is a single syntactical element we did not cover, and this is the for-loop. In total, there are only the following 9 elements:

  • literal
  • function call (including built-ins and opcodes)
  • function definition
  • variable declaration (can have multiple variables)
  • variable assignment (can have multiple variables)
  • if statement
  • switch statement
  • for statement
  • statement block

Of these, only the first two can make up expressions.