Compile AVM — High-Level Language — Variables

Robbie Wang
Published in NewEconoLabs
Nov 5, 2019

This article was written in Chinese by Li Jianying (Light Li) and translated into English by Robbie.

Reference source code: https://github.com/lightszero/neovmbook/tree/master/samples/compiler_csharp01

Address translation was covered in an earlier article, so this one focuses on how high-level language code is converted to AVM.

How to compile variables

The first issue is variables. In a high-level language we take variables for granted, but the NEOVM instruction set has no notion of a variable.

int a=0;

Such a statement obviously cannot be converted directly, and neither can a short program like this one:

int a=1;

int b=2;

return a + b;

So we have to consider what NEOVM does offer: a calculation stack (the evaluation stack) and a temporary stack (the alt stack). A simple expression such as “1+2+3+4” is easy to evaluate, but once variables appear, things get more complicated.

Computing with constants is easy to express on the calculation stack:

const int a=0;

const int b=2;

return a+b;

==>

PUSH 0

PUSH 2

ADD

RET
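To make the constant case concrete, here is a minimal sketch of a single-stack evaluator in Python. It is not NEOVM itself: the function name `run` and the `(opcode, operand)` tuple encoding are assumptions made for illustration, and only the three instructions used above are handled.

```python
def run(program):
    """Evaluate a list of (opcode, operand) pairs on one calculation stack.

    Hypothetical toy evaluator, not the real NEOVM.
    """
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)            # put a constant on the stack
        elif op == "ADD":
            right = stack.pop()          # operands come off in reverse order
            left = stack.pop()
            stack.append(left + right)
        elif op == "RET":
            return stack.pop()           # the top of the stack is the result
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("program ended without RET")


# const int a=0; const int b=2; return a+b;
program = [("PUSH", 0), ("PUSH", 2), ("ADD", None), ("RET", None)]
result = run(program)
print(result)  # 2
```

Because every operand is a constant, nothing needs to outlive the stack, which is why this case needs no extra machinery.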

Variables are troublesome because their values may change, so a variable has to refer to a storage position rather than to a specific value.

A list of variables provides such positions. Let’s look at the program again with a variable list in mind.

Suppose we have a global list of variables:

//we have a List<int> values;

int a=1; //a is values[0]

//values[0] = 1;

int b=2; //b is values[1]

//values[1] = 2;

return a + b;

//return values[0]+values[1]

To compile this code, then, we need to create a variable list. We first define two pseudo-instructions to manipulate it: STLOC stores the value on top of the stack into the variable list, and LDLOC loads a value from the variable list onto the stack.

Written with these pseudo-instructions, the program becomes:

//int a=1

PUSH 1

STLOC 0

//int b=2

PUSH 2

STLOC 1

//return a+b

LDLOC 0

LDLOC 1

ADD

RET
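The pseudocode above can be tried out with a small Python sketch. The function name `run_with_locals` and the tuple encoding are assumptions for illustration. STLOC and LDLOC are treated here exactly as described in the text, as pseudo-instructions over a variable list, and not as real NEOVM opcodes.

```python
def run_with_locals(program, slot_count):
    """Toy evaluator with a calculation stack plus a variable list.

    Hypothetical sketch: STLOC/LDLOC are the article's pseudo-instructions.
    """
    stack = []
    slots = [0] * slot_count             # the variable list, one slot per variable
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "STLOC":
            slots[arg] = stack.pop()     # store top of stack into slot `arg`
        elif op == "LDLOC":
            stack.append(slots[arg])     # load slot `arg` onto the stack
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "RET":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("program ended without RET")


program = [
    ("PUSH", 1), ("STLOC", 0),           # int a=1
    ("PUSH", 2), ("STLOC", 1),           # int b=2
    ("LDLOC", 0), ("LDLOC", 1),          # a, b
    ("ADD", None), ("RET", None),        # return a+b
]
result = run_with_locals(program, slot_count=2)
print(result)  # 3
```

The variable names are gone at this level. Only the slot indices 0 and 1 remain, which is the point of the variable-list idea.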

Next we replace the pseudo-instructions with real code built from NEWARRAY, PICKITEM, and SETITEM, all of which NEOVM provides. We create the variable list on the temporary stack when the function starts and remove it just before the function's RET.

//CreateArray size=2

PUSH 2

NEWARRAY

TOALTSTACK

//int a=1

DUPFROMALTSTACK //getarray

PUSH 0//index

PUSH 1//value

SETITEM

//int b=2

DUPFROMALTSTACK //getarray

PUSH 1//index

PUSH 2//value

SETITEM

//get value a

DUPFROMALTSTACK //getarray

PUSH 0//index

PICKITEM

//get value b

DUPFROMALTSTACK //getarray

PUSH 1//index

PICKITEM

//add

ADD

//return

//cleararray

FROMALTSTACK

DROP

RET
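The listing above can be traced with a two-stack simulation. The Python sketch below is a simplification under stated assumptions: the function name `run_two_stacks` is hypothetical, only the opcodes used in the listing are modelled, and a fresh array is filled with `None` instead of the VM's own default items.

```python
def run_two_stacks(program):
    """Toy model of a calculation stack plus a temporary (alt) stack."""
    stack, alt = [], []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "NEWARRAY":
            size = stack.pop()
            stack.append([None] * size)      # assumption: slots start as None
        elif op == "TOALTSTACK":
            alt.append(stack.pop())          # move top item to the alt stack
        elif op == "DUPFROMALTSTACK":
            stack.append(alt[-1])            # copy the reference, keep alt intact
        elif op == "FROMALTSTACK":
            stack.append(alt.pop())          # move top item back
        elif op == "SETITEM":
            value, index, array = stack.pop(), stack.pop(), stack.pop()
            array[index] = value
        elif op == "PICKITEM":
            index, array = stack.pop(), stack.pop()
            stack.append(array[index])
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "DROP":
            stack.pop()
        elif op == "RET":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise RuntimeError("program ended without RET")


N = None
program = [
    ("PUSH", 2), ("NEWARRAY", N), ("TOALTSTACK", N),                # create array
    ("DUPFROMALTSTACK", N), ("PUSH", 0), ("PUSH", 1), ("SETITEM", N),  # a=1
    ("DUPFROMALTSTACK", N), ("PUSH", 1), ("PUSH", 2), ("SETITEM", N),  # b=2
    ("DUPFROMALTSTACK", N), ("PUSH", 0), ("PICKITEM", N),           # load a
    ("DUPFROMALTSTACK", N), ("PUSH", 1), ("PICKITEM", N),           # load b
    ("ADD", N),
    ("FROMALTSTACK", N), ("DROP", N),                               # clear array
    ("RET", N),
]
result = run_two_stacks(program)
print(result)  # 3
```

Note that FROMALTSTACK places the array on top of the sum, so the DROP that follows discards the array and leaves the sum for RET.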

You can find this program under samples/compiler_csharp01.

The code is divided into several parts. Step01 translates the C# source code into an abstract syntax tree (AST). Here we can solve the problem directly by calling Roslyn. Whatever high-level language you plan to compile, it can basically be translated into an AST, and there are plenty of ready-made tools for that.

Step02 turns the AST into assembly. This part is the main work of the compiler.
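As a rough analogue of these first two steps, the sketch below uses Python's standard `ast` module where the sample uses Roslyn, and then walks the tree to emit the STLOC/LDLOC pseudocode shown earlier. It is not the code in the repository: the function name `compile_function`, the slot-assignment scheme, and the tiny supported subset (integer constants, names, `+`, assignment, `return`) are all assumptions made for illustration.

```python
import ast

SOURCE = """
def main():
    a = 1
    b = 2
    return a + b
"""


def compile_function(source):
    """Lower one simple function to STLOC/LDLOC pseudocode (hypothetical sketch)."""
    func = ast.parse(source).body[0]     # step 1: source text -> AST
    slots = {}                           # variable name -> index in the variable list
    code = []

    def emit_expr(node):
        if isinstance(node, ast.Constant):
            code.append(f"PUSH {node.value}")
        elif isinstance(node, ast.Name):
            code.append(f"LDLOC {slots[node.id]}")
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            emit_expr(node.left)         # operands first, operator last
            emit_expr(node.right)
            code.append("ADD")
        else:
            raise NotImplementedError(ast.dump(node))

    for stmt in func.body:               # step 2: AST -> assembly
        if isinstance(stmt, ast.Assign):
            emit_expr(stmt.value)
            name = stmt.targets[0].id
            index = slots.setdefault(name, len(slots))  # new names get the next slot
            code.append(f"STLOC {index}")
        elif isinstance(stmt, ast.Return):
            emit_expr(stmt.value)
            code.append("RET")
        else:
            raise NotImplementedError(ast.dump(stmt))
    return code, slots


code, slots = compile_function(SOURCE)
print(slots)  # {'a': 0, 'b': 1}
print(code)
# ['PUSH 1', 'STLOC 0', 'PUSH 2', 'STLOC 1', 'LDLOC 0', 'LDLOC 1', 'ADD', 'RET']
```

The number of entries in `slots` is also the array size that the function prologue would need to pass to NEWARRAY.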

Step03 is the job of the linker. Whatever the source language is, this part stays the same. The next article discusses code with the same function compiled from IL to AVM, and there you will find that step03 is still the same code.

Finally, we run the result in NEOVM to test it. Sure enough, the result is 3.

class Program

{

static int Main()

{

int a=1;

int b=2;

return a+b;

}

}

//result 3

This is its output (the screenshot is not reproduced here).