Compile AVM — High-Level Language — Variables
This article was written in Chinese by Li Jianying (Light Li) and translated into English by Robbie.
Reference source code: https://github.com/lightszero/neovmbook/tree/master/samples/compiler_csharp01
We have already covered address translation, so this article focuses on how a high-level language is converted to AVM.
How to compile variables
The first issue is variables. In a high-level language we take variables for granted, but a NEOVM instruction sequence has no notion of a variable.
int a=0;
A high-level statement like this has no direct translation. The same is true of a small program such as:
int a=1;
int b=2;
return a + b;
Consider what NEOVM does provide: a calculation stack and a temporary stack. A simple expression such as “1+2+3+4” is easy to evaluate, but things get more complicated once variables appear.
Constants are easy to handle with the calculation stack alone:
const int a=0;
const int b=2;
return a+b;
==>
PUSH 0
PUSH 2
ADD
RET
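The listing above can be checked with a toy stack machine. The following is a minimal Python sketch, not the real NEOVM: the function name run and the (opcode, operand) tuple format are assumptions made only for this illustration.

```python
# Toy calculation stack (an illustrative assumption, not the real NEOVM).
def run(program):
    """Execute (opcode, operand) pairs and return the value on top of the stack at RET."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)          # put a constant on the calculation stack
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "RET":
            return stack.pop()         # the top of the stack is the return value
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise ValueError("program ended without RET")


constant_sum = [("PUSH", 0), ("PUSH", 2), ("ADD", None), ("RET", None)]
result = run(constant_sum)
print(result)  # 2
```

Because both operands are constants, the calculation stack alone is enough and no storage location is needed.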
Variables are troublesome because their values may change, so a variable has to refer to a location rather than to a fixed value.
One way to provide such locations is a list of variables. Let’s look at the program with that idea in mind.
Suppose we have a global list of variables:
//we have a List<int> values;
int a=1; //a is values[0]
//values[0] = 1;
int b=2; //b is values[1]
//values[1] = 2;
return a + b;
//return values[0]+values[1]
So, to compile this code, we need to create a variable list. We first define two pseudo-instructions to operate on it: STLOC takes the value on top of the calculation stack and stores it in the variable list at the given index, and LDLOC reads the value at the given index and puts it on the calculation stack.
In pseudocode, the program becomes:
//int a=1
PUSH 1
STLOC 0
//int b=2
PUSH 2
STLOC 1
//return a+b
LDLOC 0
LDLOC 1
ADD
RET
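To make the behaviour of the two pseudo-instructions concrete, here is a small Python sketch that interprets the pseudocode above. It is an illustration under assumptions: run_with_locals, the tuple format, and the zero-initialised list are invented for this example and are not part of NEOVM or the sample compiler.

```python
# Toy interpreter for the STLOC/LDLOC pseudocode (illustrative assumption only).
def run_with_locals(program, local_count):
    """Execute (opcode, operand) pairs using a calculation stack and a variable list."""
    stack = []
    values = [0] * local_count         # the variable list: one slot per variable
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "STLOC":
            values[arg] = stack.pop()  # store the top of the stack into slot `arg`
        elif op == "LDLOC":
            stack.append(values[arg])  # copy slot `arg` onto the stack
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "RET":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise ValueError("program ended without RET")


pseudo = [
    ("PUSH", 1), ("STLOC", 0),         # int a=1
    ("PUSH", 2), ("STLOC", 1),         # int b=2
    ("LDLOC", 0), ("LDLOC", 1),        # a, b
    ("ADD", None), ("RET", None),      # return a+b
]
result = run_with_locals(pseudo, local_count=2)
print(result)  # 3
```

Each variable is now an index into the list, which is exactly the "position instead of a value" idea described earlier.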
Next we replace the pseudo-instructions with real NEOVM operations: NEWARRAY, SETITEM, and PICKITEM. SETITEM takes the array, the index, and the value from the calculation stack, and PICKITEM takes the array and the index. We create the variable list on the temporary stack when the function starts and remove it just before the function’s RET.
//CreateArray size=2
PUSH 2
NEWARRAY
TOALTSTACK
//int a=1
DUPFROMALTSTACK //getarray
PUSH 0//index
PUSH 1//value
SETITEM
//int b=2
DUPFROMALTSTACK //getarray
PUSH 1//index
PUSH 2//value
SETITEM
//get value a
DUPFROMALTSTACK //getarray
PUSH 0//index
PICKITEM
//get value b
DUPFROMALTSTACK //getarray
PUSH 1//index
PICKITEM
//add
ADD
//return
//cleararray
FROMALTSTACK
DROP
RET
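The sequence above can be traced with a toy machine that has both stacks. This is a minimal Python sketch that models only the opcodes used in the listing. The helpers store_const and load and the tuple format are hypothetical names for this illustration, and a Python list stands in for the NEOVM array.

```python
# Toy two-stack machine for the listing above (illustrative assumption, not the real NEOVM).
def store_const(index, value):
    """Opcodes that write a constant into slot `index` of the variable list."""
    return [("DUPFROMALTSTACK", None), ("PUSH", index), ("PUSH", value), ("SETITEM", None)]


def load(index):
    """Opcodes that read slot `index` of the variable list onto the calculation stack."""
    return [("DUPFROMALTSTACK", None), ("PUSH", index), ("PICKITEM", None)]


def run(program):
    """Return (result, alt stack contents) after executing the program."""
    stack, alt = [], []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "NEWARRAY":
            stack.append([0] * stack.pop())   # array size comes from the stack
        elif op == "TOALTSTACK":
            alt.append(stack.pop())
        elif op == "FROMALTSTACK":
            stack.append(alt.pop())
        elif op == "DUPFROMALTSTACK":
            stack.append(alt[-1])             # same list object, not a copy
        elif op == "SETITEM":
            value, index, array = stack.pop(), stack.pop(), stack.pop()
            array[index] = value
        elif op == "PICKITEM":
            index, array = stack.pop(), stack.pop()
            stack.append(array[index])
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "DROP":
            stack.pop()
        elif op == "RET":
            return stack.pop(), alt
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise ValueError("program ended without RET")


program = (
    [("PUSH", 2), ("NEWARRAY", None), ("TOALTSTACK", None)]   # create the variable list
    + store_const(0, 1) + store_const(1, 2)                   # int a=1; int b=2
    + load(0) + load(1) + [("ADD", None)]                     # a + b
    + [("FROMALTSTACK", None), ("DROP", None), ("RET", None)] # clear the list, return
)
result, alt_left = run(program)
print(result, alt_left)  # 3 []
```

The final FROMALTSTACK and DROP leave the temporary stack empty, so the function cleans up its own variable list before returning.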
You can find this program under samples/compiler_csharp01.
The code is divided into several parts. Step01 translates the C# source code into an abstract syntax tree (AST). Here we can solve the problem directly by calling Roslyn. Whatever high-level language you plan to compile, there are usually plenty of ready-made tools that can turn it into an AST.
Step02 turns the AST into assembly. This part is the main work of the compiler.
Step03 is the job of the linker. Whatever you compile, and whatever you compile it from, this part stays the same. In the next article we will compile code with the same function from IL to AVM, and you will find that Step03 is still the same code.
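The first two steps can be sketched in a few lines. This is not the sample's code: the example below is a Python analogy that uses Python's built-in ast module in place of Roslyn, parses a Python version of the same function, and emits the STLOC/LDLOC pseudocode from earlier. The name compile_function and the tuple format are assumptions for this illustration, and only the constructs needed by the example are supported.

```python
import ast

# Python stand-in for the C# sample; the real compiler parses C# with Roslyn.
SOURCE = """
def main():
    a = 1
    b = 2
    return a + b
"""


def compile_function(source):
    """Step 1: source to AST. Step 2: AST to (opcode, operand) pseudocode."""
    tree = ast.parse(source)
    func = tree.body[0]                # the single function definition
    slots = {}                         # variable name -> index in the variable list
    code = []

    def emit_expr(node):
        if isinstance(node, ast.Constant):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.Name):
            code.append(("LDLOC", slots[node.id]))
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            emit_expr(node.left)
            emit_expr(node.right)
            code.append(("ADD", None))
        else:
            raise NotImplementedError(ast.dump(node))

    for stmt in func.body:
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            emit_expr(stmt.value)                       # value goes on the stack first
            index = slots.setdefault(stmt.targets[0].id, len(slots))
            code.append(("STLOC", index))
        elif isinstance(stmt, ast.Return):
            emit_expr(stmt.value)
            code.append(("RET", None))
        else:
            raise NotImplementedError(ast.dump(stmt))
    return code, slots


code, slots = compile_function(SOURCE)
for op, arg in code:
    print(op if arg is None else f"{op} {arg}")
```

The printed instructions match the STLOC/LDLOC pseudocode shown earlier, and the size of slots gives the length of the variable list that the function prologue has to create.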
Finally, we invoke NEOVM to test the compiled code. Without doubt, you will get the result 3.
class Program
{
    static int Main()
    {
        int a=1;
        int b=2;
        return a+b;
    }
}
//result 3
This is its output