Reference source code: https://github.com/lightszero/neovmbook/tree/master/samples/neovm02
I’ve talked about the assembler before, but in describing the process we never seriously discussed modularity.
We know that NEOVM is an implementation of a Turing machine. A Turing machine is essentially a tape drive, and a standard tape drive is not modular.
Think about listening to music on an old-style tape player: can you jump to the next song with one click?
Organizing data in units of songs is modularization. A CD can do that, but a tape can’t.
In software engineering practice, however, modularity is the first important issue.
High-level languages are, of course, modular, and functions are their most popular modular unit. Later, as OOP became popular, classes were added.
But even before high-level languages became popular, software engineers already worked modularly.
How to implement modularity in machine language
In the machine there is only one instruction area, and modularity is defined by how that area is divided into memory regions. If you have studied the earliest BASIC dialects, you will remember that there was only one code file and no function support. Instead, we would use
go to [linenum]
and achieve modularity that way: different parts of the code perform different functions.
Now let’s apply this to NEOVM. There, the JMP and CALL instructions are what make code modular.
Let’s consider a piece of AVM code:
0x00 PUSH 1
0x01 PUSH 2
0x03 CALL +4
In fact, it is divided into two modules. 0x000x08 is the ADD module.
Without modularization tools, engineers must plan by hand how the modules are laid out in memory, which is a very tedious task. Software engineering is only practical with modularization.
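To see why planning addresses by hand is tedious, here is a minimal Python sketch. The `SIZES` table, the `layout` and `call_offset` helpers, and the two-module program are assumptions made for illustration, not part of the referenced assembler project. In particular, it assumes one-byte instructions except for a three-byte CALL, with the CALL offset counted from the start of the CALL instruction.

```python
# Assumed instruction sizes in bytes (illustrative, not taken from the assembler project).
SIZES = {"PUSH": 1, "NOP": 1, "ADD": 1, "RET": 1, "CALL": 3}

def layout(ops):
    """Assign an address to each opcode name in a flat instruction list."""
    placed, addr = [], 0
    for op in ops:
        placed.append((addr, op))
        addr += SIZES[op]
    return placed

def call_offset(main, callee):
    """Offset the CALL in `main` needs to reach `callee`, placed right after `main`."""
    placed = layout(main + callee)
    call_addr = next(a for a, op in placed if op == "CALL")
    callee_addr = placed[len(main)][0]  # first instruction of the callee module
    return callee_addr - call_addr

add_module = ["ADD", "RET"]
before = call_offset(["PUSH", "PUSH", "CALL", "RET"], add_module)
# One extra NOP after the CALL moves the ADD module, so the offset has to be redone.
after = call_offset(["PUSH", "PUSH", "CALL", "NOP", "RET"], add_module)
print(before, after)  # prints "4 5"
```

Every small change to one module shifts the modules after it, and each affected CALL has to be recalculated by whoever maintains the listing.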
Since modularity is so important, it is natural to want a tool that supports it.
Now we have an assembler project.
Expressed modularly in the ASML language that we defined, the same code becomes:
PUSH 1//push 1 number
Engineers think and write code one module after another, instead of working out which memory block belongs to which module.
The work of mapping modules to addresses is usually called linking.
For example, the C++ language has a very clear and separate link step.
The CALL instruction provides function-level modularity.
The JMP instructions provide modularity inside functions.
Our assembler also acts as a linker: it automatically connects the two modules, assigns them appropriate address ranges, and makes each CALL operand point where it is supposed to point.
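The linking step described above can be sketched in a few lines of Python. This is a toy two-pass linker under stated assumptions, not the code of the assembler project: the `SIZES` table, the `link` function, and the module format are hypothetical, and CALL offsets are assumed to be relative to the start of the CALL instruction.

```python
# Assumed instruction sizes in bytes (illustrative only).
SIZES = {"PUSH": 1, "NOP": 1, "ADD": 1, "RET": 1, "CALL": 3}

def link(modules):
    """modules: dict of module name -> list of (opcode, argument) pairs.
    Returns a flat list of (address, opcode, argument) with CALL names
    replaced by relative offsets."""
    # Pass 1: give every module a start address, in declaration order.
    starts, addr = {}, 0
    for name, body in modules.items():
        starts[name] = addr
        addr += sum(SIZES[op] for op, _ in body)

    # Pass 2: emit instructions, turning "CALL <module>" into "CALL <offset>".
    listing, addr = [], 0
    for body in modules.values():
        for op, arg in body:
            if op == "CALL":
                arg = starts[arg] - addr
            listing.append((addr, op, arg))
            addr += SIZES[op]
    return listing

program = {
    "main": [("PUSH", 1), ("PUSH", 2), ("CALL", "add"), ("RET", None)],
    "add": [("ADD", None), ("RET", None)],
}
linked = link(program)
for address, op, arg in linked:
    print(f"0x{address:02x} {op}" + ("" if arg is None else f" {arg}"))
```

The author of `main` only writes the name `add`; the address and the offset are worked out in the two passes, so growing either module needs no manual bookkeeping.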
For now this is assembly; the next step is a high-level language, and the process is the same.
The final job of the compiler is address translation: assigning an address range to each module and giving each CALL instruction the correct address, so that the final AVM bytecode can be generated.
Because our assembler handles both modularity and linking, we can describe the compilation process as two stages.
Or, starting from another virtual machine’s intermediate language, for example IL->AVML->byte
We won’t go into more detail on how other compilers handle the linker’s work.
Having covered the CALL instruction, let me turn to the JMP instruction.
Think about such code
aaa and bbb are submodules inside the two functions.
Without a modular notation, it looks like this, and we still have to deal with the addresses ourselves:
0x00 PUSH 1
0x01 JMPIF +3
0x02 PUSH 1
0x04 PUSH 8
If we use the modular ASML we defined to represent it:
We no longer need to care about addresses. I introduced a label to mark each jump location.
About 80% of the work of converting a high-level language to assembly language is turning its various loops into JMP instructions.
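As a rough illustration of a loop becoming JMP instructions, here is a minimal Python sketch that lowers a `while` loop into label-and-jump form. The `lower_while` helper and the `name:` label syntax are hypothetical and may differ from the ASML we defined. It also assumes a JMPIFNOT instruction that jumps when the value on top of the stack is false.

```python
def lower_while(cond, body, tag):
    """Lower `while (cond) { body }` into assembly lines using labels.

    cond, body: lists of assembly lines; cond must leave a boolean on the stack.
    tag: unique prefix used to name this loop's labels (hypothetical convention).
    """
    start = f"{tag}_start"
    end = f"{tag}_end"
    lines = [f"{start}:"]              # loop entry: the condition is re-evaluated here
    lines += cond
    lines.append(f"JMPIFNOT {end}")    # leave the loop when the condition is false
    lines += body
    lines.append(f"JMP {start}")       # go back and test the condition again
    lines.append(f"{end}:")            # first instruction after the loop
    return lines

lowered = lower_while(["PUSH 1"], ["NOP"], "loop0")
print("\n".join(lowered))
```

The output still contains only label names. Turning `loop0_start` and `loop0_end` into relative offsets is the address translation that is left to the linker.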
We can ignore the address translation work for JMP and CALL instructions and leave it to the linker. In the next article, we will discuss how high-level languages are compiled into NEOVM instructions.