A walkthrough of EVM — Part 1 of N

Nicola Bernini
Ethereum Virtual Machine Walkthrough
3 min read · Aug 14, 2021

In this series of posts I am going to present some code snippets in EVM Assembly and explain how they work.

The reasons for this series are the following:

Even though the security assessment of a project can probably be carried out at the Solidity level, being able to go down to the EVM Assembly level and reason in its terms is important for properly estimating the gas efficiency of the generated code.

Furthermore, understanding the details of the EVM is required in order to contribute to the discussion about its continuous improvement via the EIP process.

Before starting, here is a brief technical introduction to the EVM and its design.

Technical Introduction

  1. EVM Architecture

The EVM is a stack-based state machine that works with words of 256 bits = 32 bytes.
This is a major difference from non-blockchain architectures (e.g. microprocessors), which typically rely on 32-bit and 64-bit words, or even smaller ones for embedded architectures.

This means that every opcode argument, and hence every slot on the stack, is 32 bytes wide.
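To make the 32-byte word size concrete, here is a minimal Python sketch of how a value fits into one stack slot and how arithmetic wraps around at 256 bits. The names `WORD_MASK`, `evm_add` and `to_word` are hypothetical helpers introduced only for this illustration; they are not part of any EVM tooling.

```python
WORD_BITS = 256
WORD_MASK = (1 << WORD_BITS) - 1  # largest value a single stack slot can hold


def evm_add(a: int, b: int) -> int:
    """Addition as the EVM ADD opcode performs it: modulo 2**256."""
    return (a + b) & WORD_MASK


def to_word(value: int) -> bytes:
    """Encode an integer as one 32-byte big-endian word."""
    return (value & WORD_MASK).to_bytes(32, "big")


# Even a tiny value such as 0x80 occupies a full 32-byte slot
print(len(to_word(0x80)))

# Adding 1 to the largest word wraps around to 0
print(evm_add(WORD_MASK, 1))
```

The masking step is what distinguishes EVM arithmetic from Python's unbounded integers: results never grow beyond one word.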

What’s the rationale behind this choice?

Well, crypto applications typically require arithmetic on large values, such as 256-bit hashes.

Even though this word size is pretty large, the gas cost of manipulating the non-persistent storage, i.e. Stack and Memory, is pretty low: the order of magnitude is units of gas.
Manipulating the persistent storage is far more expensive, with an order of magnitude of thousands of gas, but most current applications are more computationally intensive than storage intensive, so they mostly use the non-persistent storage.
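The difference in orders of magnitude can be sketched with a small lookup table. The figures below are assumptions taken from the post-Berlin gas schedule (cold storage access, first write to an empty slot) and they change between network upgrades, so treat them as illustrative rather than authoritative. The `GAS` table and the `total_gas` helper are hypothetical names, and memory expansion cost is deliberately left out.

```python
# Assumed per-opcode gas costs (post-Berlin schedule, may differ on other forks)
GAS = {
    "PUSH1": 3,
    "MLOAD": 3,       # excludes memory expansion, if any
    "MSTORE": 3,      # excludes memory expansion, if any
    "SLOAD": 2100,    # cold access to a storage slot
    "SSTORE": 22100,  # cold slot, first write from zero to non-zero
}


def total_gas(opcodes):
    """Sum the assumed gas cost of a sequence of opcodes."""
    return sum(GAS[op] for op in opcodes)


memory_write = ["PUSH1", "PUSH1", "MSTORE"]   # write one word to Memory
storage_write = ["PUSH1", "PUSH1", "SSTORE"]  # write one word to Storage

print(total_gas(memory_write))
print(total_gas(storage_write))
```

Under these assumptions, the same three-instruction pattern costs units of gas against Memory and tens of thousands against Storage.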

  2. EVM Memory Model

Since storage, like computation, is expensive, the EVM provides different kinds of memory, allowing for gas-efficient data manipulation.

There is one persistent storage, whose content is kept after the contract execution terminates. All the other types of storage are temporary, meaning their content is discarded once the contract execution terminates.

These are the different types of memory-related operations:
1. Random Access for Read
2. Random Access for Write, which can be divided into
2.1 First write → modifying the default value
2.2 Non-first write → changing a value that has already been changed

The reason behind this fine-grained distinction is that the first write also performs a sort of “memory allocation” or “memory expansion” operation under the hood: the EVM does not require the programmer to manage memory explicitly with manual allocation and deallocation (as in C and C++, for example).

Every memory location is readable (if it has never been written, the default value, zero at the EVM level, is returned) and writable (with memory expansion happening under the hood), so there are no out-of-memory errors as such: the practical limit is the gas charged for the expansion.
Furthermore, writing beyond an array's boundaries, something that is very bad in C and C++, is a safe and common practice for array expansion.
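The behaviour described above can be sketched in Python as follows. The `Memory` class is a hypothetical toy model, not an actual EVM implementation; `memory_cost` follows the Yellow Paper memory cost formula (3 gas per word plus a quadratic term), which is the part that makes very large expansions impractical.

```python
class Memory:
    """Toy model of EVM Memory: zero by default, grows on access."""

    def __init__(self):
        self.data = bytearray()

    def _expand(self, end: int) -> None:
        # Memory grows in 32-byte words; new bytes are zero-initialised
        if end > len(self.data):
            new_size = (end + 31) // 32 * 32
            self.data.extend(bytes(new_size - len(self.data)))

    def mstore(self, offset: int, value: int) -> None:
        self._expand(offset + 32)
        word = (value % 2**256).to_bytes(32, "big")
        self.data[offset:offset + 32] = word

    def mload(self, offset: int) -> int:
        self._expand(offset + 32)  # reading untouched memory also expands it
        return int.from_bytes(self.data[offset:offset + 32], "big")


def memory_cost(size_in_bytes: int) -> int:
    """Total gas for a memory of the given size: 3*w + w*w // 512, w in words."""
    words = (size_in_bytes + 31) // 32
    return 3 * words + words * words // 512


mem = Memory()
print(mem.mload(0))              # never written: the default value is returned
mem.mstore(0x40, 0x80)           # first write: memory is expanded under the hood
print(len(mem.data))             # size in bytes after the expansion
print(memory_cost(len(mem.data)))
```

There is no allocation call and no bounds check anywhere in the sketch: the only thing that grows with the accessed offset is the gas returned by `memory_cost`.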

Practical Examples

Example 1

PUSH1 0x80
PUSH1 0x40
MSTORE

So here is the stack content when MSTORE is reached

0: 0x00000...40
1: 0x00000...80

Remember that the stack is a LIFO structure, so its top holds the most recently pushed element

Let’s check the MSTORE specification

mstore(p, v) → mem[p…(p+32)) := v

So in this case we have

p = 0x0000...40
v = 0x0000...80

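What this code does is store the value 0x0000…80 in memory at the address 0x0000…40, popping both arguments off the stack, which is therefore left empty

So the memory layout becomes (each row shows 16 bytes; the 32-byte word is stored big-endian, so its least significant byte 0x80 lands at address 0x5f)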

0x0000…00: 0x0000…0000
0x0000…10: 0x0000…0000
0x0000…20: 0x0000…0000
0x0000…30: 0x0000…0000
0x0000…40: 0x0000…0000
0x0000…50: 0x0000…0080
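The walkthrough of Example 1 can be reproduced with a few lines of Python. This is only a sketch: `run_example_1` is a hypothetical helper, the stack is a plain list whose last element is the top, and the memory is pre-sized to 0x60 bytes for brevity instead of modelling expansion.

```python
def run_example_1():
    """Simulate pushing 0x80, pushing 0x40, then MSTORE on a toy stack and memory."""
    stack = []
    memory = bytearray(0x60)  # three zeroed 32-byte words, pre-sized for brevity

    stack.append(0x80)  # first push: the value to store
    stack.append(0x40)  # second push: the address, now on top of the stack

    # MSTORE pops the address p first (top of the stack), then the value v
    p = stack.pop()
    v = stack.pop()
    memory[p:p + 32] = v.to_bytes(32, "big")  # mem[p…(p+32)) := v

    return stack, memory


stack, memory = run_example_1()

# Dump the memory 16 bytes per row, like the layout in the text
for row in range(0, len(memory), 16):
    print(f"0x{row:02x}: {memory[row:row + 16].hex()}")
```

The dump has six rows, and the only non-zero byte is the last one of the row starting at 0x50, i.e. address 0x5f.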

Example 2

CALLVALUE

This is generated when the msg.value field is accessed in Solidity

It pushes onto the top of the stack the msg.value, i.e. the amount (in wei) sent with the call

So the situation on the stack before is (slot 0 is the top, XXX stands for whatever was already there)

0: 0x0000…XXX
1: 0x0000…XXX

The situation on the stack after CALLVALUE is

0: [msg.value]
1: 0x0000…XXX
2: 0x0000…XXX
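As a minimal sketch of the same before/after picture, CALLVALUE can be modelled as a function that pushes one word and touches nothing else. The `callvalue` function, the placeholder values `0xAA` and `0xBB`, and the 1 ether amount are all assumptions made up for this illustration.

```python
WORD_MASK = (1 << 256) - 1  # a stack slot holds one 256-bit word


def callvalue(stack, msg_value):
    """Model of CALLVALUE: push the wei sent with the call, pop nothing."""
    stack.append(msg_value & WORD_MASK)


# Two pre-existing items; the last list element is the top of the stack
stack = [0xBB, 0xAA]
callvalue(stack, 10**18)  # assume the call carried 1 ether = 10**18 wei

# Print the stack top-first, slot 0 being the top
for slot, item in enumerate(reversed(stack)):
    print(f"{slot}: {item:#x}")
```

The pre-existing items are shifted down by one slot and remain unchanged, which is all the opcode does to the stack.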

In the next post, more examples of EVM opcode analysis will be presented, including

  • manipulation of the storage and the related cost analysis
  • contract to contract calls


Nicola Bernini

Machine Learning PhD, Physicist. Mainly interested in Deep Learning, Functional Programming. https://github.com/NicolaBernini