Introduction to EVM — Part 1

Somu Bhargava
5 min read · Nov 1, 2021


Photo by executium adapted from unsplash

Just as a processor is crucial to a computer, the EVM is crucial to the Ethereum state machine. Its main job is to execute the logic of Smart Contracts and apply the corresponding state transitions. Before we look at the EVM, let’s take a brief look at Ethereum.

Brief Intro to Ethereum

Ethereum is an open-source, decentralized blockchain with the additional functionality of Smart Contracts. Broadly, this means that the transactions on the network are ETH transfers, Smart Contract deployments, or Smart Contract calls, all of which change the state of the Ethereum blockchain. A number of these transactions are combined in a particular order to form a block, such that the total gas required to execute the transactions in that block is less than or equal to the block’s gas limit.
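To make the gas limit constraint concrete, here is a minimal sketch of how transactions could be picked for a block. The names PendingTx and pack_block are made up for this illustration, and the greedy "take it if it still fits" strategy is an assumption; real clients also consider fees, nonces, and validity.

```python
from dataclasses import dataclass


@dataclass
class PendingTx:
    sender: str      # hypothetical label, not a real address
    gas_limit: int   # maximum gas this transaction may consume


def pack_block(pending, block_gas_limit):
    """Pick transactions in order while their gas limits still fit in the block."""
    chosen, gas_reserved = [], 0
    for tx in pending:
        if gas_reserved + tx.gas_limit <= block_gas_limit:
            chosen.append(tx)
            gas_reserved += tx.gas_limit
    return chosen, gas_reserved


pending = [
    PendingTx("alice", 21_000),
    PendingTx("bob", 90_000),    # too large for this toy block, so it is skipped
    PendingTx("carol", 21_000),
]
block, reserved = pack_block(pending, block_gas_limit=50_000)
print([tx.sender for tx in block], reserved)  # ['alice', 'carol'] 42000
```

The sketch reserves each transaction's full gas limit, which is a conservative simplification: the gas actually used by a transaction can be lower than its limit.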

A transaction looks like the following —

Transaction -
nonce - Number of txs sent by the sender
gas price - Price of gas (in WEI) for this tx
gas limit - Maximum amount of gas to be used by the tx
to - Beneficiary of this tx
value - Amount of WEI transferred to a beneficiary
data - Input data of the message call
v, r, s - Recoverable Secp256K1 signature of the sender
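
As a rough sketch, the fields above could be modelled like this. The class name, the address string, the zeroed signature values, and the max_fee helper are all assumptions made for this example and are not part of any client library.

```python
from dataclasses import dataclass
from typing import Optional

WEI_PER_ETH = 10**18


@dataclass(frozen=True)
class Transaction:
    nonce: int            # number of txs sent by the sender so far
    gas_price: int        # wei paid per unit of gas
    gas_limit: int        # maximum gas this tx may use
    to: Optional[str]     # beneficiary; left empty (None here) for a contract deployment
    value: int            # wei transferred to the beneficiary
    data: bytes = b""     # input data of the message call
    v: int = 0            # signature components; zero placeholders in this sketch
    r: int = 0
    s: int = 0

    def max_fee(self) -> int:
        """Upper bound on the fee in wei: gas price times gas limit."""
        return self.gas_price * self.gas_limit


tx = Transaction(
    nonce=0,
    gas_price=20 * 10**9,   # 20 gwei, an arbitrary example value
    gas_limit=21_000,
    to="0xbeneficiary",     # hypothetical address label
    value=1 * WEI_PER_ETH,
)
print(tx.max_fee())  # 420000000000000
```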

The world state of Ethereum is nothing but a mapping between addresses and their corresponding account state. An account state looks like this —

Account -
nonce - Number of txs sent by this account till now
balance - Amount of wei held by this account
storage - Mapping between 256 bit key-value pairs
code - Immutable EVM code belonging to this account. This code
will be executed when someone makes a transaction to this
account.
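
A minimal sketch of the world state as a plain mapping, assuming short made-up address labels and an illustrative five-byte code snippet. The get_account helper is hypothetical; it only reflects that an address nobody has used yet behaves like an empty account.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    nonce: int = 0                                # txs sent by this account so far
    balance: int = 0                              # wei held by this account
    storage: dict = field(default_factory=dict)   # 256-bit key -> 256-bit value
    code: bytes = b""                             # EVM code, empty for most accounts


# World state: address -> account state (addresses are made-up labels).
world_state = {
    "0xalice": Account(nonce=3, balance=5 * 10**18),
    "0xcounter": Account(code=bytes.fromhex("6001600055"), storage={0: 1}),
}


def get_account(state, address):
    """Look up an account; an unknown address is treated as an empty account."""
    return state.get(address, Account())


print(get_account(world_state, "0xalice").balance)   # 5000000000000000000
print(get_account(world_state, "0xnobody").balance)  # 0
```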

A miner forms a block by grouping some transactions together and obtaining the proof-of-work for that block. The block is then broadcast to the other nodes participating in the Ethereum network. The nodes that receive the block have to verify its validity, since the miner might be malicious, or there could have been a transmission error or a man-in-the-middle attack. Each node verifies the block roughly as follows:

  • They verify that the block properties (like block number, parent hash, timestamp, gas limit, gas used, etc.) are in accordance with the Ethereum protocol.
  • Then the transactions of the block are picked up and executed one by one. Execution of each transaction consumes some gas and changes the state of the blockchain. Hence, at the end of block execution, we have a resultant state that can be uniquely represented by the root of the state trie. If the miner and the verifying node agree on the protocol and the transactions, they should arrive at the same state root. The miner records the state root it obtained in the block header, and the other nodes later validate their own state root against the one in the header.
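
The second check can be sketched as follows. This is a toy model under loud assumptions: the state is a flat dictionary of balances, the only transaction type is a transfer, and state_root is a SHA-256 hash of the JSON-serialized state standing in for the real root of the state trie. All function names are invented for the example.

```python
import hashlib
import json


def state_root(state):
    """Stand-in for the state trie root: a hash of the serialized state."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_transfer(state, tx):
    """Toy state transition: move `value` from sender to recipient."""
    new_state = dict(state)
    new_state[tx["from"]] = new_state.get(tx["from"], 0) - tx["value"]
    new_state[tx["to"]] = new_state.get(tx["to"], 0) + tx["value"]
    return new_state


def execute_block(state, txs):
    """Execute the transactions one by one, in block order."""
    for tx in txs:
        state = apply_transfer(state, tx)
    return state


def verify_block(parent_state, block):
    """Re-execute the block and compare the result with the header's state root."""
    resulting_state = execute_block(parent_state, block["transactions"])
    return state_root(resulting_state) == block["header"]["state_root"]


parent_state = {"alice": 100, "bob": 0}
txs = [{"from": "alice", "to": "bob", "value": 30}]

honest_root = state_root(execute_block(parent_state, txs))
forged_root = state_root({"alice": 100, "bob": 30})  # bob credited, alice never debited

honest_block = {"transactions": txs, "header": {"state_root": honest_root}}
forged_block = {"transactions": txs, "header": {"state_root": forged_root}}

print(verify_block(parent_state, honest_block))  # True
print(verify_block(parent_state, forged_block))  # False
```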

The Ethereum Virtual Machine (EVM) is responsible for executing the transactions and updating the state of the blockchain. Let’s see more details about the EVM below.

Ethereum Virtual Machine (EVM)

Smart Contracts are compiled down to EVM bytecode. As an analogy, think of Solidity (a common language for writing smart contracts) as C++, and of EVM bytecode as the machine code that a processor can understand and execute. In that sense, the EVM can be thought of as the processor of the Ethereum state machine. EVM bytecode is a sequence of opcodes and data that the EVM processes to produce a state change.

Hence the role of EVM in executing transactions is to —

  • Facilitate transfer of WEI (1 ETH = 10¹⁸ WEI) from one account to another.
  • Execute the bytecode associated with the beneficiary account, if it has any (potentially using the input data from the transaction’s data field).

Now, can any account have bytecode associated with it? The answer is no. There are 2 types of accounts in Ethereum: Externally Owned Accounts (EOA) and Contract Accounts (CA). EOAs are accounts with a private key associated with them and are handled by external entities like humans, organizations, etc. Contract Accounts, on the other hand, are created by the deployment of smart contracts. They don’t have an associated private key and are controlled by their code, which runs whenever a call (via a transaction on the blockchain) is made to them.

Roughly speaking, an instance of the EVM is launched in each node to execute each transaction. However, the EVM instance executes bytecode only if that transaction’s beneficiary (or target) is a Contract Account.
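
Here is a small sketch of that rule: move the value first, then run code only when the target is a Contract Account. It ignores gas, nonces, and signatures entirely, and the names process_transaction and fake_interpreter are hypothetical. The interpreter is passed in as a callback so the example stays short.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0                              # wei
    code: bytes = b""                             # empty for an EOA
    storage: dict = field(default_factory=dict)


def is_contract(account):
    """In this sketch, an account counts as a contract when it carries code."""
    return len(account.code) > 0


def process_transaction(state, sender, to, value, data, run_code):
    """Transfer `value` wei, then execute code only for a contract target."""
    if state[sender].balance < value:
        raise ValueError("insufficient balance")
    target = state.setdefault(to, Account())
    state[sender].balance -= value
    target.balance += value
    if is_contract(target):
        run_code(target, data)   # hand the input data to the interpreter
        return "executed bytecode"
    return "plain transfer"


calls = []


def fake_interpreter(account, data):
    """Placeholder for the EVM: it only records the input data it was given."""
    calls.append(data)


state = {
    "alice": Account(balance=10**18),
    "bob": Account(),                             # EOA: no code
    "counter": Account(code=bytes.fromhex("00")), # contract: one STOP opcode
}
print(process_transaction(state, "alice", "bob", 10**17, b"", fake_interpreter))
print(process_transaction(state, "alice", "counter", 0, b"\x01", fake_interpreter))
```

The first call prints "plain transfer" and the second prints "executed bytecode", because only the second target has code attached.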

Now let us take a look at the architecture of the EVM.

Diagram Adapted from https://ethereum.org/en/developers/docs/evm/

Each instance of the EVM is launched to run a specific bytecode (given that the transaction’s target is a contract account). The bytecode therefore acts like a ROM for that EVM instance and is immutable. The EVM is a stack-based machine with a Program Counter, a Stack, Memory, and external storage. The external storage persists across transactions, while the rest of the components are volatile and are re-instantiated for every instance of the EVM.

Let’s take a look at each of these components in detail —

  • Program Counter (PC) is nothing but a pointer to the next opcode in the bytecode, which is to be executed by the EVM. It is a non-negative integer in the range [0, number_of_bytes(bytecode)-1]
  • Stack in the EVM can hold a maximum of 1024 entries, and each entry is a 256-bit (32 bytes) unsigned integer.
  • Memory in the EVM is a byte array with no fixed size limit. It can grow as needed, but every expansion costs extra gas, which bounds it in practice. Each entry is an 8-bit (1 byte) unsigned integer.
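  • The external storage here is nothing but the collection of the storage of all the accounts. (Running bytecode writes directly only to the storage of the account it is executing on behalf of; another contract’s storage changes only when a message call runs that contract’s own code.)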
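
To see how these components work together, here is a deliberately tiny interpreter sketch. It supports only five opcodes (STOP, ADD, MSTORE, SSTORE, PUSH1, with their usual byte values), does no gas accounting, and the function name run is made up for this example, so treat it as an illustration rather than a faithful EVM.

```python
UINT256_MASK = 2**256 - 1
STACK_LIMIT = 1024

STOP, ADD, MSTORE, SSTORE, PUSH1 = 0x00, 0x01, 0x52, 0x55, 0x60


def run(bytecode, storage):
    """Execute bytecode with a fresh PC, stack and memory; storage persists."""
    pc, stack, memory = 0, [], bytearray()
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == STOP:
            break
        elif op == PUSH1:
            # The byte after PUSH1 is data, not an opcode.
            stack.append(bytecode[pc + 1] if pc + 1 < len(bytecode) else 0)
            pc += 1
        elif op == ADD:
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) & UINT256_MASK)  # wrap around at 256 bits
        elif op == MSTORE:
            offset, value = stack.pop(), stack.pop()
            needed = ((offset + 32 + 31) // 32) * 32  # grow in 32-byte words
            if len(memory) < needed:
                memory.extend(bytes(needed - len(memory)))
            memory[offset:offset + 32] = value.to_bytes(32, "big")
        elif op == SSTORE:
            key, value = stack.pop(), stack.pop()
            storage[key] = value
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
        if len(stack) > STACK_LIMIT:
            raise OverflowError("stack limit of 1024 entries exceeded")
        pc += 1
    return stack, memory


# Store 42 in memory at offset 0, then store 2 + 3 in storage slot 0.
program = bytes([
    PUSH1, 0x2A, PUSH1, 0x00, MSTORE,
    PUSH1, 2, PUSH1, 3, ADD,
    PUSH1, 0, SSTORE,
    STOP,
])
storage = {}
stack, memory = run(program, storage)
print(storage, len(memory), memory[31])  # {0: 5} 32 42
```

Note that stack and memory are created inside run and thrown away afterwards, while the storage dictionary is passed in and keeps its contents, which mirrors the volatile versus persistent split described above.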

Just as the processors in computers understand the specific opcodes of their instruction set, the EVM understands its own set of opcodes. Each EVM opcode is one byte long, so there can be at most 256 opcodes, although not all 256 values are assigned at this point. The EVM opcodes can be broadly classified as follows:

  • Opcodes manipulating the state of the components like PC, Stack, Memory, and Storage.
  • Arithmetic and Bitwise operations.
  • Environmental Information: information about attributes of a block, attributes of the current transaction, or attributes of a particular account.
  • Logging Operations: adding log records.
  • System Operations: creating a new contract account, making a message call to another account, destroying a created contract account, etc.
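
As a small illustration of this classification, the sketch below walks over a piece of bytecode and labels each opcode. The table covers only a handful of opcodes, the category labels are this post's informal grouping rather than official names, and disassemble is a name chosen for the example.

```python
# A few opcodes with this post's informal categories (far from a full table).
OPCODES = {
    0x01: ("ADD", "arithmetic"),
    0x16: ("AND", "bitwise"),
    0x33: ("CALLER", "environmental"),
    0x43: ("NUMBER", "environmental"),
    0x50: ("POP", "components"),
    0x52: ("MSTORE", "components"),
    0x55: ("SSTORE", "components"),
    0x58: ("PC", "components"),
    0xA0: ("LOG0", "logging"),
    0xF1: ("CALL", "system"),
    0xFF: ("SELFDESTRUCT", "system"),
}


def disassemble(bytecode):
    """Return (pc, name, category, data_hex) for every opcode in the bytecode."""
    pc, listing = 0, []
    while pc < len(bytecode):
        op = bytecode[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 are followed by 1..32 bytes of data, not opcodes.
            size = op - 0x5F
            data = bytecode[pc + 1 : pc + 1 + size]
            listing.append((pc, f"PUSH{size}", "components", data.hex()))
            pc += 1 + size
            continue
        name, category = OPCODES.get(op, (f"UNKNOWN_0x{op:02x}", "not in this table"))
        listing.append((pc, name, category, ""))
        pc += 1
    return listing


listing = disassemble(bytes.fromhex("600160020133a0ff"))
for pc, name, category, data in listing:
    print(pc, name, category, data)
```

For this input the listing shows PUSH1, PUSH1, ADD, CALLER, LOG0 and SELFDESTRUCT at positions 0, 2, 4, 5, 6 and 7, which also shows how opcodes and data are interleaved in the bytecode.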

A more detailed post on the Opcodes is coming up soon. Stay tuned, and be sure to keep checking this space for more information.
