Explaining Ethereum Contract ABI & EVM Bytecode

Eiki Takeuchi
Jul 16, 2019

Explaining Ethereum Contract ABI and EVM Bytecode with theory and command-line practice

Introduction

This article explains Contract ABI and EVM bytecode in Ethereum. Because Ethereum uses the EVM (Ethereum Virtual Machine) as the heart of the system, smart contract code written in a high-level language must be compiled into EVM bytecode before it can run, and the compiler also produces the Contract ABI that describes how to call it. You need to understand both when you interact with a smart contract.

What you can get from this article

  • Understanding what Contract ABI and EVM bytecode are and how they relate.
  • How to generate Contract ABI and EVM bytecode with the solc command line.

Not explained

  • Details of the Contract ABI Specification (encode/decode).
  • How to write a smart contract
  • The basic explanation of Ethereum and blockchain

*Readers are expected to have basic knowledge of Ethereum and blockchain.

Bytecode & ABI

Because Ethereum uses the EVM (Ethereum Virtual Machine) as a core component of the network, smart contract code written in a high-level language must be compiled into EVM bytecode before it can run. EVM bytecode is the code that the EVM executes, and Contract ABI is the interface used to interact with that bytecode. For example, suppose you want to call a function in a smart contract from your JavaScript code. In that case, the ABI is the intermediary that lets your JavaScript code and the EVM bytecode talk to each other. The diagram below shows the architecture of Contract ABI, EVM bytecode, and outside components (dApp and network). The left side shows the compilation process and the right side shows the interaction.

EVM bytecode (Bytecode)

EVM bytecode is low-level code compiled from a high-level programming language such as Solidity. The EVM is a virtual machine that sits between the OS and the application layer to reduce OS dependency. Thanks to the EVM, an Ethereum smart contract can run on almost any computer. If you are a Java developer, you can think of the JVM (Java Virtual Machine) as a similar mechanism. EVM bytecode looks like the example below. It is not human-readable, but it is readable by the machine.

6080604052348015600f57600080fd5b5060878061001e6000396000f3fe6080604052348015600f57600080fd5b506004361060285760003560e01c8063037a417c14602d575b600080fd5b60336049565b6040518082815260200191505060405180910390f35b6000600190509056fea265627a7a7230582050d33093e20eb388eec760ca84ba30ec42dadbdeb8edf5cd8b261e89b8d4279264736f6c634300050a0032

In more detail, when you compile on Remix IDE, you see the four fields below. They give more details about the bytecode, such as link references, opcodes, and the source map. ‘object’ is the EVM bytecode.

  • linkReferences: References to other deployed smart contracts (such as libraries) on which the current smart contract depends.
  • object: The bytecode of the current smart contract.
  • opcodes: Operation codes, which are human-readable low-level instructions.
  • sourceMap: A source map that matches each contract instruction to the section of the source code from which it was generated.
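To make the relation between ‘object’ and ‘opcodes’ concrete, here is a minimal sketch of a disassembler in plain JavaScript. It is not how Remix or solc produce their output; the OPCODES table and the disassemble function are illustrative names, and the table is deliberately partial, covering only the opcodes that appear in the first bytes of the sample bytecode above.

```javascript
// Partial opcode table (an assumption for illustration): only the opcodes that
// occur in the first bytes of the sample bytecode. Others print as UNKNOWN.
const OPCODES = {
  0x15: "ISZERO",
  0x34: "CALLVALUE",
  0x50: "POP",
  0x52: "MSTORE",
  0x57: "JUMPI",
  0x5b: "JUMPDEST",
  0x80: "DUP1",
  0xfd: "REVERT",
};

// Turn a hex string of bytecode into a list of readable instructions.
function disassemble(hex, maxInstructions = 16) {
  const code = hex.startsWith("0x") ? hex.slice(2) : hex;
  const out = [];
  let i = 0;
  while (i < code.length && out.length < maxInstructions) {
    const op = parseInt(code.slice(i, i + 2), 16);
    i += 2;
    if (op >= 0x60 && op <= 0x7f) {
      // PUSH1..PUSH32 carry 1..32 immediate bytes right after the opcode.
      const n = op - 0x5f;
      out.push(`PUSH${n} 0x${code.slice(i, i + 2 * n)}`);
      i += 2 * n;
    } else {
      out.push(OPCODES[op] ?? `UNKNOWN(0x${op.toString(16).padStart(2, "0")})`);
    }
  }
  return out;
}

// First 17 bytes of the sample bytecode shown earlier.
const prefix = "6080604052348015600f57600080fd5b50";
console.log(disassemble(prefix).join("\n"));
```

The first lines printed are PUSH1 0x80, PUSH1 0x40, and MSTORE, which is the same information the ‘opcodes’ field gives you in readable form.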

Contract ABI

In computer science, an ABI (Application Binary Interface) is an interface between two program modules, often between an operating system and user programs. In Ethereum, Contract ABI is an interface that defines a standard scheme for calling functions in a smart contract and getting data back. Contract ABI is designed for external use to enable application-to-contract and contract-to-contract interaction. For example, if you want to call a smart contract function from your dApp, you call it via the Contract ABI. Contract ABI is represented in JSON format like below.

Contract ABI defines function names and argument data types. It is used to encode contract calls for the EVM and to read data out of transactions. There is a precise specification of how to encode and decode with the Contract ABI. I will use the function below as an encoding example.

function withdraw(uint withdraw_amount) public {}

First, the signature of the “withdraw” function, withdraw(uint256), is hashed with keccak256, and the first 4 bytes of the hash are used as the selector. The selector identifies which function to call.

// Hash the function signature with keccak256.
> web3.utils.sha3("withdraw(uint256)")
0x2e1a7d4d13322e7b96f9a57413e1525c250fb7a9021cf91d1540d5b69f16a49f
// Extract the first 4 bytes (the selector).
0x2e1a7d4d

Next, the argument is encoded in hexadecimal, left-padded to 32 bytes, and appended to the selector of the “withdraw” function.

// Convert from ETH to Wei.
> withdraw_amount = web3.utils.toWei("0.01", "ether");
10000000000000000
// Convert Wei to hexadecimal.
> withdraw_amount_hex = web3.utils.toHex(withdraw_amount);
0x2386f26fc10000
// Left-pad to 32 bytes (64 hex characters).
> withdraw_amount_padleft = web3.utils.leftPad(withdraw_amount_hex, 64);
0x000000000000000000000000000000000000000000000000002386f26fc10000
// Append to the selector (encoded function), dropping the argument's "0x" prefix.
> "0x2e1a7d4d" + withdraw_amount_padleft.slice(2)
// Final encoded ABI.
0x2e1a7d4d000000000000000000000000000000000000000000000000002386f26fc10000
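The same encoding can be sketched without web3.js, using only BigInt. This is a simplified illustration for a single uint256 argument, not a general ABI encoder; encodeUint256 and encodeCall are hypothetical helper names, and the selector is hard-coded from the keccak256 step above rather than computed.

```javascript
// Selector from the hashing step: first 4 bytes of keccak256("withdraw(uint256)").
const WITHDRAW_SELECTOR = "0x2e1a7d4d";

// Encode one uint256 argument as a 32-byte word (64 hex characters, left-padded).
function encodeUint256(value) {
  const n = BigInt(value);
  if (n < 0n || n >= 1n << 256n) {
    throw new RangeError("value does not fit in uint256");
  }
  return n.toString(16).padStart(64, "0");
}

// Call data = 4-byte selector followed by the encoded argument.
function encodeCall(selector, value) {
  return selector + encodeUint256(value);
}

const weiAmount = 10n ** 16n; // 0.01 ether expressed in wei
const callData = encodeCall(WITHDRAW_SELECTOR, weiAmount);
console.log(callData);
```

The result is 74 characters long: "0x", 8 hex characters for the selector, and 64 hex characters for the argument.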

This data invokes the withdraw function with 0.01 ETH (expressed in Wei) as the argument. If you want details about ABI encoding/decoding specifications, please refer to the Contract ABI Specification.
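Reading the data back works in the opposite direction. The sketch below splits call data into its selector and a single uint256 argument; decodeUint256Call is a hypothetical helper that assumes exactly one 32-byte argument, so it is far narrower than the real decoding rules in the specification.

```javascript
// Split call data into the 4-byte selector and one 32-byte uint256 argument.
function decodeUint256Call(callData) {
  const hex = callData.startsWith("0x") ? callData.slice(2) : callData;
  if (hex.length !== 8 + 64) {
    throw new Error("expected a 4-byte selector plus one 32-byte word");
  }
  return {
    selector: "0x" + hex.slice(0, 8),
    value: BigInt("0x" + hex.slice(8)),
  };
}

// Call data for withdraw(uint256) with 0.01 ether (in wei) as the argument.
const decoded = decodeUint256Call(
  "0x2e1a7d4d" + "2386f26fc10000".padStart(64, "0")
);
console.log(decoded.selector, decoded.value.toString());
```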

When interacting with the contract, you can use web3.js like the code below. First, create a contract object with the Contract ABI, and next, create an instance with the EVM bytecode. This code is generated by Remix when compilation succeeds.

Commands with “solc”

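Let’s generate the Contract ABI and EVM bytecode with the ‘solc’ command. solc is one of the most widely used Solidity compilers. Let’s install it with the npm package manager. Note that the npm package exposes the compiler as ‘solcjs’ with a reduced set of options, so the ‘solc’ commands below assume the native solc binary is available on your PATH.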

Install

$ npm install -g solc

We will use this sample source code. The filename is SampleToken.sol.

Output EVM Bytecode

$ solc --bin SampleToken.sol
> ======= SampleToken.sol:SampleContract =======
Binary:
6080604052348015600f57600080fd5b5060878061001e6000396000f3fe6080604052348015600f57600080fd5b506004361060285760003560e01c8063037a417c14602d575b600080fd5b60336049565b6040518082815260200191505060405180910390f35b6000600190509056fea265627a7a7230582050d33093e20eb388eec760ca84ba30ec42dadbdeb8edf5cd8b261e89b8d4279264736f6c634300050a0032

Output Contract ABI

$ solc --abi SampleToken.sol
> ======= SampleToken.sol:SampleContract =======
Contract JSON ABI
[{"constant":true,"inputs":[],"name":"testFunc","outputs":[{"name":"","type":"int256"}],"payable":false,"stateMutability":"pure","type":"function"}]
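The ABI JSON carries everything needed to rebuild the function signature that gets hashed into a selector. As a rough sketch, the snippet below parses the output above and derives the signature string for each function; functionSignatures is a hypothetical helper, and it only handles simple argument types (tuple arguments would need extra handling).

```javascript
// ABI JSON exactly as printed by the compiler for the sample contract.
const abi = JSON.parse(
  '[{"constant":true,"inputs":[],"name":"testFunc","outputs":[{"name":"","type":"int256"}],"payable":false,"stateMutability":"pure","type":"function"}]'
);

// Build "name(type1,type2,...)" for every function entry in an ABI.
function functionSignatures(abiEntries) {
  return abiEntries
    .filter((entry) => entry.type === "function")
    .map(
      (entry) =>
        `${entry.name}(${entry.inputs.map((input) => input.type).join(",")})`
    );
}

console.log(functionSignatures(abi));
```

For a function such as withdraw(uint withdraw_amount), the ABI lists the argument type as uint256, so the same helper yields "withdraw(uint256)", the string that was hashed earlier.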

If you want to write the output to a specific directory, you can set it with the “-o” option (you cannot set the output file name).

$ mkdir build
$ solc --abi -o build SampleToken.sol

When you re-compile, set the “--overwrite” option.

$ solc --abi -o build --overwrite SampleToken.sol
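Once the files are in the build directory, a script can load them. The sketch below assumes solc named the files after the contract (SampleContract.abi, plus SampleContract.bin if you also passed --bin); loadArtifacts is a hypothetical helper, and the contract name is taken from the compiler output shown earlier.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Load the ABI (required) and the bytecode (optional) written by "solc -o".
function loadArtifacts(buildDir, contractName) {
  const abiPath = path.join(buildDir, `${contractName}.abi`);
  const binPath = path.join(buildDir, `${contractName}.bin`);
  const abi = JSON.parse(fs.readFileSync(abiPath, "utf8"));
  // The .bin file only exists when the compiler was run with --bin.
  const bytecode = fs.existsSync(binPath)
    ? "0x" + fs.readFileSync(binPath, "utf8").trim()
    : null;
  return { abi, bytecode };
}

// Only runs if the build directory from the commands above is present.
if (fs.existsSync(path.join("build", "SampleContract.abi"))) {
  const { abi, bytecode } = loadArtifacts("build", "SampleContract");
  console.log(abi.map((entry) => entry.name));
  console.log(bytecode === null ? "no .bin file found" : bytecode);
}
```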

Get Help

$ solc --help

Conclusion

This article explained Contract ABI and EVM bytecode. EVM bytecode is the code compiled from a high-level programming language, and Contract ABI is the interface used to interact with that bytecode. Both can be generated with the ‘solc’ command line, which is widely used in Ethereum development. I hope the article helps you understand Contract ABI and EVM bytecode in practice. If you have any questions or comments, they are always welcome. Thank you.

Eiki Takeuchi

I'm Eiki Takeuchi. I work as a Scrum Master/Agile coach. I regularly write about Scrum, Agile, and leadership on Medium. X: https://x.com/eiki234