Understanding Bytecode on Ethereum

Shane Fontaine
Published in Authereum
Sep 29, 2019


tl;dr — There are only two types of bytecode on Ethereum but five different names to describe them.

[Figure: tl;dr of the bytecode differences]

Ethereum developers have no doubt dealt with bytecode when writing smart contracts. Fortunately for most, this code is consumed only by the EVM, so a human rarely needs to understand it completely. There is, however, a small cadre of individuals who understand this level of detail and can use it to their advantage. One such example is Authereum’s identification system for contracts created by the Authereum proxy factory.

The nomenclature surrounding bytecode in the Ethereum community only adds to the complexity. There are different names associated with different types of bytecode. This post does not attempt to describe the bytecode used in the EVM in detail, but rather to explain the terminology used by members of the Ethereum community when referring to the bytecode of a smart contract.

The terms described in this post are as follows:

  • Creation Bytecode
  • Runtime Bytecode
  • Bytecode
  • Deployed Bytecode
  • Init Code

Creation Bytecode

This is the code that most people are referring to when they say bytecode. It is the code that generates the runtime bytecode, and it includes the constructor logic and constructor parameters of a smart contract. The creation bytecode is equivalent to the input data of the transaction that creates a contract, provided the sole purpose of the transaction is to create the contract.
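To make the "constructor logic plus constructor parameters" structure concrete, here is a minimal Python sketch that splits a deployment transaction's input data into the compiled creation code and the ABI-encoded constructor arguments appended to it. The function names and hex values are hypothetical, and the sketch assumes the contract was deployed directly by a transaction using exactly the bytecode your compiler produced.

```python
from typing import Tuple


def strip_0x(hex_string: str) -> str:
    """Lowercase a hex string and drop a leading 0x prefix, if present."""
    hex_string = hex_string.lower()
    return hex_string[2:] if hex_string.startswith("0x") else hex_string


def split_creation_input(tx_input: str, compiled_creation_code: str) -> Tuple[str, str]:
    """Split deployment input data into (compiled creation code, constructor arguments)."""
    data = strip_0x(tx_input)
    code = strip_0x(compiled_creation_code)
    if not data.startswith(code):
        raise ValueError("input data does not begin with the compiled creation code")
    # Whatever follows the compiled code is the ABI-encoded constructor arguments.
    return "0x" + code, "0x" + data[len(code):]


# Hypothetical values: a truncated stand-in for compiler output, and a single
# uint256 constructor argument (42) ABI-encoded as one 32-byte word.
COMPILED = "0x608060405234801561001057600080fd5b50"
ENCODED_ARGS = "00" * 31 + "2a"
TX_INPUT = COMPILED + ENCODED_ARGS

code, args = split_creation_input(TX_INPUT, COMPILED)
print(code)
print(args)
```

If the contract takes no constructor arguments, the second element is simply "0x".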

When you compile a contract, the creation bytecode is generated for you. A Truffle-generated artifact (the JSON file that also contains the ABI) refers to the creation bytecode as bytecode. This is also the bytecode that is shown when clicking “compilation details” for a contract on Remix.

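The creation bytecode can also be retrieved on-chain by using Solidity’s type information. The Solidity code to retrieve it is type(ContractName).creationCode. Note that this value does not include constructor arguments, which have to be ABI-encoded and appended separately.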

Creation bytecode can be retrieved off-chain with the eth_getTransactionByHash JSON-RPC call: it is the input field of the transaction that deployed the contract.
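As a rough illustration of the off-chain route, the Python sketch below builds the eth_getTransactionByHash request body and pulls the creation bytecode out of a response. The helper names, the transaction hash, and the sample response are all made up for the example. The sketch does not contact a node; it relies on a contract-creation transaction having a null to field.

```python
import json
from typing import Optional


def build_tx_request(tx_hash: str, request_id: int = 1) -> str:
    """Return the JSON-RPC request body for eth_getTransactionByHash."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getTransactionByHash",
        "params": [tx_hash],
        "id": request_id,
    })


def creation_bytecode_from_response(response_body: str) -> Optional[str]:
    """Return the input data if the transaction created a contract, else None."""
    tx = json.loads(response_body).get("result")
    if tx is None:
        return None  # the node does not know this transaction hash
    if tx.get("to") is not None:
        return None  # an ordinary call or transfer, not a contract creation
    return tx["input"]


# Hypothetical hash and a trimmed, made-up response for illustration only.
TX_HASH = "0x" + "ab" * 32
SAMPLE_RESPONSE = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"hash": TX_HASH, "to": None, "input": "0x608060405234801561001057600080fd5b50"},
})

print(build_tx_request(TX_HASH))
print(creation_bytecode_from_response(SAMPLE_RESPONSE))
```

The request body can be POSTed to any node's JSON-RPC endpoint. If the contract was created by another contract, such as a factory, the transaction's to field is set and its input is not the creation bytecode, which is why the sketch returns None in that case.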

Runtime Bytecode

This is the code that is stored on-chain and that describes a smart contract. It does not include the constructor logic or constructor parameters of a contract, as they are only used while the contract is being created and are not part of the code that is stored.

The runtime bytecode for a contract at address a can be retrieved on-chain by using an assembly block and calling extcodecopy, which copies the code at a into memory (besides the address, it takes a memory destination, a code offset, and a length). The hash of the runtime bytecode is returned by extcodehash(a). The extcodehash opcode was introduced with EIP-1052 and included in the Constantinople hard fork.

For a contract that is known at compile time, this code can also be retrieved on-chain using type(ContractName).runtimeCode.

Finally, this code can be retrieved off-chain with the eth_getCode JSON-RPC call.
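Here is a small Python sketch of the eth_getCode route, again without touching the network. The helper names, the address, and the sample responses are hypothetical. The call takes an address and a block tag, and returns "0x" for an account that has no code.

```python
import json


def build_get_code_request(address: str, block: str = "latest", request_id: int = 1) -> str:
    """Return the JSON-RPC request body for eth_getCode."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getCode",
        "params": [address, block],
        "id": request_id,
    })


def runtime_bytecode_from_response(response_body: str) -> bytes:
    """Decode the hex result; an account without code yields empty bytes."""
    result = json.loads(response_body)["result"]
    return bytes.fromhex(result[2:])  # drop the 0x prefix before decoding


# Hypothetical address and made-up responses for illustration only.
ADDRESS = "0x" + "12" * 20
CONTRACT_RESPONSE = json.dumps({"jsonrpc": "2.0", "id": 1, "result": "0x6080604052"})
NO_CODE_RESPONSE = json.dumps({"jsonrpc": "2.0", "id": 1, "result": "0x"})

print(build_get_code_request(ADDRESS))
print(len(runtime_bytecode_from_response(CONTRACT_RESPONSE)))
print(len(runtime_bytecode_from_response(NO_CODE_RESPONSE)))
```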

Bytecode

This should be used as the umbrella term that encompasses both runtime bytecode and creation bytecode, but it is more commonly used to describe the creation bytecode.

Deployed Bytecode

This term appears in Truffle-generated artifacts, as the deployedBytecode field, and refers to a contract’s runtime bytecode. I have not seen it used outside of these files.
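To tie the Truffle field names back to the two explicit terms, this Python sketch reads an artifact and relabels its bytecode and deployedBytecode fields. The artifact is a made-up, heavily trimmed stand-in (real artifacts contain many more fields), and the function name is hypothetical.

```python
import json

# Hypothetical, trimmed artifact; the hex values are placeholders.
ARTIFACT_JSON = """
{
  "contractName": "Example",
  "abi": [],
  "bytecode": "0x608060405234801561001057600080fd5b50",
  "deployedBytecode": "0x6080604052600080fd"
}
"""


def bytecode_by_explicit_name(artifact_json: str) -> dict:
    """Map a Truffle artifact's field names onto the two explicit terms."""
    artifact = json.loads(artifact_json)
    return {
        "creation bytecode": artifact["bytecode"],
        "runtime bytecode": artifact["deployedBytecode"],
    }


named = bytecode_by_explicit_name(ARTIFACT_JSON)
for term, code in named.items():
    print(term, code)
```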

Init Code

This code is the same as the creation bytecode. It is the code that creates the bytecode that is stored on-chain.

This term is commonly used in articles referring to the bytecode needed when using the create2 opcode.

Conclusion

It is my opinion that the only terms that should be used are runtime bytecode and creation bytecode, as they explicitly describe what the code is. I believe bytecode should be an umbrella term that includes both of these aforementioned terms.
