Deep Dive into abi.encode: Types, Padding, and Disassembly

scourgedev.eth
5 min read · Oct 18, 2023

Table of Contents

  1. Introduction
  2. ABI Encoding
  3. Revert Scenarios
  4. Final Words

Introduction

ABI encoding serializes Solidity values into bytes: static types, user-defined types, and dynamic types such as strings and arrays. Contracts rely on this format to call one another and to exchange data with off-chain clients. The serialization follows the Ethereum Contract ABI Specification, so a value of a given type always produces the same byte layout, whichever tool encodes it.

ABI Encoding

abi.encode has the signature abi.encode(...) returns (bytes memory): it takes an arbitrary number of arguments of various types and returns the encoded data as a bytes value in memory.

Encoding of Static Types

abi.encode handles the static types in Solidity, such as address, uint256, or bytes32, and encodes each value as a single 32-byte word. The side on which a shorter value is padded depends on the Solidity type being encoded. For instance:

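  • address and unsigned integer types narrower than 32 bytes, such as uint8, are zero-padded on the left (negative signed integers are padded with 0xff bytes instead, which preserves their two's complement value). For example:
abi.encode(address(0xE592427A0AEce92De3Edee1F18E0157C05861564))
= 0x000000000000000000000000e592427a0aece92de3edee1f18e0157c05861564
  • Fixed-size byte values (like bytes4, bytes8, bytes12, etc.) are zero-padded on the right. The explicit bytes4 conversion matters: a bare hex literal is treated as an integer and would be left-padded. For example:
abi.encode(bytes4(0xabcdef12))
= 0xabcdef1200000000000000000000000000000000000000000000000000000000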
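
The two padding rules can be reproduced outside the EVM. The following is a minimal Python sketch, not the compiler's implementation; the helper names encode_uint, encode_address, and encode_fixed_bytes are made up for this illustration and only cover the unsigned, address, and bytesN cases discussed above.

```python
WORD = 32  # the ABI encodes every static value as one 32-byte word


def encode_uint(value: int) -> bytes:
    """Unsigned integers (uint8 ... uint256): big-endian, zero-padded on the left."""
    return value.to_bytes(WORD, "big")


def encode_address(addr_hex: str) -> bytes:
    """A 20-byte address: zero-padded on the left, like an unsigned integer."""
    raw = bytes.fromhex(addr_hex[2:] if addr_hex.startswith("0x") else addr_hex)
    if len(raw) != 20:
        raise ValueError("an address is exactly 20 bytes")
    return raw.rjust(WORD, b"\x00")


def encode_fixed_bytes(value: bytes) -> bytes:
    """bytes1 ... bytes32: zero-padded on the right."""
    if not 1 <= len(value) <= WORD:
        raise ValueError("bytesN holds between 1 and 32 bytes")
    return value.ljust(WORD, b"\x00")


# Same inputs as the two examples in the list above, plus uint8(100).
print(encode_address("0xe592427a0aece92de3edee1f18e0157c05861564").hex())
print(encode_fixed_bytes(bytes.fromhex("abcdef12")).hex())
print(encode_uint(100).hex())
```

The first two printed lines should match the encodings shown in the list, minus the 0x prefix.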

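Some Solidity types are not part of the ABI itself and are mapped to ABI types instead: address payable and contract types are encoded as address, an enum as uint8, a user-defined value type as its underlying type, and a struct as a tuple.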

Chart showing complex type representation from Solidity Docs

Encoding Dynamic Types

Dynamic types such as strings, bytes, and arrays require a more nuanced approach for encoding due to their variable size. The encoding format is as follows:

  1. Offset: A 32-byte word in the argument's own slot gives the byte index, counted from the start of the encoded data, at which the argument's data begins. With a single argument, this is the first word.
  2. Length: The 32-byte word at that offset gives the data's length, whose meaning depends on the type: the number of bytes for strings and bytes, and the number of elements for dynamic arrays.
  3. Data: The following 32-byte words hold the actual data. Strings and bytes are packed contiguously and right-padded to a multiple of 32 bytes, while each array element occupies its own word and follows the padding rule of its element type (for example, uint8 elements are left-padded).

Strings are encoded as UTF-8, so an ASCII character takes one byte while other characters take two to four bytes. The length word therefore counts bytes, not characters. Dynamic arrays follow the same offset, length, data pattern, with each element of a static type padded to 32 bytes.
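
The offset, length, data layout can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the compiler's code: it only handles a single argument (so the offset is always 0x20), and the helper names pad_right, encode_string, and encode_uint_array are hypothetical.

```python
WORD = 32


def pad_right(data: bytes) -> bytes:
    """Zero-pad on the right up to the next multiple of 32 bytes."""
    padded_len = -(-len(data) // WORD) * WORD  # ceiling division
    return data.ljust(padded_len, b"\x00")


def encode_string(text: str) -> bytes:
    """Single string argument: offset word, byte length, right-padded data."""
    data = text.encode("utf-8")
    offset = WORD.to_bytes(WORD, "big")        # data starts right after the offset word
    length = len(data).to_bytes(WORD, "big")   # counts bytes, not characters
    return offset + length + pad_right(data)


def encode_uint_array(values: list) -> bytes:
    """Single dynamic array of unsigned integers: offset, element count, one word per element."""
    offset = WORD.to_bytes(WORD, "big")
    length = len(values).to_bytes(WORD, "big")
    elements = b"".join(v.to_bytes(WORD, "big") for v in values)  # each left-padded
    return offset + length + elements


def words(encoded: bytes) -> list:
    """Split an encoding into 32-byte words, as hex strings, for inspection."""
    return [encoded[i:i + WORD].hex() for i in range(0, len(encoded), WORD)]


# "h\u00e9llo" has 5 characters but 6 UTF-8 bytes, so the length word is 6.
for word in words(encode_string("h\u00e9llo")):
    print(word)

# A three-element array: offset, length 3, then one left-padded word per element.
for word in words(encode_uint_array([1, 2, 3])):
    print(word)
```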

Below is a demonstration of dynamic type encoding by running abi.encode("Hello World"):

// The function above will return the following raw bytes value.
0x0000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000000b48656c6c6f20576f726c64000000000000000000000000000000000000000000

// We can split this into words that are 32 bytes long to get:
0x0000000000000000000000000000000000000000000000000000000000000020 // offset
000000000000000000000000000000000000000000000000000000000000000b // length
48656c6c6f20576f726c64000000000000000000000000000000000000000000 // string

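As shown above, the offset is the byte position at which the string's encoding begins (0x20, right after the offset word itself). If other arguments were encoded together with the string, each would add a word to the front of the encoding and the offset would increase accordingly. The length is the number of bytes in the string (11, which equals the number of characters here because every character is ASCII), and the final word is "Hello World" encoded as UTF-8 and right-padded.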
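
To see the offset grow, the sketch below encodes a uint256 together with a string. It is a simplified Python model that assumes only two argument kinds (Python int for uint256, Python str for string); encode_args and its helper are names invented for this example, not a real library API.

```python
WORD = 32


def _string_tail(text: str) -> bytes:
    """Length word followed by the UTF-8 data, right-padded to a 32-byte multiple."""
    data = text.encode("utf-8")
    padded_len = -(-len(data) // WORD) * WORD
    return len(data).to_bytes(WORD, "big") + data.ljust(padded_len, b"\x00")


def encode_args(*args) -> bytes:
    """Encode ints (as uint256) and strs (as string) in head/tail form."""
    head_size = WORD * len(args)  # every argument owns one word in the head
    head, tail = b"", b""
    for arg in args:
        if isinstance(arg, int):
            head += arg.to_bytes(WORD, "big")  # static: the value sits in the head
        elif isinstance(arg, str):
            # dynamic: the head holds the offset of the data from the start
            head += (head_size + len(tail)).to_bytes(WORD, "big")
            tail += _string_tail(arg)
        else:
            raise TypeError("this sketch only supports int and str")
    return head + tail


encoded = encode_args(42, "Hello World")
for i in range(0, len(encoded), WORD):
    print(encoded[i:i + WORD].hex())
```

With the extra uint256 in front, the head is two words long, so the string's offset becomes 0x40 instead of 0x20. Calling encode_args("Hello World") on its own should reproduce the three words from the example above.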

What happens with an empty string, empty bytes, or an empty array? The offset is unchanged, the length word is 0, and no data words follow. One exception existed before Solidity 0.8.15: encoding an empty string read from storage appended a superfluous, all-zero 32-byte data word. Version 0.8.15 fixed this.
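
A quick way to check such layouts by hand is to split an encoding into words and read the offset and length back. The Python sketch below assumes a single dynamic argument and a well-formed input; split_words and describe_single_dynamic are hypothetical helper names.

```python
WORD = 32


def split_words(encoded_hex: str) -> list:
    """Split a hex-encoded blob into 32-byte words (returned as hex strings)."""
    raw = bytes.fromhex(encoded_hex[2:] if encoded_hex.startswith("0x") else encoded_hex)
    if len(raw) % WORD != 0:
        raise ValueError("ABI-encoded data is a whole number of 32-byte words")
    return [raw[i:i + WORD].hex() for i in range(0, len(raw), WORD)]


def describe_single_dynamic(encoded_hex: str) -> tuple:
    """Return (offset, length, data_words) for one encoded string, bytes, or array."""
    words = split_words(encoded_hex)
    offset = int(words[0], 16)               # where the length word lives
    length = int(words[offset // WORD], 16)  # bytes or element count
    data_words = words[offset // WORD + 1:]  # everything after the length word
    return offset, length, data_words


# The "Hello World" encoding from earlier in the article.
HELLO = "0x" + "00" * 31 + "20" + "00" * 31 + "0b" + "48656c6c6f20576f726c64" + "00" * 21
# An empty string: just an offset word and a zero length word.
EMPTY = "0x" + "00" * 31 + "20" + "00" * 32

print(describe_single_dynamic(HELLO))
print(describe_single_dynamic(EMPTY))
```

For the empty value the list of data words is empty, which is the two-word layout described above.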

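Note: Mappings cannot be ABI-encoded, and neither mappings nor storage references can be used as parameters of a contract's public or external functions. Values that live in storage, such as a storage string, can still be passed to abi.encode, which encodes their contents. The formal specification of the encoding can be found in the Contract ABI Specification section of the Solidity documentation.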

Revert Scenarios

To find out when abi.encode can revert, we can use rattle to lift a contract's raw bytecode into a simplified SSA (infinite-register) form.

The first contract to disassemble from bytecode is for static type encoding:

pragma solidity 0.8.17;

contract Test {
    function testSimpleEncode() external pure {
        // Encode a single static value; the result is discarded on purpose.
        abi.encode(uint8(100));
    }
}

Disassembled abi.encode for static typing

The diagram shows no revert paths for this call. The same holds for the other static types, because abi.encode treats them all the same way: each value is padded to a 32-byte word and written to memory.

The second contract to disassemble from bytecode covers dynamic type encoding, using a string as the example:

pragma solidity 0.8.17;

contract Test {
    function testSimpleEncodeString() external pure {
        // Encode a single dynamic value (a string literal); the result is discarded.
        abi.encode("Solidity");
    }
}

Disassembled abi.encode for dynamic typing

As the diagram shows, no revert paths are present here either. This is consistent across dynamic types, because abi.encode handles them in a similar way. The encoding of bytes differs slightly, however, and arrays are processed in a loop.

Final Words

Thank you for taking the time to explore the types that abi.encode supports, how each of them is padded and encoded, and how to look for revert scenarios by disassembling bytecode.

This article is part of a larger series diving deep into ABI encoding and decoding. The next article in the series explores the disassembled form of abi.encodePacked, its use cases, and security considerations, and is available [COMING SOON]. See the rest of the articles here.

scourgedev.eth

Team Lead | Blockchain Full Stack Developer | Smart Contract Developer | Currently Diving Deep Into Smart Contract Security