Audius Variant Scanner: Scanning Storage Collisions between Ethereum Contracts

Published in ChainLight Blog & Research, Mar 21, 2023

Summary

This article briefly describes the Audius hack of July 2022 and a ChainLight project that investigated whether the same class of bug exists elsewhere on the Ethereum network.

Background

What is Audius?

Audius is a decentralized music streaming platform where anyone can upload and listen to music, with token rewards sustaining the ecosystem. Artists who distribute their music through Audius can monetize it in fewer steps and keep a larger share of the revenue than on traditional music distribution platforms.

Audius aims to improve the current problem of profit distribution in the music distribution market, allowing artists to take the profits that intermediaries have traditionally taken and share profits with their fans.

As of March 2023, Audius has more than 6.2 million monthly active users, and about 30% of the 1 billion $AUDIO tokens have been staked based on the token distribution plan.

Audius hack incident

On July 24, 2022, a governance proposal to transfer $18 million worth of $AUDIO tokens to the attacker’s wallet was passed due to a bug in Audius’s governance, staking, and delegation contracts.

The attacker exploited a bug in the governance contract to steal funds that were stored in the contract. They were able to execute an erroneous delegation without changing the token supply, allowing them to use 10T fraudulent $AUDIO tokens to pass a governance proposal. As a result, the attacker successfully transferred $18 million worth of $AUDIO tokens to their own wallet.

The incident was caused by a “storage collision”, a class of bug that can arise in contracts written in Solidity. After the incident, ChainLight investigated whether other variants of storage collisions existed on the Ethereum network. To conduct the experiment, we used BigQuery, a local Geth node, and basic-block-level static analysis of EVM bytecode.

Storage Collision Causes and Solutions

A storage collision occurs when two or more contracts operate on the same storage (for example, a proxy and the implementation it calls via delegatecall) but assume different layouts for the same storage slot. Avoiding storage collisions is important for the correctness and security of smart contracts, because a collision can cause unexpected behavior, financial loss, or other negative outcomes.

EVM-level static analysis means analyzing Ethereum smart contracts at the bytecode level rather than at the source level. By analyzing EVM bytecode, ChainLight was able to identify potential vulnerabilities and help avoid future incidents.

EVM-level static analysis also helps ensure correctness and security when deploying smart contracts on the blockchain. By detecting potential issues early in the development process, developers can reduce the risk of serious bugs or security vulnerabilities.

As described above, a storage collision arises in the following scenario. For example, if you use the code below in the Proxy-Implementation pattern, _implementation and _owner both occupy storage slot 0 and therefore overlap.

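contract Proxy {
  address _implementation; // slot 0
  fallback() external payable {
    // Runs the implementation's code against the Proxy's own storage.
    (bool ok, ) = _implementation.delegatecall(msg.data);
    require(ok);
  }
}

contract Implementation {
  address _owner; // slot 0: same slot as Proxy._implementation
  mapping(address => uint256) _balances; // slot 1
  uint256 _supply; // slot 2
  ...
}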

In the code above, storage is allocated as below. (Reference: OpenZeppelin)

|Proxy                     |Implementation           |
|--------------------------|-------------------------|
|address _implementation   |address _owner           | <= [!]
|...                       |mapping _balances        |
|                          |uint256 _supply          |
|                          |...                      |

With the storage layout above, writing to _owner through the Proxy also overwrites the Proxy’s _implementation, which breaks the contract.
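The overlap can be reproduced outside the EVM with a toy model. The following is a minimal Python sketch, not real contract code: storage is a plain dictionary, `proxy_delegatecall` is a hypothetical stand-in for DELEGATECALL, and both addresses are made up for illustration.

```python
class Storage(dict):
    """A contract's storage: slot index -> 256-bit word (unset slots read as 0)."""

    def load(self, slot):
        return self.get(slot, 0)


def implementation_set_owner(storage, new_owner):
    # Implementation logic: `_owner` is the first variable, so it is written
    # to slot 0 of whatever storage the code happens to run against.
    storage[0] = new_owner


def proxy_delegatecall(proxy_storage, code, *args):
    # Model of DELEGATECALL: the callee's code runs on the caller's storage.
    code(proxy_storage, *args)


IMPL_ADDR = 0x1111111111111111111111111111111111111111  # hypothetical address
NEW_OWNER = 0x2222222222222222222222222222222222222222  # hypothetical address

proxy_storage = Storage({0: IMPL_ADDR})  # Proxy._implementation lives in slot 0
proxy_delegatecall(proxy_storage, implementation_set_owner, NEW_OWNER)

implementation_after = proxy_storage.load(0)
print(hex(implementation_after))  # the Proxy now "points at" NEW_OWNER
```

After the call, the value the Proxy would read as its implementation address is the new owner, which is why every later call through the Proxy breaks.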

A prominent example of a storage collision occurs when the Implementation contract uses Initializable from old versions of OpenZeppelin contracts (before v4.5.0).

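contract Initializable {
  bool _initialized;  // slot 0, byte 0
  bool _initializing; // slot 0, byte 1
  // Guard body omitted, as in the original snippet.
  modifier initializer() { _; }
}

contract MyContract is Initializable {
  function init(uint256 param1, ...) public initializer { ... }
}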

In the code above, _initialized and _initializing occupy the two lowest-order bytes of the first slot. If MyContract is connected to the Proxy contract, these variables overlap with the low-order bytes of the address that the Proxy keeps in slot 0 (_implementation in the example above, or _admin in a proxy that stores its admin address first). Therefore, as soon as the Proxy and Implementation are connected, the byte occupied by _initializing already holds a non-zero value whenever the corresponding byte of that address is non-zero, so a call always passes the initializer modifier.
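To make the byte arithmetic concrete, here is a minimal Python sketch that reads the two packed flags out of a slot-0 word. The guard condition in `initializer_passes` is a simplified assumption based on the description above (pass while initializing, or when not yet initialized), and the address is hypothetical.

```python
def unpack_initializable_flags(slot0_word):
    """Read the two packed bools the way Solidity packs them into slot 0."""
    initialized = ((slot0_word >> 0) & 0xFF) != 0   # byte 0 (lowest-order byte)
    initializing = ((slot0_word >> 8) & 0xFF) != 0  # byte 1
    return initialized, initializing


def initializer_passes(slot0_word):
    # Simplified guard (assumption): allow the call while `_initializing`
    # is set, or when `_initialized` is still false.
    initialized, initializing = unpack_initializable_flags(slot0_word)
    return initializing or not initialized


# Hypothetical 20-byte address stored by the proxy in slot 0.
PROXY_SLOT0_ADDRESS = 0x00112233445566778899AABBCCDDEEFF00112233

flags_under_collision = unpack_initializable_flags(PROXY_SLOT0_ADDRESS)
print(flags_under_collision, initializer_passes(PROXY_SLOT0_ADDRESS))

# Without a collision, a contract that finished initializing stores 0x01
# (initialized = true, initializing = false) and the guard rejects a second call.
print(initializer_passes(0x01))
```

With this address, byte 1 is 0x22, so `_initializing` reads as true and the guard lets every call through. An address whose second-lowest byte happens to be zero would behave differently, which is why the outcome depends on the concrete value stored in the slot.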

This vulnerability pattern has been found in several Ethereum projects, including Audius, and has led to large losses of locked funds (TVL). You therefore need to pay special attention to the layout of storage variables to avoid storage collisions.

The best practice for preventing this vulnerability is to place storage variables that are independent of the business logic, such as the Proxy’s _implementation variable, in slots that are not normally used. If the proxy’s data is stored in a slot that cannot collide with the contract targeted by delegatecall, the storage collision is prevented at the source. (Example: a slot index such as keccak256("my_id") - 1 is, in practice, not used by Solidity’s storage layout.)

Although Solidity 0.8.x does not support choosing an arbitrary slot for a storage variable without assembly, using OpenZeppelin’s EIP-1967 proxy implementation reduces the effort required.
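As an illustration of such an unused slot, the sketch below derives the EIP-1967 implementation slot, defined as keccak256("eip1967.proxy.implementation") - 1. Python's hashlib.sha3_256 uses different padding from Ethereum's Keccak-256, so a compact pure-Python Keccak-256 is included here; it is written for readability rather than speed, and the function names are our own.

```python
MASK64 = (1 << 64) - 1


def _rol64(value, shift):
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 array of 64-bit lanes."""
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated by an 8-bit LFSR)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 as used by the EVM (domain byte 0x01, rate 136 bytes)."""
    rate = 136
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01  # first padding bit
    padded[-1] ^= 0x80         # last padding bit
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        lanes = [[int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = _keccak_f1600(lanes)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return bytes(state[:32])


def unused_slot(label: str) -> int:
    """Slot index keccak256(label) - 1, the scheme described in the text."""
    return int.from_bytes(keccak256(label.encode()), "big") - 1


implementation_slot = unused_slot("eip1967.proxy.implementation")
print(hex(implementation_slot))
```

Subtracting 1 means the slot has no known hash preimage, so it cannot coincide with a slot that Solidity derives by hashing for a mapping or dynamic array.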

Furthermore, static analysis tools such as Slither provide upgradeability checks. Given the first version of the contract, the proxy, and the upgraded contract, they report whether using the proxy or performing the upgrade risks a storage collision.

Experiment: On-Chain Scanning

ChainLight examined contracts across the chain to see whether other variants of this bug are present on the Ethereum mainnet. The experiment followed these steps:

1. Find DELEGATECALL contract pairs (BigQuery)
2. Analyze the storage layout of each contract (EVM opcode analysis)
3. Detect storage collisions

Find `DELEGATECALL` contract pairs

How can we scan the entire blockchain? First, we need to know which contracts to check. To find them, we examined the entire history of internal transactions that contain DELEGATECALLs to other contracts. After collecting the delegatecall pairs, we can scan both contracts of each pair and compare their storage layouts.

Google’s BigQuery allows you to query data related to the Ethereum chain using SQL queries. By querying all DELEGATECALL transactions that happened up to this point¹, we got 6.94 GB of address pairs².

SELECT from_address, to_address FROM `bigquery-public-data.crypto_ethereum.traces`
WHERE call_type IN ('delegatecall', 'callcode')
GROUP BY from_address, to_address

To download results larger than 1 GB from BigQuery, you must export the results to Google Cloud Storage. You need to set up a payment method, but if you clear the results after downloading, the actual charge will be less than $1.

1: As of Aug 1, 2022.
2: Google BigQuery has provided Ethereum public datasets since 2018. The dataset lets you query various on-chain information, such as the balance of each address and token transactions. In this experiment, we used the “traces” table, which contains the internal transactions of each transaction.
N.B. BigQuery is free for up to 1 TB of data read by queries per month. After entering a query, wait a moment and the estimated amount of data to be processed appears in the upper right corner, without actually running the query.
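Once exported, the address pairs need to be loaded for the next stage. The following is a minimal Python sketch, assuming a CSV export with the two columns selected by the query above; the grouping by caller, the skipping of rows with an empty address, and the sample addresses are illustrative assumptions rather than part of the original pipeline.

```python
import csv
import io
from collections import defaultdict


def load_delegatecall_pairs(csv_file):
    """Group (from_address, to_address) rows as caller -> set of delegatecall targets."""
    targets = defaultdict(set)
    for row in csv.DictReader(csv_file):
        caller = (row["from_address"] or "").lower()
        callee = (row["to_address"] or "").lower()
        if caller and callee:  # assumption: rows with a missing address are skipped
            targets[caller].add(callee)
    return targets


# Hypothetical addresses standing in for one proxy and two implementations.
PROXY = "0x" + "aa" * 20
IMPL_V1 = "0x" + "bb" * 20
IMPL_V2 = "0x" + "cc" * 20

sample_export = io.StringIO(
    "from_address,to_address\n"
    f"{PROXY},{IMPL_V1}\n"
    f"{PROXY},{IMPL_V2}\n"
    f"{IMPL_V1},\n"
)

pairs = load_delegatecall_pairs(sample_export)
for caller, callees in pairs.items():
    print(caller, sorted(callees))
```

Grouping by caller keeps every implementation a proxy has ever pointed to, so each one can be compared against the proxy separately.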

Analysis of each contract’s Storage layout

Projects on Ethereum usually publish their source code written in Solidity, but not always. To cover all contracts, we performed the analysis at the EVM bytecode level instead of the Solidity level.

We performed a simple emulation-based analysis on the EVM bytecode to identify the storage slots of all variables used in each contract for the Proxy and Implementation pair.

We modified etk-dasm to get a list of basic blocks and then symbolically executed each block to collect all storage loading instructions inside the contract. This way, we were able to find storage variables in fixed slots.
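For readers unfamiliar with basic blocks, the following Python sketch shows one common way to split EVM bytecode into them. It is not the modified etk-dasm; the splitting rules (a block starts at JUMPDEST and ends after a control-flow terminator) and the sample bytecode are assumptions for illustration.

```python
# STOP, JUMP, JUMPI, RETURN, REVERT, INVALID, SELFDESTRUCT
TERMINATORS = {0x00, 0x56, 0x57, 0xF3, 0xFD, 0xFE, 0xFF}
JUMPDEST = 0x5B


def split_basic_blocks(code: bytes):
    """Return a list of blocks; each block is a list of (offset, opcode, immediate)."""
    blocks, current = [], []
    pc = 0
    while pc < len(code):
        op = code[pc]
        # PUSH1..PUSH32 (0x60..0x7F) carry 1..32 immediate bytes that are data, not code.
        size = op - 0x5F if 0x60 <= op <= 0x7F else 0
        if op == JUMPDEST and current:
            blocks.append(current)  # a jump target starts a new block
            current = []
        current.append((pc, op, code[pc + 1: pc + 1 + size]))
        if op in TERMINATORS:
            blocks.append(current)  # control flow leaves the block here
            current = []
        pc += 1 + size
    if current:
        blocks.append(current)
    return blocks


# PUSH1 0, SLOAD, PUSH1 7, JUMPI, STOP, JUMPDEST, PUSH1 1, SLOAD, STOP
sample_code = bytes.fromhex("600054600757005b60015400")
blocks = split_basic_blocks(sample_code)
for block in blocks:
    print([(offset, hex(op), imm.hex()) for offset, op, imm in block])
```

Skipping PUSH immediates matters: a 0x5B or 0x54 byte inside pushed data must not be mistaken for a JUMPDEST or SLOAD.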

The simplest case looks like this:

PUSH  0x00 # slot
SLOAD      # load storage slot
           # Equivalent to: contract MyContract { uint256 a; }

The opcodes above load a 256-bit integer from storage slot 0x0. To implement smaller data types, Solidity additionally divides the loaded value and then applies a bitwise AND.

PUSH (1 << 8) - 1 # mask: least significant 8 bits
PUSH (1 << 224)   # divisor: shifts out the low 224 bits (256 - 32)

PUSH  0x00        # slot
SLOAD             # load storage slot

DIV            # extracts the most significant 32 bits
AND            # extracts the least significant 8 bits of those
               # Equivalent to reading b in: contract { uint224 a; uint8 b; }
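The SLOAD, DIV, AND sequence above can be tracked with a small stack emulation. The following Python sketch is an assumption about how such tracking might look, not ChainLight's implementation: instructions are given as (mnemonic, operand) tuples, only power-of-two divisors and low-bit masks are recognized, and anything else becomes an unknown value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A value read from storage: which slot, and which bits of it are used."""
    slot: int
    offset_bits: int = 0
    width_bits: int = 256


def track_storage_fields(instructions):
    """Emulate PUSH / SLOAD / DIV / AND and return the final stack."""
    stack = []
    for op, arg in instructions:
        if op == "PUSH":
            stack.append(arg)
        elif op == "SLOAD":
            slot = stack.pop()
            stack.append(Field(slot) if isinstance(slot, int) else None)
        elif op == "DIV":
            a, b = stack.pop(), stack.pop()  # EVM computes a / b, a being the stack top
            if isinstance(a, Field) and isinstance(b, int) and b > 0 and (b & (b - 1)) == 0:
                shift = b.bit_length() - 1   # dividing by 2**shift drops the low bits
                stack.append(Field(a.slot, a.offset_bits + shift, a.width_bits - shift))
            else:
                stack.append(None)
        elif op == "AND":
            a, b = stack.pop(), stack.pop()
            field, mask = (a, b) if isinstance(a, Field) else (b, a)
            is_low_mask = isinstance(mask, int) and mask > 0 and (mask & (mask + 1)) == 0
            if isinstance(field, Field) and is_low_mask:
                width = min(field.width_bits, mask.bit_length())
                stack.append(Field(field.slot, field.offset_bits, width))
            else:
                stack.append(None)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack


packed_read = [
    ("PUSH", (1 << 8) - 1),
    ("PUSH", 1 << 224),
    ("PUSH", 0x00),
    ("SLOAD", None),
    ("DIV", None),
    ("AND", None),
]
result = track_storage_fields(packed_read)
print(result)
```

For the packed read above, the emulation ends with a single field in slot 0 at bit offset 224 (byte 28) that is 8 bits wide, which is exactly the information needed to rebuild a byte-level layout.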

Therefore, we need to record how the storage values are handled in each SLOAD, DIV, and AND operation. Solidity also optimizes constants like 2²²⁴ to reduce the size of code. It is done by splitting large constants into multiple arithmetic operations. Below is an example of optimization in Solidity.

PUSH (1 << 224) - 1: 29 bytes
->
PUSH 1   :   2
PUSH 1   : + 2
PUSH 224 : + 2
SHL      : + 1
SUB      : + 1 = 8 bytes

To fold these five opcodes to one integer, we implemented a basic constant folding: when emulating arithmetic opcodes, the result of the operation is pushed instead.
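A minimal Python sketch of this kind of constant folding is shown below. The opcode table is a small illustrative subset, operand order follows the EVM convention (the first operand is the top of the stack), and the sequence uses SHL to build 1 << 224; the names are our own.

```python
MASK256 = (1 << 256) - 1

# First argument is the top of the stack, as in the EVM.
FOLDABLE = {
    "ADD": lambda a, b: (a + b) & MASK256,
    "SUB": lambda a, b: (a - b) & MASK256,
    "MUL": lambda a, b: (a * b) & MASK256,
    "DIV": lambda a, b: a // b if b else 0,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "EXP": lambda a, b: pow(a, b, 1 << 256),
    "SHL": lambda shift, value: (value << shift) & MASK256 if shift < 256 else 0,
    "SHR": lambda shift, value: value >> shift if shift < 256 else 0,
}


def fold_constants(instructions):
    """Emulate the stack; known integers are folded, anything else becomes None."""
    stack = []
    for op, arg in instructions:
        if op == "PUSH":
            stack.append(arg)
        elif op in FOLDABLE:
            a, b = stack.pop(), stack.pop()
            known = isinstance(a, int) and isinstance(b, int)
            stack.append(FOLDABLE[op](a, b) if known else None)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack


split_constant = [
    ("PUSH", 1),
    ("PUSH", 1),
    ("PUSH", 224),
    ("SHL", None),  # 1 << 224
    ("SUB", None),  # (1 << 224) - 1
]
folded = fold_constants(split_constant)
print(hex(folded[0]))
```

After folding, the five instructions collapse to the single 224-bit mask, so later steps can treat it like any directly pushed constant.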

The whole analysis yields a byte-level layout of all storage variables³.

Proxy:
  Slot 0: [1 x 20] (address: uint160)

Implementation:
  Slot 0: [1 x 1, 2 x 1] (uint8, uint8)

3: While mapping variables take up slot spaces, they don’t store actual values to the slot. As a result, they’re currently undetected.

Detect storage collisions

By (1) collecting the delegatecall pairs and (2) analyzing each contract, (3) we could compare storage layouts between proxy and implementation. As a result, it was possible to check whether there were conflicting storage variables between contracts.
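The comparison step can be sketched as follows. This is a simplified Python illustration, assuming each layout is a mapping from slot index to a list of (byte offset, byte size) fields; the representation and the rule "same slot, different field boundaries" are our assumptions, and the sample values mirror the Proxy and Implementation layouts shown earlier.

```python
def find_collisions(proxy_layout, impl_layout):
    """Return (slot, proxy_fields, impl_fields) for slots both contracts use differently."""
    collisions = []
    for slot in sorted(proxy_layout.keys() & impl_layout.keys()):
        proxy_fields = sorted(proxy_layout[slot])
        impl_fields = sorted(impl_layout[slot])
        if proxy_fields != impl_fields:
            collisions.append((slot, proxy_fields, impl_fields))
    return collisions


# Slot -> [(byte_offset, byte_size), ...]
proxy_layout = {0: [(0, 20)]}                # one 20-byte address
impl_layout = {0: [(0, 1), (1, 1)],          # two 1-byte values packed together
               1: [(0, 32)]}                 # a full-word value the proxy never touches

collisions = find_collisions(proxy_layout, impl_layout)
print(collisions)
```

Note that a purely byte-level comparison like this one cannot flag two variables of the same size that carry different meanings, such as two addresses sharing a slot.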

Conclusion

As a result of the experiment, we identified 180 vulnerable address pairs, which can be broadly categorized into three groups. In most cases, either someone had already conducted a similar experiment before us, or the projects had already received appropriate reports:

1. Contracts that were vulnerable in the past but are no longer affected. These contracts may have been destroyed (SELFDESTRUCT) or replaced by another contract. Examples include Audius and xToken Terminal.
2. Contracts that have storage collisions but also have an extra layer of defense that effectively mitigates the collision, for example, the FLASH token.
3. Contracts that have storage collisions but aren’t actively used.

Using upgradable contracts in Web3 and Decentralized Finance (DeFi) projects allows for patching vulnerabilities and improving protocols in an originally immutable blockchain. However, there is a risk of vulnerabilities, such as Storage Collisions, which can lead to unintended consequences as seen in the Audius incident. Therefore, it is essential to use verified implementations such as OpenZeppelin code and conduct audits to ensure there are no vulnerabilities arising from storage sharing.
