What happens when you try to fork an Ethereum token?

Edmund Edgar
Feb 1, 2017

It’s easy to create a token on the Ethereum network. Forking one is also, in theory, a simple business. A token typically consists of a database of balances, some code showing how to access it and, potentially, some other code that does other stuff. You simply make your new contract, publish it to the blockchain, copy the data from the old contract over to it, publish its address and ask people to use it.

Whether you have copied the balances correctly is something that is verified in the social space, not the code space. If somebody notices a mismatch, they’ll tell their friends and, if it’s to their disadvantage, everyone else, and people will (hopefully) decide not to move to your contract. If the reason for forking is not universally shared, you may end up with what we had in the ETC vs ETH fork: two rival tokens, continuing to operate, starting with the same balances at the point where the new token was created but with divergent balances thereafter. The value of the two will be decided by the market. As with ETC vs ETH, people who held the tokens at the point where you forked will now have their value split between two different tokens.
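Checking a fork's balances against the original is mechanical once both snapshots are in hand. Here is a minimal Python sketch of that comparison. It assumes both sets of balances have already been read off the chain into plain dictionaries; the function name and the addresses are made up for illustration.

```python
def diff_balances(old, new):
    """Return {address: (old_balance, new_balance)} for every mismatch."""
    mismatches = {}
    for addr in sorted(set(old) | set(new)):
        before, after = old.get(addr, 0), new.get(addr, 0)
        if before != after:
            mismatches[addr] = (before, after)
    return mismatches


# Hypothetical snapshots: the old token at the fork block, and the new token.
old_snapshot = {"0xalice": 100, "0xbob": 50, "0xcarol": 7}
new_snapshot = {"0xalice": 100, "0xbob": 45, "0xmallory": 12}

for addr, (before, after) in diff_balances(old_snapshot, new_snapshot).items():
    print(addr, "had", before, "in the old token but", after, "in the new one")
```

Anything this turns up is what gets argued about in the social space: the code can only show that the balances differ, not whether the difference is justified.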

The ability to do this provides security to Ethereum tokens. If there’s something wrong with the old token, you can fix it and move over to the new one. This applies to bugs in the computer code, as we saw with the HackerGold token, which had to be redeployed shortly after its initial release. But as these tokens come to represent powerful socially-driven applications, it can also help with social failures like 51% attacks by voting stakeholders, or exploitation of attack vectors in Schelling games. In these latter cases, not only does the ability to fork provide us with an escape hatch if something goes wrong, it may also prevent that thing going wrong in the first place. There’s no point spending money and time subverting a contract to take control of its tokens only for it to fork, leaving you with worthless crypto-beans.

One potential hitch here is that the loading of data into a new contract involves a gas cost, and that cost grows with the number of balances. This is particularly serious where the fork is supposed to be acting as a game-theoretical deterrent, because attackers might take advantage of times when the gas price is low to create a lot of data and make it deliberately expensive to recreate. However, it may turn out to be socially acceptable to simply leave low-value and obviously malicious data behind. Alternatively, forkers can move the cost to the owners of the low-value tokens, by publishing only a Merkle root of their balances, and requiring users to send a transaction, with a Merkle proof, to “defrost” their tokens before they can be used.
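The “defrost” idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a contract: it hashes with SHA-256 where an Ethereum contract would more likely use keccak256, it pads odd levels by repeating the last node, and every name in it (`leaf`, `make_proof`, `FrozenToken`, `defrost`) is invented for illustration.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def leaf(address, balance):
    # One leaf per (address, balance) pair in the snapshot.
    return h(b"leaf:" + address.encode() + b":" + str(balance).encode())


def parent(left, right):
    return h(b"node:" + left + right)


def build_levels(leaves):
    """Return every level of the tree, leaves first, root last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # pad odd levels by repeating the last node
        levels.append([parent(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root, address, balance, proof):
    node = leaf(address, balance)
    for sibling, sibling_is_left in proof:
        node = parent(sibling, node) if sibling_is_left else parent(node, sibling)
    return node == root


class FrozenToken:
    """Forked token that stores only a root until holders defrost themselves."""

    def __init__(self, root):
        self.root = root
        self.balances = {}
        self.claimed = set()

    def defrost(self, address, balance, proof):
        if address in self.claimed or not verify(self.root, address, balance, proof):
            return False
        self.claimed.add(address)
        self.balances[address] = self.balances.get(address, 0) + balance
        return True


# Hypothetical snapshot of the old token's balances at the fork.
snapshot = [("0xalice", 100), ("0xbob", 50), ("0xcarol", 7)]
levels = build_levels([leaf(a, b) for a, b in snapshot])
token = FrozenToken(levels[-1][0])
carol_proof = make_proof(levels, 2)
print(token.defrost("0xcarol", 7, carol_proof))
```

The forkers pay to store a single hash, and each holder pays for their own claim. The `claimed` set is there so the same proof cannot be used twice.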

How do we tell the robots?

By simply publishing the data to a new token and sharing its address, you can fork a token in a way that works simply and usefully for people who send it to each other. However, the ability to create and trade tokens is only a small part of the usefulness of tokens in Ethereum. What is just as important is the ability of other contracts to manage those tokens.

Want to put some money in escrow pending the completion of a task? Hand it over to a contract. Want to make a bet? Make a contract. It can hold the tokens on its users’ behalf and give them to the winner when the bet is settled. Want to add a feature that the contract didn’t originally have, like zcash-style anonymity? You can wrap a proxy contract around the token and interact with that instead. Want to exchange tokens without a trusted party? Deposit your coins with a contract like Etherdelta and let it handle the exchange for you.

So what happens to these contracts when we fork a token? Well, the money is still there, and under their control. The list of balances that was uploaded to the new contract doesn’t discriminate between humans and contracts. Instead of the human having the new tokens, the contract now has the forked tokens. It can ask the new token for its balance, and it will see it right there.

Anybody here missing a billion ETH-DOGE?

But there’s a problem. The contract knows that it has a billion forked tokens under its control, but that’s not enough: it also needs to know who or what those tokens are supposed to belong to. When a contract holds tokens on behalf of somebody, it doesn’t tell the token contract about it. It gets permission from the owner of the tokens to transfer them to its own balance, but after that, as far as the token contract is concerned, they belong to the contract, not the owner. The same happens when the tokens are not dedicated to a particular user, but are instead assigned to a bet, or some other set of conditions. All this information is held in the data storage of the contract that manages the tokens, not the contract that defines the token.
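A toy Python model makes the gap visible. It is a simplification with invented names (`Token`, `Exchange`, `deposit`), and it skips the approve-then-transfer step a real token deposit would go through, but the bookkeeping is split the same way: the token only knows about addresses, and the exchange keeps its own per-user ledger.

```python
class Token:
    """Bare-bones token: just a table of address -> balance."""

    def __init__(self, balances=None):
        self.balances = dict(balances or {})

    def transfer(self, sender, recipient, amount):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


class Exchange:
    """Holds tokens for users; who owns what is recorded only in here."""

    address = "0xexchange"

    def __init__(self):
        self.token_balances = {}  # token name -> {user: amount}

    def deposit(self, token_name, token, user, amount):
        token.transfer(user, self.address, amount)
        ledger = self.token_balances.setdefault(token_name, {})
        ledger[user] = ledger.get(user, 0) + amount


buggy = Token({"0xalice": 100, "0xbob": 50})
exchange = Exchange()
exchange.deposit("buggy", buggy, "0xalice", 60)
exchange.deposit("buggy", buggy, "0xbob", 20)

# Fork: copy the token's balance table into a new token.
fixed = Token(buggy.balances)

print(fixed.balances["0xexchange"])          # the exchange's total carried over
print(exchange.token_balances.get("fixed"))  # but the exchange has no record of whose it is
```

The forked token faithfully reports the exchange's total, and that is all it can report. The split between Alice and Bob never left the exchange.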

Things have the potential to go quite badly wrong if a significant token needs to fork. The token itself will work fine, but contracts managing the tokens won’t know how to handle it.

Solutions, solutions

So what do we do? Maybe we could fork the calling contract as well, and make a new contract that knows the appropriate balances. For proxy tokens that wrap a single token, this might be the right thing to do. But that also requires you to fork any other contracts that are referenced by the contract, then any contracts referencing those would need to be updated too, potentially rippling outwards until we were redeploying virtually everything on the network.

We might decide that from the point of the fork onwards, any instance of Buggy-Coin will instead be settled in Fixed-Coin and nobody can withdraw Buggy-Coin any more. But Buggy-Coin may still be worth something, and in any case the contract may have a hard time making this decision trustlessly, especially when it holds tokens that will potentially be paid out to multiple people.

Another approach is for the contract to copy all its pre-fork balances over to new data representing the new token. Anyone who previously had Buggy-Coin is also credited with Fixed-Coin. But it’ll have to be quick about it: the moment the fork has occurred, the histories of the two tokens will start to diverge. If somebody withdraws their Buggy-Coins, they’ll be very unimpressed to later learn that they have now lost their Fixed-Coins because the contract hadn’t been updated in time.
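The timing problem is easy to reproduce with a couple of dictionaries. In this Python sketch the ledger contents and names are invented; the only point is the order of events.

```python
# The managing contract's internal Buggy-Coin ledger at the moment of the fork.
buggy_ledger = {"0xalice": 60, "0xbob": 20}
ledger_at_fork = dict(buggy_ledger)  # what a perfectly timed copy would contain

# After the fork, but before the contract is updated, Alice withdraws her Buggy-Coin.
buggy_ledger["0xalice"] -= 60

# The contract only now copies its ledger over to represent Fixed-Coin.
late_fixed_ledger = dict(buggy_ledger)

# Fixed-Coin that users were owed at the fork but did not receive.
shortfall = {
    user: owed - late_fixed_ledger.get(user, 0)
    for user, owed in ledger_at_fork.items()
    if owed != late_fixed_ledger.get(user, 0)
}
print(shortfall)
```

Alice did nothing wrong, but the late copy sees her with a zero balance and credits her no Fixed-Coin.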

What we really want is for the contract to be able to go back in time to the moment of the fork, and get the balances from then. But although somebody with access to the blockchain can look up any data stored by a contract as of any past block, this is not something that contracts can do. If they were able to do that, you would need the entire history of the chain to validate a block, but Ethereum is designed so that you can validate a block with only the current database state.

Somehow we need to be able to get hold of information about the balances controlled by contracts, at the point when the fork occurred. There are two obvious options. Firstly, we can change the way tokens work so that when a contract manages coins on behalf of another account or set of conditions, this data is held in the original token contract. But this would require changing, and complicating, a simple, successful and well-deployed standard.

So we’ll have to take the second option and get this information from the forked contract. But as we discussed above, information about tokens managed by another contract is not usually something that a token contract possesses.

So how do we do it?

First, the calling contract needs to somehow let the people making the fork know that it needs the forked contract to be able to give it this information. This doesn’t necessarily need to happen on the blockchain. The person who created a decentralized exchange contract could send the forkers an email. But a nice, clean way to do it is for their contract to raise an event saying they want information about any fork of token X, and what information they need. That way the forkers can just scan the blockchain for relevant transaction logs.

event LogForkTokenSubscribe(address token, string subscribe_data);

This needs to be raised once, the first time the contract decides it will be handling a token that may in future be forked. The “subscribe_data” part tells the forkers which data from the calling contract it wants to be able to retrieve. Let’s imagine we manage tokens in a decentralized exchange that stores the balances in a mapping called “token_balances”. It can do:

LogForkTokenSubscribe(token, "token_balances");
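On the forkers’ side, collecting these subscriptions is a filter over event logs. The Python sketch below assumes the logs have already been fetched from a node and decoded into dictionaries. The dictionary layout (`address`, `event`, `args`) and the sample addresses are assumptions made for illustration, not the format of any particular library.

```python
def subscriptions_for(logs, token):
    """Map each subscribing contract to the storage names it asked about."""
    subs = {}
    for log in logs:
        if log["event"] != "LogForkTokenSubscribe":
            continue
        if log["args"]["token"] != token:
            continue
        subs.setdefault(log["address"], []).append(log["args"]["subscribe_data"])
    return subs


# Hypothetical decoded logs; "address" is the contract that raised the event.
logs = [
    {"address": "0xexchange", "event": "LogForkTokenSubscribe",
     "args": {"token": "0xbuggy", "subscribe_data": "token_balances"}},
    {"address": "0xbets", "event": "LogForkTokenSubscribe",
     "args": {"token": "0xbuggy", "subscribe_data": "bet_stakes"}},
    {"address": "0xexchange", "event": "LogForkTokenSubscribe",
     "args": {"token": "0xother", "subscribe_data": "token_balances"}},
    {"address": "0xexchange", "event": "Transfer",
     "args": {"token": "0xbuggy", "subscribe_data": ""}},
]

print(subscriptions_for(logs, "0xbuggy"))
```

The result tells the forkers which contracts to snapshot at the fork block, and which of their mappings to read.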

Secondly, the calling contract needs a function that can accept information from the forked token about its balances. The simple case looks like this:

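function creditForkedTokens(address user, uint256 balance) {
    // msg.sender is the forked token contract, so the credit is
    // recorded against that token and no other
    token_balances[msg.sender][user] += balance;
}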

Let’s call a contract that implements this `ForkSavvy` and define it in an API stub that the forking token contract can reference:

contract ForkSavvy {
    function creditForkedTokens(address user, uint256 balance);
}

Now the forked token contract can do something like this:

// Balances controlled by other contracts when we forked
mapping (address=>mapping(address=>uint256)) managed_balances;

function creditForkedTokensTo(address con, address user) {
    // Read the amount before clearing it, so we pass on the real
    // balance and the same credit can't be claimed twice
    uint256 balance = managed_balances[con][user];
    managed_balances[con][user] = 0;
    ForkSavvy(con).creditForkedTokens(user, balance);
}

This function can be called by the people creating the fork. Doing this for their users provides the greatest possible convenience to them, and maximizes the chances that people will use their fork. Alternatively they can leave it to users of contracts who want to claim their funds.
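Here is the whole hand-off modelled in Python, as a sketch rather than a contract. The class and method names are invented, and the `caller` argument stands in for `msg.sender`. The forked token reads the managed balance, clears it, and only then credits the managing contract, so calling twice credits once.

```python
class ForkSavvyExchange:
    """Managing contract that can accept credits from a forked token."""

    def __init__(self, address):
        self.address = address
        self.token_balances = {}  # token address -> {user: amount}

    def credit_forked_tokens(self, caller, user, balance):
        # Credits land under the calling token, so a token can only
        # ever hand out balances denominated in itself.
        ledger = self.token_balances.setdefault(caller, {})
        ledger[user] = ledger.get(user, 0) + balance


class ForkedToken:
    """Forked token that knows what other contracts managed at the fork."""

    def __init__(self, address, managed_balances):
        self.address = address
        self.managed_balances = managed_balances  # contract address -> {user: amount}

    def credit_forked_tokens_to(self, con, user):
        per_user = self.managed_balances.get(con.address, {})
        amount = per_user.get(user, 0)  # read first...
        per_user[user] = 0              # ...then clear, so it can't be claimed again
        con.credit_forked_tokens(self.address, user, amount)
        return amount


exchange = ForkSavvyExchange("0xexchange")
fixed = ForkedToken("0xfixed", {"0xexchange": {"0xalice": 60, "0xbob": 20}})

first = fixed.credit_forked_tokens_to(exchange, "0xalice")
second = fixed.credit_forked_tokens_to(exchange, "0xalice")
print(first, second, exchange.token_balances)
```

Keying the exchange’s ledger on the caller is what makes it safe to leave the function open: a bogus token can credit people with as much of itself as it likes, but not with anything else.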

As when we discussed recreating initial balances, if there is a lot of low-value or spam-like data in the original token, the forkers may wish to leave the full set of managed balances out of their initial deployment, and instead provide a Merkle root so that other people can add it later, paying the gas cost to “defrost” their balances.

Sometimes the tokens managed by a contract belong to a bet, not a user, and they are managed with a 32-byte ID (usually a hash of their content) rather than an address. So let’s have a way for the forked contract to give them a 32-byte ID, too:

function creditForkedTokens(bytes32 data_key, uint256 balance);

Now the contract can know that a bet that pays out in Buggy-Coin should also be able to pay out in Fixed-Coin. Depending on how it works it may simply duplicate the bet, or it may leave the bet as it is but be ready to pay out whichever token the user requests when they come to withdraw.
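The second of those choices, one bet that can pay out in either token, might look like this in a Python model. Everything here is hypothetical: the class, the use of SHA-256 to produce a 32-byte ID, and the simplification that whoever withdraws is the rightful winner.

```python
import hashlib


def bet_id(description):
    # A 32-byte key derived from the bet's content.
    return hashlib.sha256(description.encode()).digest()


class BetContract:
    """Keeps one stake per (token, bet) pair instead of duplicating the bet."""

    def __init__(self):
        self.stakes = {}  # (token address, bet key) -> amount

    def credit_forked_tokens(self, caller, data_key, balance):
        # caller stands in for msg.sender, i.e. the token doing the crediting
        key = (caller, data_key)
        self.stakes[key] = self.stakes.get(key, 0) + balance

    def withdraw(self, token, data_key):
        # Pay out whichever token is asked for, at most once per token.
        return self.stakes.pop((token, data_key), 0)


bets = BetContract()
key = bet_id("It will rain in Tokyo on 1 March")
bets.stakes[("0xbuggy", key)] = 10                # staked before the fork
bets.credit_forked_tokens("0xfixed", key, 10)     # credited by the forked token

print(bets.withdraw("0xfixed", key), bets.withdraw("0xbuggy", key))
```

The bet itself is untouched by the fork. Only the table of what it holds gains a second entry.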

The good news is that making everything described here work doesn’t require any changes to tokens until they fork. But we do need decentralized exchanges, and other contracts that manage tokens on people’s behalf, to be ready for this before it happens.
