Fixing OpenZeppelin’s $40 Million Mistake
Over the last few years OpenZeppelin have established themselves as the go-to destination for pre-made smart contracts. If you need to do something that’s been done before, chances are OpenZeppelin has you covered.
In the Eth-dev space this is an appealing offer, since a small mistake that makes it to production can have catastrophic results. Plus they’re probably the biggest-impact open-source project in the multi-billion dollar Ethereum ecosystem besides Ethereum itself so it’s pretty tempting to just assume their stuff works and roll with it.
I’ll admit I’ve made pretty extensive use of their MerkleProof
library without once bothering to check under the hood, so I’m as guilty of this as anyone else.
In isolation this kind of behaviour isn’t a big deal. If the space were truly decentralised then any mistakes would (in theory) be fairly isolated and come out in the wash. But reality tends not to pay too much attention to maximalist Tweet threads, so natural forces tend towards centralisation (see Coinbase, Etherscan, MetaMask, OpenSea as a few examples). Specifically in this case, OpenZeppelin like selling themselves as ‘the standard for secure blockchain applications’*, a title which largely goes unquestioned.
*Source: https://www.openzeppelin.com/
Ethereum and Gas
Before we get to the juicy part that explains the click-baity title of this article, we need to touch on Gas. Anyone with any experience with Ethereum is aware of this beast. To less-technical users, it’s that random number Metamask somehow chooses and determines how expensive your transaction is going to be. But to the buidlers in the space, it’s a white whale lurking behind every line of smart contract code.
Skip to the petrol bowser emoji ⛽ if you already know how Gas works on Ethereum.
Without going too much into the nitty gritty, every operation a transaction does costs a specific amount of Gas. This cost is deterministic and based purely on the computational requirements of the specific transaction. The amount you actually pay is this number multiplied by however much you’re willing to pay for a unit of gas.
A pretty-decent analogy would be if you wanted to drive from your house to your friend’s house. In this analogy, you always drive the exact same route to visit your friend, and traffic conditions are always identical. You can calculate how much petrol it will take your car to get to the end of your street, then how much petrol it will take going around the roundabout, how much it will take getting to the next corner… etc.
The point is, in this analogy, it always uses the exact same amount of petrol to drive the same route, say 10 litres. The price difference between two different trips to visit your friend will depend on how much you pay per litre for petrol. So far so good.
The part that gets a little less intuitive is that in Ethereum, there’s a limit to how much GAS can be in a single block, and that limited supply gets auctioned off each time a block is mined. The equivalent would be if your petrol station only had 1000 litres of petrol to sell per day, and rather than just setting a price and waiting for it all to sell out, they took bids from everyone saying how much petrol they need that day, and how much they’d be willing to pay. They then look through all the offers, see who is offering the most per liter, and then sell the 1000 litres to whoever offered the most.
So if lots of people all want to visit their friends the same day as you, they will all be bidding against each other, and so the bids that get accepted will be higher than on a day when you’re the only one going for a drive. If petrol costs $1/litre, you’ll pay $10 to visit your friend, since it takes 10 litres of petrol. If gas costs $2/litre, you’ll pay $20. You can put in a bid for $15, but if everyone else is offering more you aren’t going anywhere. Easy stuff.
Same with how much an Ethereum transaction costs. If Gas costs more you pay more, even though it’s doing the same calculations. As a user, that’s the only control you have over your transaction costs, choose how much you’re willing to pay and hope there aren’t too many people offering more.
However, there is a way that a trip to your friend’s house can be cheaper. If your GPS tells you to use a different, more direct route. Now it only takes 9 litres of petrol to get there, which makes the trip 10% cheaper, even if petrol prices remain the same.
On Ethereum, the more efficiently a coder has written a smart contract, the less Gas your transaction will use. Once a contract is deployed its code can’t be changed, so you need to hope the devs did their job right the first time, otherwise you as the user will end up paying for their mistake every time you use their smart contract.
⛽ Nerds can continue reading here. You missed a lot. We had ice cream.
An Unexpected Event
As a developer I’ve been involved with NFTs for a good while. I wrote a tutorial series back in early 2018 running through how to write your own implementation of ERC-721 (the OG NFT standard), and have always rolled with my own implementations since then since it’s easier to keep it lean, and I know it works.
As such I’ve not had much exposure OpenZeppelin’s 721 implementations beyond friends saying “Can you look at my NFT contract?” and me saying “It’s OpenZepp you’re probs fine.” (foreshadowing)
But I’m also not the sort of dev to stubbornly stick to something if it’s not the right tool for the job, or the type to gild the lily on company time. So when — in the course of my work for Big Head Club — I had to build something that used NFTs with novel mechanics as well as some more vanilla NFTs, it made sense to just throw some OpenZeppelin contracts into the latter half. This would save time (since I wouldn’t need to write rigorous unit tests for that portion of the code), and also more easily satisfies the “won’t someone please think of the children” crowd who complain whenever you do anything interesting or out of the ordinary and just assume you’ve made a mistake.
Without going into too much detail, the design of the project would include transactions wherein both types of NFT would be transferred at once. One of my unit tests for this type of transaction just checked the total number of Events being emitted during the transaction.
…There were one-too-many events being emitted.
Okay, not sure how that could have happened. I dug into my code but it’s pretty clean so couldn’t really work out why it’d be emitting an extra event.
Looking at the events, there was an extra Approval
event being emitted. How the…? Looking at the source of it I realised it was coming from the OpenZepp contract not my homebrewed one. After a little more digging I realised all token transfers on the OpenZepp contract were emitting this event.
Oh shit… My stomach dropped. Are we supposed to emit Approval
with every transfer? I thought it was implicitly cleared when the token transferred.
My instinct was to assume I was in the wrong here, so went back to check my work. The ERC721 tutorial series I wrote has been read by about 7000 people, so if I’d made a mistake in that it could have affected a lot of projects. I try to be pretty rigorous with these sorts of things so it was a bit of a gut punch thinking I could have been responsible for so much shitty code out there.
Luckily, the nature of ERCs means there’s a canonical source for the ERC-721 standard, that being the EIPs repo on GitHub. Checking the dev comments above the Approval
event definition, it spelled this scenario out pretty clearly:
This emits when the approved address for an NFT is changed or reaffirmed. The zero address indicates there is no approved address.
When a Transfer event emits, this also indicates that the approved address for that NFT (if any) is reset to none.
Cool. That had been a stressful couple of hours, but it was OpenZeppelin who were in error, not me. Hmm… that might actually be worse.
I think anyone putting out code that’s supposed to match a standard should first and foremost deliver on that basic promise. If you open a pizza shop you probably wanna make sure you’re actually selling pizza and not just throwing tomatoes and cheese into an oven and calling it a day. So I’d argue even if there were no hidden costs to this mistake, giving everyone broken code and calling it the gold standard is a failure of OpenZepp’s objective in its own right. But alas, it also has some spicy financial implications.
Small Mistake, Big Consequences
When compared to an organisation like OpenZepp, the reach of my little articles is minuscule. The ubiquity of their ERC-721 implementation can’t really be overstated. A very conservative estimate would say that at least 95% of all NFTs you interact with are just the OpenZepp implementation with some minor tweaks. So a bug like this making it into their production code can have big ramifications at scale.
The cost? 2100 Gas wasted. That sounds small, especially given that an NFT transaction will usually cost you between 100,000 and 150,000 Gas.
So maybe like an extra 1–2%? So what? Just give me the NFT and shut up nerd.
But this is 2100 wasted Gas on every NFT transfer. This includes minting, trading, burning, moving between wallets. And not 2100 per transaction either, 2100 per transfer. So if you bulk-transfer you still pay the extra 2100 on each NFT. Basically, this one bug put a 2% tax on every NFT transfer, paid by NFT users, paid to Ethereum miners. If you’ve ever held an NFT, odds are you’ve paid for this mistake.
Given how absolutely insane NFT usage has been over the last few years, I figured the total cost could probably add up to a pretty big number. It’s also worth noting that a lot of the biggest gas wars in that time have been surrounding NFT drops. The cost of that 2100 Gas is a lot higher during those times.
Initially after a back-of-the-envelope calculation I estimated the damage would have been around $100 million. If you read the title of the article you’ll know I’ve revised that down, but it was definitely a “holy moly” kinda moment.
Calculating the Damage
Using data from messari.io, I was able to get what I think was a good estimate of the cost. I was able to grab the average Gas price of transactions per-day, as well as the number of ERC-721 transfers that occurred on that day. Given that the wasted gas is a flat 2100 per transfer, and assuming 95% of transfers are for OpenZepp contracts, that comes out to:
Wasted ETH per day = 2100 * ERC721 Transfers * Avg Gas price * 0.95
Then to calculate the real world cost, just multiply that number by the ETH price at the time (historical data for which messari.io also provides). It’s then just a case of adding the cost for each day and getting a total. It’s possible this doesn’t properly catch those gas spikes, or that there are other problems with my method which is why I’m putting it here for smarter people to tear it apart.
The earliest date with data for ERC-721 transfers was 20/11/2017
, and the latest date I included was 9/07/2022
.
Using this method, the total cost of the bug was $41,389,690.72, or about $40 million (rounding down for nicer number). No matter how you slice it, this is pretty spicy. It’s the same impact as some of the bigger DeFi hacks we’ve seen, except without all the sexy, news-worthy appeal.
Actually if anything it’s more comparable to the recent Solana hack that took money from a bunch of small wallets, except in this case the people responsible were supposedly the good guys.
Righting the Wrong
I’ll admit, after the stress caused by thinking my code was trash, and discovering that it was actually smug-seeming OpenZeppelin at fault I was more than a little annoyed. We all make mistakes, but we don’t all call ourselves the best implementation of a standard used for un-changable code, so some of us have a higher bar to clear than others.
OpenZepp is, as the name suggests, open source, so I opened a pull request with the one-line change to the code needed to fix the mistake. I was a little surprised to see that this change actually caused a few of their tests to fail.
On the one hand, it’s good that they have rigorous tests in place, and that this bug wasn’t just an oversight. But it was all the more jarring that the mistake had been reinforced by incorrect tests, rather than discovered and rectified.
There was a little push-back in the comments, with one suggestion that since OpenZeppelin is so widely used, maybe changing this might break things in the wild (ie, “we’re too big to fail”), and another pointing out that additional functionality can be added to a token contract without it being considered broken. The latter is technically true, but not really the point if you’re selling a clean and correct 721 implementation.
To their credit, when I pointed out the cost of the mistake, they agreed it should be fixed. I’m not sure what it says about the organisation that if the cost had been a few digits smaller this PR probably would have just fizzled out as an argument about semantics and the bug never would have been removed.
Lessons Learned
To be honest, I’m not really sure what the lesson is here, except for maybe don’t trust some random implementation no matter how much clout the group who wrote it has.
There’s something to be said about Skin in the Game, where the perpetrators of the mistake weren’t necessarily the ones who paid for it. If OpenZepp’s users are devs, then there’s at least 2 degrees of separation between whoever created this bug, and whoever ends up out of pocket. In this regard, OpenZepp’s incentives are more aligned with presenting themselves as a group with code that works well, rather than with actually writing good code.
In a space like crypto, where understanding and balancing incentives runs as such a core imperative, this sort of misalignment could be a sign of deeper issues in the power structures that have naturally grown.
Then again, it could just be a single line of code that shouldn’t have been there. Either way, the latter is fixed now.