Type safety and RNGs
This article is for people writing Bitcoin wallets. It examines a few recent crypto failures and looks at how we can improve safety in Bitcoin wallet construction.
Too many of us are using dangerous tools: I hope by the end we will all have ideas for how to increase the safety of our programming.
In the last few days the Bitcoin community suffered another wave of stolen coins due to a failure to correctly generate random numbers. The bug in the blockchain.info “My Wallet” web wallet was live only for a few hours, but that was enough time to compromise around 250 BTC (or around $90,000). Luckily most of the coins were taken by a “Robin Hood”, one of the people who scan the block chain for crypto failures and immediately take any money exposed by them so that it can be returned to the legitimate owners later. My Wallet users got lucky and it looks like most of them will get their money back.
var a = new Array();
a[b++] = 1;
a is now [] and no exception is thrown: the write to the array simply vanishes. Try it out in your browser console if you don’t believe me.
Any reasonable language would not allow the above code to run at all: attempting to increment “undefined” should crash. A language that tries to be as dynamic as possible might cast it to zero and then yield one. But doing nothing is the worst of all possible choices.
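To make the failure mode concrete, here is a small sketch of what happens, assuming b was declared somewhere but never assigned (that declaration is an assumption added here; a completely undeclared b would throw a ReferenceError instead). The write does not land on any array index. It ends up on a property named "NaN", which array code never looks at.

```javascript
var b;                  // assumption: declared but never assigned, so undefined
var a = new Array();
a[b++] = 1;             // undefined++ evaluates to NaN, so this sets a["NaN"]

console.log(a.length);        // 0: no array element was written
console.log(Object.keys(a));  // the only key is the string "NaN"
console.log(Number.isNaN(b)); // true: b itself is now NaN

// A defensive write (hypothetical helper) that refuses anything but a real index.
function setAtIndex(array, index, value) {
  if (!Number.isInteger(index) || index < 0) {
    throw new TypeError("array index must be a non-negative integer, got " + index);
  }
  array[index] = value;
}
```

With a helper like setAtIndex, the same mistake surfaces as an exception at the point of the bug rather than as silently missing data.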
There are two underlying problems here. One is the existence in My Wallet of this wrapper around the browser’s crypto RNG — no such wrapper should exist and My Wallet should refuse to run if the browser doesn’t support the necessary features. Doing anything else is incredibly risky.
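Refusing to run can be as simple as the sketch below. It is not My Wallet code: the function name secureRandomBytes and its optional second parameter are made up for illustration, and the only real API it relies on is the Web Crypto call crypto.getRandomValues.

```javascript
// Hypothetical helper: fail closed when no secure RNG is available.
// The second parameter exists only so the check can be exercised in tests.
function secureRandomBytes(length, cryptoObj = globalThis.crypto) {
  if (!cryptoObj || typeof cryptoObj.getRandomValues !== "function") {
    // Never fall back to Math.random() or a home-grown generator.
    throw new Error("No cryptographically secure RNG available; refusing to run");
  }
  if (!Number.isInteger(length) || length <= 0) {
    throw new TypeError("length must be a positive integer");
  }
  const bytes = new Uint8Array(length);
  cryptoObj.getRandomValues(bytes); // fills the array in place
  return bytes;
}
```

The point of the design is that there is exactly one code path that produces key material, and the only alternative to it is an error the user will see.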
The cause was a buggy refactoring that created type confusion in the signing code. Here is the commit that fixed it. It adds type assertions to the start of one of the crypto routines. A stricter language would have had these by default.
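The sketch below shows the general shape of such assertions. It is not the code from that commit: assertBytes, signDigest and the signer callback are hypothetical names, and the 32-byte lengths are an assumption that fits a secp256k1 private key and a SHA-256 digest.

```javascript
// Hypothetical guard: reject anything that is not a byte array of the expected size.
function assertBytes(value, expectedLength, name) {
  if (!(value instanceof Uint8Array)) {
    throw new TypeError(name + " must be a Uint8Array");
  }
  if (value.length !== expectedLength) {
    throw new RangeError(name + " must be " + expectedLength + " bytes, got " + value.length);
  }
}

// Hypothetical signing entry point. The real signing work is delegated to
// `signer`, which stands in for whatever crypto routine the wallet uses.
function signDigest(privateKey, digest, signer) {
  assertBytes(privateKey, 32, "privateKey");
  assertBytes(digest, 32, "digest");
  return signer(privateKey, digest);
}
```

A hex string or a plain array of numbers passed by mistake now fails loudly at the boundary instead of being silently coerced somewhere deep inside the maths.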
- Use the Closure compiler from Google and annotate your code with type comments. This technique can be used to incrementally upgrade an existing codebase. The compiler will catch many errors, including (I think) use of uninitialised variables. As a bonus, it can optimise your code, but the main reason to use it is correctness.
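As a rough illustration of what the type comments look like, here is a small annotated function. The function itself (formatBtc) is invented for this example; the @param, @return and @const tags are standard Closure annotations.

```javascript
/** @const {number} Number of satoshis in one bitcoin. */
const SATOSHIS_PER_BTC = 100000000;

/**
 * Formats an amount in satoshis as a BTC string with eight decimal places.
 * @param {number} satoshis
 * @return {string}
 */
function formatBtc(satoshis) {
  return (satoshis / SATOSHIS_PER_BTC).toFixed(8);
}

console.log(formatBtc(150000000)); // "1.50000000"
```

The code runs unchanged in any browser because the annotations are ordinary comments. With type checking enabled, the compiler should flag a call such as formatBtc("5") as a type mismatch before the code ever ships.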
Also, consider the use of deterministic ECDSA and deterministic wallets. When implemented properly these technologies mean a weak RNG can only break newly created wallets, not all wallets that happen to be used within the timeframe of the mistake.
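To see why deterministic signing helps: ECDSA needs a fresh secret number k for every signature, and a weak RNG at signing time can leak the private key. Deterministic ECDSA (standardised in RFC 6979) derives k from the private key and the message hash instead. The sketch below is a deliberately simplified illustration of that idea for Node.js, not an implementation of RFC 6979, and should not be used for real signing. The function name toyDeterministicNonce is hypothetical.

```javascript
const { createHmac } = require("crypto");

// Order of the secp256k1 group used by Bitcoin.
const SECP256K1_N = BigInt(
  "0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141"
);

// Toy nonce derivation: HMAC-SHA256 keyed with the private key over the
// message hash plus a retry counter. NOT RFC 6979; illustration only.
function toyDeterministicNonce(privateKey, messageHash) {
  if (!(privateKey instanceof Uint8Array) || privateKey.length !== 32) {
    throw new TypeError("privateKey must be a 32-byte Uint8Array");
  }
  if (!(messageHash instanceof Uint8Array) || messageHash.length !== 32) {
    throw new TypeError("messageHash must be a 32-byte Uint8Array");
  }
  for (let counter = 0; counter < 256; counter++) {
    const digest = createHmac("sha256", privateKey)
      .update(messageHash)
      .update(Uint8Array.of(counter))
      .digest("hex");
    const k = BigInt("0x" + digest);
    // Valid nonces lie strictly between 0 and the group order.
    if (k > 0n && k < SECP256K1_N) return k;
  }
  throw new Error("could not derive a nonce");
}
```

No random number generator is consulted at signing time, so the same key and message always yield the same k. Randomness is then only needed once, when the wallet seed is created.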
When I have time, I want to investigate use of the Checker Framework for my own Java projects where upgrading to a more modern language isn’t feasible.
Not all Bitcoin crypto failures have been due to type safety failures. In 2013 we discovered that the crypto RNG provided by Android was fundamentally broken and always had been (albeit in different ways). Amazingly, crypto on Android had never worked, whilst looking a lot like it did.
The root cause was similar to the My Wallet failure: layering of buggy extra random number generators on top of the kernel crypto RNG. You should never do this. Let’s review a bit of randomness theory to see why not.
Computers are predictable, completely deterministic machines. So to generate random numbers they must derive them from a source of entropy (a fancy word for a simple idea). Usually this means the environment and especially the unpredictable humans that surround the computer. Ultimately all entropy in the computer comes from observations about its environment, measured and aggregated over a long period of time.
Example sources of entropy in a computer include: key press timings, mouse or trackpad movements, arrival of network packets, radio noise, and sometimes dedicated hardware RNGs that exploit physics-based phenomena.
All input in your computer or mobile device comes through hardware, and hardware access is mediated by the kernel. So all entropy ultimately comes from the kernel. The people who write kernels know this and that’s why one of the core services a kernel provides is a cryptographically strong random number generator. Linux provides one, MacOS provides one, Windows provides one.
The people who write your operating system kernel are very likely to be better programmers than you. Attempting to improve on their randomness by layering on a random number generator that runs outside the kernel might introduce fatal and undetectable errors, as has happened with blockchain.info and Android, but one thing it definitely won’t do is improve your randomness. Older apps are often written with a userspace RNG because historically calling into the kernel was slow, but this hasn’t been true for a long time now and anyway, we’re mostly working in higher level languages that impose far more overhead than syscalls. So don’t do it.
Via the bitcoinj library, I am the author of quite widely used Bitcoin crypto code. This fact sometimes keeps me awake at night. We recently integrated a piece of code that (on Android only) swaps out the userspace RNG for one that just reads directly from the kernel via /dev/urandom. It’s been used by the main Android wallets for about a year now so it’s well tested; we’re integrating it mostly so new wallet devs can’t forget to use it. But I’m thinking about doing this for any non-Windows platform, not only on Android. Although the Java SE userspace random number generator isn’t very likely to be bugged, it can’t help us either and there may be other Java reimplementations out there with problems.
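The bitcoinj code is Java, but the idea of going straight to the kernel is easy to show in a few lines. The following Node.js sketch is only an illustration of that approach on Linux-like systems, not a port of the bitcoinj class; the helper name readKernelRandom and its devicePath parameter are assumptions made for the example.

```javascript
const fs = require("fs");

// Hypothetical helper: read random bytes directly from the kernel RNG device.
function readKernelRandom(length, devicePath = "/dev/urandom") {
  if (!Number.isInteger(length) || length <= 0) {
    throw new TypeError("length must be a positive integer");
  }
  const buffer = Buffer.alloc(length);
  const fd = fs.openSync(devicePath, "r");
  try {
    let offset = 0;
    // A single read may return fewer bytes than requested, so loop until full.
    while (offset < length) {
      const bytesRead = fs.readSync(fd, buffer, offset, length - offset, null);
      if (bytesRead === 0) {
        throw new Error("unexpected end of data from " + devicePath);
      }
      offset += bytesRead;
    }
  } finally {
    fs.closeSync(fd);
  }
  return buffer;
}
```

There is no seeding step, no internal state and no fallback here. If the device cannot be opened the call throws, which is the behaviour you want from code that generates keys.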