Type safety and RNGs

This article is for people writing Bitcoin wallets. It examines a few recent crypto failures and looks at how we can improve safety in Bitcoin wallet construction.

Too many of us are using dangerous tools: I hope by the end we will all have ideas for how to increase the safety of our programming.

In the last few days the Bitcoin community suffered another wave of stolen coins due to a failure to correctly generate random numbers. The bug in the blockchain.info “My Wallet” web wallet was live for only a few hours, but that was enough time to compromise around 250 BTC (around $90,000). Luckily most of the coins were taken by a “Robin Hood”: one of the people who scan the block chain for crypto failures and immediately take any money exposed by them, so that it can be returned to its legitimate owners later. My Wallet users got lucky, and it looks like most of them will get their money back.

The underlying bug was a failure to initialise a variable to zero in an application level RNG. “My Wallet” is not written in C, as you might expect having read that statement. It’s written in Javascript, a supposedly modern scripting language. The variable rng_pptr is used as an index into an array, and when it is left uninitialised we get this astonishing outcome:

var a = new Array();
var b;
a[b++] = 1;

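a is still empty (its length is 0) and no exception is thrown. The increment turns the undefined index into NaN, so the value is stored under a property named “NaN” rather than at any array position. As far as any code reading the array is concerned, the write simply vanishes. Try it out in your browser console if you don’t believe me.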

It’s not news that Javascript is full of surprising quirks, but this demonstration of how dangerous the language can be is still impressive. Perhaps only PHP with its notoriously broken == operator is in the same ball park.

Any reasonable language would not allow the above code to run at all: attempting to increment “undefined” should crash. A language that tries to be as dynamic as possible might cast it to zero and then yield one. But silently carrying on with a garbage index is the worst of all possible choices.
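To see where the value actually goes, here is a small sketch that re-runs the snippet and records the result. The checkedIndex helper is hypothetical, something I made up to show what failing loudly could look like; it is not code from My Wallet.

```javascript
// Re-run the snippet and inspect what actually happened.
var a = new Array();
var b;            // never initialised, so b is undefined
a[b++] = 1;       // undefined++ evaluates to NaN, so this writes a["NaN"]

var report = {
  length: a.length,        // no indexed element was added
  nanProperty: a["NaN"],   // the value ended up under the key "NaN"
  indexAfter: b            // b is now NaN, and NaN + 1 is still NaN
};

// Hypothetical defensive helper: reject anything that is not a valid index.
function checkedIndex(i, arrayLength) {
  if (!Number.isInteger(i) || i < 0 || i >= arrayLength) {
    throw new RangeError("bad array index: " + String(i));
  }
  return i;
}
```

Wrapping every index in something like checkedIndex(i, pool.length) would have turned the silent failure into an immediate exception, although a compiler that rejects the code outright is still the better fix.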

There are two underlying problems here. One is the existence in My Wallet of this wrapper around the browser’s crypto RNG — no such wrapper should exist and My Wallet should refuse to run if the browser doesn’t support the necessary features. Doing anything else is incredibly risky.
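A minimal sketch of what “refuse to run” could look like follows. The function name secureRandomBytes and its optional second parameter are my own inventions for illustration; the only real API used is crypto.getRandomValues, which fills a typed array and accepts at most 65536 bytes per call.

```javascript
// Hypothetical helper: returns n random bytes straight from the platform's
// crypto RNG, or throws. There is deliberately no fallback path.
function secureRandomBytes(n, cryptoImpl) {
  // Assumption: callers normally omit cryptoImpl and get the global object.
  var c = cryptoImpl === undefined ? globalThis.crypto : cryptoImpl;
  if (!c || typeof c.getRandomValues !== "function") {
    throw new Error("No crypto.getRandomValues available: refusing to run");
  }
  if (!Number.isInteger(n) || n <= 0 || n > 65536) {
    throw new RangeError("byte count must be an integer in 1..65536");
  }
  var out = new Uint8Array(n);
  c.getRandomValues(out); // fills the array in place
  return out;
}
```

The point of the design is what is missing: there is no pool, no index variable and no home-made mixing step that could be left uninitialised.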

The other is lack of type safety in the language. Type systems got a bad rap with a lot of programmers because languages with stronger type systems were often very verbose, or had other design choices that made them hard to use. Modern languages with modern type systems like Scala, Kotlin, C# and especially Haskell have changed this: they make heavy use of local type inference to cut down on tedious repetition, and can visually resemble languages like Javascript and Python. They have lots of functional constructs to cut down on verbosity yet further. And because the compilers run just in time, it means you can iterate very fast on changes. In short, the reasons for avoiding stricter languages are falling away over time and the benefits are getting larger.

Let’s look at another place where the lack of type safety in Javascript led to theft of Bitcoins. In April 2014 CounterWallet, a web wallet, upgraded to a new version of bitcoinjs-lib that contained a type safety error that resulted in re-used ECDSA “k” values. Signing two different messages with the same k immediately exposes the private key to anyone who sees both signatures. Lots of addresses were exposed.
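To make the danger of a re-used k concrete, here is a sketch of the algebra only, with toy numbers. An ECDSA signature contains s = k^-1 * (z + r*d) mod n, where z is the message hash and d the private key. Two signatures that share k also share r, so subtracting them eliminates d and yields k, after which d follows directly. In real ECDSA r is the x coordinate of k*G; the stand-in value of r used here, like every other number in the example, is hypothetical.

```javascript
// Order of the secp256k1 group (a prime), used by Bitcoin's ECDSA.
const N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141n;

const mod = (x) => ((x % N) + N) % N;

// Modular exponentiation by squaring.
function modPow(base, exp) {
  let result = 1n;
  base = mod(base);
  while (exp > 0n) {
    if (exp & 1n) result = mod(result * base);
    base = mod(base * base);
    exp >>= 1n;
  }
  return result;
}

// Inverse modulo the prime N via Fermat's little theorem.
const inv = (x) => modPow(x, N - 2n);

// The s half of an ECDSA signature: s = k^-1 * (z + r*d) mod N.
const signS = (z, r, d, k) => mod(inv(k) * (z + r * d));

// Given two signatures that share k (and therefore r), recover k and d.
function recoverFromReusedK(z1, s1, z2, s2, r) {
  const k = mod((z1 - z2) * inv(s1 - s2));
  const d = mod((s1 * k - z1) * inv(r));
  return { k, d };
}

// Toy values: a "private key", a reused nonce, a stand-in r, two hashes.
const d = 123456789n, k = 987654321n, r = 555555555n;
const z1 = 1111n, z2 = 2222n;
const recovered = recoverFromReusedK(
  z1, signS(z1, r, d, k), z2, signS(z2, r, d, k), r);
// recovered.d equals d and recovered.k equals k
```

Nothing here needs more than the two public signatures and the two message hashes, which is why exposed addresses get drained so quickly.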

The cause was a buggy refactoring that created type confusion in the signing code. Here is the commit that fixed it. It adds type assertions to the start of one of the crypto routines. A stricter language would have had these by default.
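I am not reproducing the actual commit here. As a rough illustration of the idea, assertions at the top of a signing routine might look something like the sketch below. The names enforceType, isHash32, isBigInt and signPlaceholder are all hypothetical, and the function body performs no real cryptography.

```javascript
// Hypothetical helper: fail fast when an argument has the wrong type.
function enforceType(check, value, name) {
  if (!check(value)) {
    throw new TypeError(
      "Expected " + name + ", got " + Object.prototype.toString.call(value));
  }
}

const isHash32 = (v) => v instanceof Uint8Array && v.length === 32;
const isBigInt = (v) => typeof v === "bigint";

// Stand-in for a signing routine. Only the argument checks matter here;
// the body is a placeholder and does not sign anything.
function signPlaceholder(hash, privateKey) {
  enforceType(isHash32, hash, "32-byte Uint8Array hash");
  enforceType(isBigInt, privateKey, "BigInt private key");
  return { hashLength: hash.length, keyType: typeof privateKey };
}
```

Passing a hex string where the hash bytes are expected now throws a TypeError at the call site instead of flowing quietly into the arithmetic.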

Sometimes people need to use Javascript because they’re working in the browser. Browsers have a number of problems that make it difficult to build wallets, but I’m not going to try and convince you to stop making them here. Suffice it to say there are alternatives for writing cross platform wallets you could consider.

But if you must or will use Javascript please either:

  1. Use the Closure compiler from Google and annotate your code with type comments. This technique can be used to incrementally upgrade an existing codebase. The compiler will catch many errors, including (I think) use of uninitialised variables. As a bonus, it can optimise your code, but the main reason to use it is correctness.
  2. Use a language that has a Javascript compiler backend, so you can write type safe code in a stricter language and then compile it down to Javascript. If you’re starting a new web wallet from scratch, you should seriously consider this path. Some languages like this are Java, Scala, Clojure, Kotlin, TypeScript etc.
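For the first option, type comments are ordinary JSDoc-style comments, so annotated code still runs unchanged in any browser. The sketch below shows the general shape using the @constructor, @param, @return and @private annotations. BytePool and its members are hypothetical names of mine, and which mistakes the compiler actually reports will depend on how it is configured.

```javascript
/**
 * A tiny byte pool with an explicit, typed write index.
 * All names here are hypothetical; only the annotation style matters.
 * @constructor
 * @param {number} size
 */
function BytePool(size) {
  /** @private {!Uint8Array} */
  this.bytes_ = new Uint8Array(size);
  /** @private {number} */
  this.writeIndex_ = 0; // initialised where it is declared
}

/**
 * Stores one byte and advances the index, wrapping at the end of the pool.
 * @param {number} byte A value in 0..255.
 * @return {number} The index that was written.
 */
BytePool.prototype.push = function (byte) {
  var i = this.writeIndex_;
  this.bytes_[i] = byte & 0xff;
  this.writeIndex_ = (i + 1) % this.bytes_.length;
  return i;
};
```

Because the annotations live in comments, they can be added one file at a time to an existing codebase.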

Also, consider the use of deterministic ECDSA and deterministic wallets. When implemented properly these technologies mean a weak RNG can only break newly created wallets, not all wallets that happen to be used within the timeframe of the mistake.

A quick word about Java. The Java type system is not terrible but it’s not great either. Like with Javascript, we’re often forced to use Java or something that interops with it because of platform requirements (Android). But Java’s type system is extendable with annotations. The Checker Framework is an interesting project that adds a series of new optional, backwards compatible type annotations to Java which can detect things like thread safety issues, immutability violations, even mixing of units like seconds and milliseconds.

When I have time, I want to investigate use of the Checker Framework for my own Java projects where upgrading to a more modern language isn’t feasible.

Not all Bitcoin crypto failures have been due to type safety failures. In 2013 we discovered that the crypto RNG provided by Android was fundamentally broken and always had been (albeit in different ways). Amazingly, crypto on Android had never worked ... whilst looking a lot like it did.

The root cause was similar to the My Wallet failure: layering of buggy extra random number generators on top of the kernel crypto RNG. You should never do this. Let’s review a bit of randomness theory to see why not.

Computers are predictable, completely deterministic machines. So to generate random numbers they must derive them from a source of entropy (a fancy word for a simple idea). Usually this means the environment and especially the unpredictable humans that surround the computer. Ultimately all entropy in the computer comes from observations about its environment, measured and aggregated over a long period of time.

Example sources of entropy in a computer include: key press timings, mouse or trackpad movements, arrival of network packets, radio noise, and sometimes dedicated hardware RNGs that exploit physics-based phenomena.

All input in your computer or mobile device comes through hardware, and hardware access is mediated by the kernel. So all entropy ultimately comes from the kernel. The people who write kernels know this and that’s why one of the core services a kernel provides is a cryptographically strong random number generator. Linux provides one, MacOS provides one, Windows provides one.

The people who write your operating system kernel are very likely to be better programmers than you. Attempting to improve on their randomness by layering on a random number generator that runs outside the kernel might introduce fatal and undetectable errors, as has happened with blockchain.info and Android, but one thing it definitely won’t do is improve your randomness. Older apps are often written with a userspace RNG because historically calling into the kernel was slow, but this hasn’t been true for a long time now and anyway, we’re mostly working in higher level languages that impose far more overhead than syscalls. So don’t do it.

Via the bitcoinj library, I am the author of quite widely used Bitcoin crypto code. This fact sometimes keeps me awake at night. We recently integrated a piece of code that (on Android only) swaps out the userspace RNG for one that just reads directly from the kernel via /dev/urandom. It’s been used by the main Android wallets for about a year now so it’s well tested; we’re integrating it mostly so new wallet devs can’t forget to use it. But I’m thinking about doing this for any non-Windows platform, not only on Android. Although the Java SE userspace random number generator isn’t very likely to be bugged, it can’t help us either and there may be other Java reimplementations out there with problems.
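The bitcoinj code is Java and I am not reproducing it here. To show how little is involved in reading directly from the kernel, here is a sketch of the same idea for Node.js on a Unix-like system. The helper name kernelRandomBytes is hypothetical, and the code assumes /dev/urandom exists, so it will simply throw on Windows.

```javascript
const fs = require("fs");

// Hypothetical helper: reads exactly n bytes from the kernel RNG device.
// Assumes a Unix-like system where /dev/urandom is present.
function kernelRandomBytes(n) {
  if (!Number.isInteger(n) || n <= 0) {
    throw new RangeError("byte count must be a positive integer");
  }
  const fd = fs.openSync("/dev/urandom", "r"); // throws if the device is missing
  try {
    const out = Buffer.alloc(n);
    let filled = 0;
    // A single read may return fewer bytes than requested, so loop.
    while (filled < n) {
      const got = fs.readSync(fd, out, filled, n - filled, null);
      if (got <= 0) throw new Error("short read from /dev/urandom");
      filled += got;
    }
    return out;
  } finally {
    fs.closeSync(fd); // always release the descriptor
  }
}
```

There is no state to initialise and no seeding step to get wrong: every byte comes from the kernel, and any failure surfaces as an exception.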