‘Random’ numbers on the Arduino

6 min readOct 8, 2017

TL;DR — Random numbers on the arduino generally aren’t. They use a fully deterministic algorithm. Even when a device is reset, it will generate precisely the same random numbers every time.

Update — I have published some proposed fixes here, along with example code for people to experiment with: https://medium.com/@LargeCardinal/random-numbers-on-the-arduino-97f325820284

Soundbite — Coveyou’s “Random Number Generation is too important to be left to chance…” still applies very much today.

Broader Takeaway — {(CS)(P)(T)}RNG’s are generally closed source/design or use purely deterministic algorithms without adequate notification to developers, and this is a problem given how crucial decent random number generation is for cryptography, esp. in IoT/IIoT/embedded systems/etc.

Additionally, ‘True’RNG’s are closed source, closed design black boxes, and that’s bad.

Here’s an example of the problem in action:

This is not a surprise to anyone in security, esp. those specialising in hardware/embedded/IoT/IIoT/etc.

Background — This is a phenomenon that I first realised three years ago, and having seen literally zero improvement, I thought I’d write it up. We need to start putting pressure on IC manufacturers and designers to improve this stuff.

Read this if:

you’re a developer and seriously considered using ATmega AVR IC’s for any product that does more than make lights flash
you’re a developer/maker that wants to implement crypto on the arduino (spoiler alert — don’t bother…)
you’re in the IoT/embedded security space, but not looked at RNG’s and want to know how bad it really is

Why Randomness is Important

There are a few key parts of cryptography that make it work:

Decent cryptography schemes that are effective at mixing information according to a key such that getting the plaintext back is ‘sufficiently hard’
Reasonable implementations of ‘trap-door’/one-way functions that make it hard to recover the data that was used to generate the output (these form the basis of hashing algorithms)
Cryptographically Secure Pseudo-Random Number Generators (CSPRNGs) that generate random numbers for use in all of our algorithms

(NB — yes, I know this list glosses over a lot, but we’ve got a narrow purview for this article…)

Decent Random Number Generators (RNGs) are important because of the way in which modern cryptography relies on random numbers. The most common uses for random numbers are in the generation of

keys,
nonces,
salts,
and one-time pads.

Even if you’re no crypto expert, it’s clear that these things are ‘very sodding important’ in the implementation of cryptography.

So, what’s special about AVR’s?

Well, they use a fully deterministic algorithm. The default seed is ‘1’, and it’s probably a Linear Congruential Generator (LCG), (somthing I would like to check) but it is fully deterministic from when random() is called in the arduino code. So far, so normal.

The difference is that an LCG in, say, the Linux kernel is updated constantly with new random bits from a large pool of entropy. Essentially, LCG does the generation, the entropy keeps it nice and random, so it is totally impractical to predict the next random number. Arduinos don’t do this entropy updating, so they are stuck with simply generating random numbers from an LCG.

This is problematic as it means that an unaware developer will just call random() without considering that maybe it’s broken if not seeded with randomSeed() before use. However, even if it is seeded, if the seed is the same for every device, then this is easily breakable by effectively bruteforcing all values (see example and discussion below).

Core takeaway — cryptography is essentially untenable on Atmel AVR IC’s such as ATmegaXXX series.

But why is this a problem?

Surely the answer is just ‘don’t use AVR’s…’?! Well, yes and no.

This situation is indicative of a larger problem in IoT/embedded systems. There are no set standards for RNG’s of any kind in hardware. STM32 series IC’s use an ‘acoustic circuit’ (probably an amplifier and flip-flop circuit with a NOT-based feedback, maybe like this but it isn’t publicly available…), and other IC manufacturers probably use similar solutions.

Whilst the STM32 implementation is seemingly a ‘True’ RNG i.e. uses a circuit that is very sensitive to some parameters of its design, e.g. temperature, and generates random bits from this circuit to seed a PRNG, or outputs the bits directly to the CPU — there is no guarantee it is genuinely useful or effective, with it being a closed black-box system.

Main point — When bits of your cryptosystem are ‘black boxed’, it is hard to justify the security of that cryptosystem.

It doesn’t help that even companies like ARM only publish scant details of how RNG’s should look like (ping me if you know a publicly available schematic!!) — they define operating characteristics and where RNG’s should go, but don’t stipulate nor make available actual designs for general, public scrutiny.

PoC||GTFO

Being a believer in this mantra, here’s some basic Arduino code that was compiled and loaded onto an ATmega168 chip this morning (whilst I had brekkie in a decent cafe):

void setup() {
  Serial.begin(9600);
  Serial.println("Random number test...");
}void loop() {
  for(int i=0;i<10000;i++){
    Serial.println(random(), DEC);
    delay(1);
  }
  delay(10000);
}

And here is a gist (https://gist.github.com/unprovable/439ca83df6ca5173bbf3a8544be13db0) of the results (CSV) of 2x 10,000 item outputs, with a third column comparison using the spreadsheet formula =if(a1=b1, "MATCH", "DIFF") .

Here is a second gist of my raw data (https://gist.github.com/unprovable/15a3f955238d075cb500b79cb2c59a76), for comparison to others — this is just the output of screen -L /dev/ttyUSB0 9600

NB — yes, I noticed the first run generated 20,000 values. No, I have no idea why. Yes, I’ve reproduced it. No, it shouldn’t be doing that.

Summary of Results — There are 10,000 samples from the first run off the Arduino. This was then reset, and the arduino proceeded to generate precisely the same 10,000 samples when power cycled. An independent test has confirmed that is repeatable using the simple program above.

In short — the random() function is completely predictable. This would mean that, were it used in some cryptographic function, it is certain that the key, nonce, IV, OTP, etc. could be fully recovered and/or predicted. It is an open question as to how many samples would be required to determine what seed value was in play, but in all likeliness, with a decent rainbow table, this number could potentially be in the single digits (3 or 4).

So what do we do now?

Clearly ATmega and other AVR’s are ‘much less than ideal’ when it comes to generating random numbers. But suppose you’ve got a prototype ready to go, almost in production, and retooling it would be impractical. What can be done?

Use Some High-Entropy source already on the IC

Ideas from twitter for this included the following seed sources:

Using the Watch Dog Timer (WDT) jitter
Using ADC conversion jitter
Using an RC discharge through an ADC pin (see here)
Using camera image/camera noise (if available — not usually found on Arduino-based projects)
Using the LSBs of the internal thermistor output

These are all good ideas! I’ll be looking at them in turn in the near future, and publishing results as they come in. Please do send additional sources!

Per-device secrets.

Yes, these again. Yes, I know I go on and on and on about them. But they would reduce the risk — Kerckhoff’s Princple being my guide, if each device gets its own genuinely random seed value for it’s PRNG (Pseudo-RNG) then this would help alleviate the problem; one device’s compromised firmware would not automatically compromise the whole production run.

Adding HSM with Decent True/PRNG

This is another option — adding a chip that does this stuff for you, and sacrificing a GPIO pin or two to communicate with this module. Whilst very effective, it could potentially increase the cost of your production significantly.

Move to a chip with FIPS 140–2 Certified RNG

Not that I particularly rate FIPS standards, but this is as good a baseline as any, and it’s easy to search for devices/IC’s that conform to this standard. (Holding fast on another mantra — “Don’t let perfect get in the way of better…”)

Most ARM based chips will conform to this if they have TrustZone.

Moving forward…

Here are the things I think should happen in order to mitigate problems in this area:

Hardware vendors should make schematics of their ‘True’RNG’s publicly available for scrutiny.
Hardware designers should give decent reference implementations for use in general.
NIST should add ‘repeatability upon reset’ to their RNG analysis requirements. AFAICT, their tests don’t currently cover this.
Hardware Hackers should publish example code and results of random number tests publicly for analysis (this has already started to be done — https://github.com/trezor/rng-test)