An explanation of Meltdown and Spectre for non-programmers (Draft version)

There has been a great deal of media reporting on two new, related security flaws, “Meltdown” and “Spectre”, in computer and smartphone CPUs. The following is my attempt to explain, by analogy, what the general problem is and how hackers can exploit it to access information that they shouldn’t have access to.

Please note that this is intended mainly to satisfy individual curiosity, NOT as a substitute for consulting expert opinion on what to do about the problem. Large organizations should have IT security staff who will provide advice. Individuals and small businesses should follow instructions from their software suppliers (CNET has a handy list of instructions for the major operating systems and browsers here). Mostly this will mean installing updates promptly. Yes, that means you. Don’t put it off! Some of these updates may make your system run slightly more slowly. Better running a tad slow than hacked.

With that out of the way, here is my explanation…

In the original James Bond films, Bond’s primary enemy was a multinational criminal organization named SPECTRE.

Let’s imagine that SPECTRE’s headquarters has a library, which contains secret books which only SPECTRE employees are allowed to read, and ordinary books, which any member of the public can come in and read (SPECTRE is always looking for good PR). Unfortunately for SPECTRE, their library has some “interesting” design decisions.

Anyone can visit SPECTRE’s library, but patrons are not allowed direct access to the bookshelves. Instead, there are two staff members on the front counter — the retrieval clerk and the security clerk.

The retrieval clerk is responsible for retrieving books from the archive shelves. As well as the main archive, the retrieval clerk can access the “recently-retrieved shelf”, a small shelf below their desk where the most recently retrieved books — regardless of who asked for them — are kept. When retrieving a book, the retrieval clerk first checks the recently-retrieved shelf before heading off to the main archive. The shelf is not visible to visitors, and the clerk will not explicitly tell a visitor what is or isn’t on the recently-retrieved shelf. However, retrieving a book off this shelf is much faster than walking into the main archive.

When a visitor asks for a book, it is the security clerk’s job to check whether they have permission to access that particular book. Unfortunately, the security clerk is sometimes very slow at their job, and the retrieval clerk is impatient. Therefore, the retrieval clerk can hand the book over, and the visitor has some time to read it, before the security clerk has finished checking. However, in the past, this hasn’t mattered. Once the security clerk has determined that a non-SPECTRE employee has requested a secret book, SPECTRE simply drops the naughty visitor into their shark tank, and no secret information is disclosed.

MI6 decides they need to determine the contents of a secret book in the library: “My New Plan for Taking Over the World: This Time It’ll Really Work”, by Ernst Blofeld. Unfortunately, their last agent in SPECTRE met a rather sticky end. What can they do?

First, MI6 sends in a brave low-level agent who asks for the secret book. The agent quickly looks at the first letter in the book, which happens to be “I”, and then asks the retrieval clerk for Volume 9 of the Encyclopedia Britannica, which contains the entries for “I”. The retrieval clerk fetches it and hands it over. At that point, the security clerk finishes checking, determines that the MI6 agent is not permitted to read Blofeld’s book, and summons the color-coded guards to take the agent off to the shark tank. The retrieval clerk collects both books and, importantly, puts them on the recently-retrieved shelf. Therefore, the presence or absence of a non-secret book on the recently-retrieved shelf depends on the contents of the secret book. To figure out (part of) these contents, all that is required is a way to tell what’s on the recently-retrieved shelf.

Bond then saunters into the library, and suavely asks the retrieval clerk for Volume 1 of the Encyclopedia Britannica. Q has equipped him with a special watch for the mission; it has the amazing ability to… act as a stopwatch. He times how long it takes for the retrieval clerk to retrieve Volume 1. The security clerk is untroubled, since the Encyclopedia Britannica isn’t on the secret list. Bond then asks for Volume 2, then Volume 3, and so on, all the way to Volume 26.

Bond’s stopwatch eventually reveals that retrieving Volume 9 was much faster than retrieving any other volume. He leaves the library and heads down to the hotel bar for a martini, knowing that the first character in the book is “I”. By repeating the trick for the second letter, the third, and so on, MI6 can eventually access the contents of the entire book, although it will be a slow and costly process.

What’s this got to do with the Spectre and Meltdown vulnerabilities? In short:

  • Bond and the low-level MI6 agent are programs running on the target system that a hacker has either written or is manipulating.
  • The library archives are the computer’s memory.
  • The secret book is information in the computer’s memory that the “MI6 agent programs” shouldn’t have access to.
  • The recently-retrieved shelf is “cache”, special fast memory used to speed up access to recently used data stored in the main memory.
  • The security clerk is the part of the processor (and/or software) that checks whether a program is permitted to access a particular part of memory.

The basic problems exploited in both vulnerabilities (though the details differ):

  • In modern CPUs, one part of the CPU sometimes continues executing a program while another part of the CPU is still determining whether those actions should be allowed.
  • If operations happen that, it turns out, should not have been allowed, the CPU attempts to clean things up as if they had never happened.
  • Unfortunately, not all the evidence of the non-permitted operations is cleaned up. Specifically, the contents of the memory cache can be altered in predictable ways, and those alterations are not undone.
  • The state of the memory cache, while not directly accessible by user programs, can be inferred by measuring the time taken to access different parts of memory.
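
For readers who do program, the library caper can be sketched as a toy simulation. This is only a model of the analogy, not real exploit code: the class and function names (Library, doomed_request, recover_letter and so on) are made up for illustration, the "costs" are invented numbers rather than real timings, and the secret is assumed to contain only the capital letters A to Z. It also clears the shelf explicitly before each attempt, which is a simplification of the shelf being small.

```python
# Toy model of the library analogy. All names and numbers here are
# hypothetical; nothing in this sketch touches real CPU caches.

SHELF_COST = 1      # assumed cost of fetching from the recently-retrieved shelf
ARCHIVE_COST = 100  # assumed cost of a walk to the main archive


class Library:
    def __init__(self, secret_text):
        self._secret_text = secret_text  # the secret book (A-Z only, by assumption)
        self._shelf = set()              # the recently-retrieved shelf (the "cache")

    def fetch_volume(self, volume):
        """Fetch a public encyclopedia volume; return the simulated cost."""
        cost = SHELF_COST if volume in self._shelf else ARCHIVE_COST
        self._shelf.add(volume)
        return cost

    def doomed_request(self, position):
        """The low-level agent's visit: peek at one letter, then get caught."""
        letter = self._secret_text[position]  # read before the check finishes
        volume = ord(letter) - ord("A") + 1   # "I" maps to Volume 9
        self.fetch_volume(volume)
        # The security clerk finishes here. Nothing is returned to the
        # caller, but the shelf is not tidied up.

    def clear_shelf(self):
        """Simplification: start each attempt with an empty shelf."""
        self._shelf.clear()


def recover_letter(library, position):
    """Bond's visit: time all 26 volumes and pick the fast one."""
    library.clear_shelf()
    library.doomed_request(position)
    costs = {v: library.fetch_volume(v) for v in range(1, 27)}
    fastest = min(costs, key=costs.get)
    return chr(ord("A") + fastest - 1)


def recover_text(library, length):
    """Repeat the trick once per letter of the secret book."""
    return "".join(recover_letter(library, i) for i in range(length))


if __name__ == "__main__":
    library = Library("ICEBERG")
    print(recover_text(library, 7))
```

Note that recover_letter never reads the secret directly. It only compares how long each public volume takes to fetch, which is exactly what Bond does with his stopwatch.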

Author’s Note: After writing this piece, I’ve seen this Twitter thread by Joe Fitz which made essentially the same analogy. I hadn’t seen his tweets before writing this piece. Like Meltdown and Spectre themselves, this was independent discovery :)

Cycling and politics tragic who teaches software engineering when not wasting time on the internets.