Hash-based digital signatures (almost) from scratch

Georg Wiese
8 min read · Nov 5, 2022


Digital signatures are used everywhere on the internet and beyond: to protect your browser traffic via TLS certificates, to authenticate your emails, or to secure your bitcoins. They come in many flavors, but most schemes rely on advanced cryptographic machinery such as elliptic curve cryptography.

When reading Chapter 14 of “A Graduate Course in Applied Cryptography” by Dan Boneh and Victor Shoup, which covers hash-based signatures, I was fascinated by the fact that digital signature schemes can be built from just a single cryptographic primitive: a hash function.

To me, this simplicity was conceptually appealing. But there is also a very real advantage when it comes to security against quantum computers: while most currently deployed schemes would be broken by a large enough quantum computer running Shor’s algorithm, no generic quantum attacks better than Grover’s algorithm are known against hash functions. As long as we can build a quantum-secure hash function, we can plug it into a hash-based signature scheme and prove security.

So, with this motivation, let’s dive into some of the concepts that make this work. Toward the end of this post, I’ll point you to my GitHub project of a working Rust implementation and a web app to verify signatures generated with it!

Hash-based one-time signatures

A hash function is a type of one-way function: you can take a random input x and efficiently map it to a (seemingly random) output y = h(x), but inverting this function is assumed to be infeasible.

Now, let’s build a one-time signature scheme that allows you to sign a one-bit message:

  • Choose two random inputs, x₀ and x₁. These make up our secret key.
  • Publish h(x₀) and h(x₁). This is our public key.
  • To sign bit 0, publish x₀; to sign bit 1, publish x₁.

A verifier can easily apply h to the signature and verify that it hashes to the right element of the public key. At the same time, forging a signature on the other bit without knowing the secret key would involve inverting h, which we assume to be infeasible.
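The one-bit scheme above fits in a few lines of Rust. This is a toy sketch, not the author's implementation: `h` uses Rust's non-cryptographic `DefaultHasher` as a stand-in for a real hash function, and the names `keygen`, `sign`, and `verify` are made up for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for a cryptographic hash function; NOT secure,
// used only to keep the sketch dependency-free.
fn h(x: u64) -> u64 {
    let mut s = DefaultHasher::new();
    x.hash(&mut s);
    s.finish()
}

/// Secret key: two random preimages. Public key: their hashes.
fn keygen(x0: u64, x1: u64) -> ((u64, u64), (u64, u64)) {
    ((x0, x1), (h(x0), h(x1)))
}

/// Signing a bit reveals the corresponding preimage.
fn sign(sk: (u64, u64), bit: bool) -> u64 {
    if bit { sk.1 } else { sk.0 }
}

/// The verifier re-hashes the signature and compares it with the public key.
fn verify(pk: (u64, u64), bit: bool, sig: u64) -> bool {
    h(sig) == if bit { pk.1 } else { pk.0 }
}

fn main() {
    let (sk, pk) = keygen(0xdead_beef, 0xcafe_f00d); // would be random in practice
    let sig = sign(sk, true);
    assert!(verify(pk, true, sig));
    assert!(!verify(pk, false, sig)); // the signature does not work for the other bit
}
```

Note that the signature on bit 1 is useless for forging a signature on bit 0: that would require a preimage of h(x₀), which is exactly what the one-way property rules out.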

This scheme is trivially generalized to allow signing more bits: Just generate a new key pair for each bit. For example, to sign a 256-bit message, the public key would consist of 512 hash values (2 for each bit) and the signature would consist of 256 inputs (one for each bit). Being able to sign a 256-bit message allows us to sign a message of any length because we can always hash a longer message to 256 bits and sign the hash.
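The generalization is a loop over the message bits. Again a toy sketch under the same assumptions (non-cryptographic `DefaultHasher` as the hash, secrets derived from a seed for compactness); here `N` is 8 bits instead of 256 to keep the demo small.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy hash stand-in (NOT cryptographically secure).
fn h(x: u64) -> u64 { let mut s = DefaultHasher::new(); x.hash(&mut s); s.finish() }

const N: usize = 8; // 8-bit messages for brevity; the scheme scales to 256 bits

/// One secret pair per message bit, derived from a seed for compactness.
fn keygen(seed: u64) -> (Vec<(u64, u64)>, Vec<(u64, u64)>) {
    let sk: Vec<(u64, u64)> = (0..N as u64)
        .map(|i| (h(seed.wrapping_add(2 * i)), h(seed.wrapping_add(2 * i + 1))))
        .collect();
    let pk = sk.iter().map(|&(x0, x1)| (h(x0), h(x1))).collect();
    (sk, pk)
}

/// Reveal the preimage matching each bit of the message.
fn sign(sk: &[(u64, u64)], msg: u8) -> Vec<u64> {
    (0..N).map(|i| if (msg >> i) & 1 == 1 { sk[i].1 } else { sk[i].0 }).collect()
}

/// Re-hash every revealed preimage and compare with the public key.
fn verify(pk: &[(u64, u64)], msg: u8, sig: &[u64]) -> bool {
    (0..N).all(|i| h(sig[i]) == if (msg >> i) & 1 == 1 { pk[i].1 } else { pk[i].0 })
}

fn main() {
    let (sk, pk) = keygen(42);
    let msg = 0b1011_0010u8;
    let sig = sign(&sk, msg);
    assert!(verify(&pk, msg, &sig));
    assert!(!verify(&pk, msg ^ 1, &sig)); // any flipped bit invalidates the signature
}
```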

This scheme was invented by Leslie Lamport in 1979 and its core idea has since been developed a lot further. I don’t need to go into detail here, because there is an excellent blog post on the subject:

Of the schemes described there, I’m using the Winternitz signature scheme. It has a parameter to trade off signature/key size and signing/verification time. In the configuration I implemented as the default, both keys and signatures consist of 76 hash values to sign a 256-bit message — quite a bit smaller than the original Lamport scheme!

However, assuming 256-bit hashes, a single one-time signature is already 76 × 256 ÷ 8 bytes ≈ 2.4 KB in size! So when building our signature scheme, we’ll want to use them as rarely as possible to keep signatures small.
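The core trick behind Winternitz is the hash chain: to sign a base-w digit d, walk d steps along a chain of repeated hashes whose endpoint is the public key. The sketch below shows a single chain under the same toy-hash assumption as before; a real Winternitz scheme runs many chains in parallel and adds checksum digits, which this sketch omits.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy hash stand-in (NOT cryptographically secure).
fn h(x: u64) -> u64 { let mut s = DefaultHasher::new(); x.hash(&mut s); s.finish() }

/// Apply the hash function n times.
fn h_iter(mut x: u64, n: u32) -> u64 {
    for _ in 0..n { x = h(x); }
    x
}

const W: u32 = 16; // Winternitz parameter: one chain signs one base-16 digit

/// The public key is the end of a length-(W-1) hash chain over the secret.
fn keygen(x: u64) -> (u64, u64) { (x, h_iter(x, W - 1)) }

/// To sign digit d, walk d steps along the chain.
fn sign(sk: u64, d: u32) -> u64 { h_iter(sk, d) }

/// The verifier walks the remaining W-1-d steps and compares.
fn verify(pk: u64, d: u32, sig: u64) -> bool { h_iter(sig, W - 1 - d) == pk }

fn main() {
    let (sk, pk) = keygen(0x1234_5678);
    let sig = sign(sk, 9);
    assert!(verify(pk, 9, sig));
    // Caveat: from this signature anyone can derive signatures on larger
    // digits by hashing further, which is why real Winternitz appends
    // checksum digits that would decrease when a message digit increases.
    assert!(verify(pk, 10, h(sig)));
}
```

The parameter trade-off mentioned above is visible here: a larger W means fewer chains (smaller keys and signatures) but longer chains (more hashing per signature and verification).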

From one-time signatures to stateless many-time signatures

A one-time signature scheme by itself is not very useful, of course. We’d like to use a given signing key as many times as we want.

The key idea to achieve that is to (conceptually) generate a massive number of one-time key pairs (say 2¹²⁸), and use a random one for each signature. The number of key pairs needs to be so large that it is extremely unlikely you’ll choose the same pair twice during the key’s lifetime.

To make this idea practical, we need to solve two problems:

  1. We don’t want to distribute 2¹²⁸ public keys to our friends when they only need one to verify the signature.
  2. We don’t even want to generate these 2¹²⁸ key pairs ourselves, which would take an infeasible amount of time.

Shrinking the public key size

The first problem can be solved using a Merkle tree.

Suppose for a moment that we don’t have 2¹²⁸ public keys, but just 8. Then, we can build a binary tree of hashes by repeatedly hashing neighboring elements, like this:

The public key we’ll actually distribute is the topmost hash, called the Merkle root. Now, suppose we used pk₅ (shown in green) to sign the message. We append to the signature:

  • pk₅ itself, as the verifier will need it to actually verify the one-time signature.
  • All fields shown in yellow. These allow the verifier to re-compute the public key they already know, which proves that pk₅ belongs to the set of keys that was used to compute the public key. This is called a Merkle proof.
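The Merkle construction above can be sketched as follows, again with the toy `DefaultHasher` stand-in and hypothetical function names. The proof for a leaf is simply the list of sibling hashes on its path to the root:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy two-input hash stand-in (NOT cryptographically secure).
fn h2(a: u64, b: u64) -> u64 {
    let mut s = DefaultHasher::new();
    (a, b).hash(&mut s);
    s.finish()
}

/// Merkle root of a power-of-two number of leaves.
fn merkle_root(leaves: &[u64]) -> u64 {
    if leaves.len() == 1 { return leaves[0]; }
    let next: Vec<u64> = leaves.chunks(2).map(|p| h2(p[0], p[1])).collect();
    merkle_root(&next)
}

/// Sibling hashes on the path from leaf `i` to the root.
fn merkle_proof(leaves: &[u64], mut i: usize) -> Vec<u64> {
    let mut proof = Vec::new();
    let mut level = leaves.to_vec();
    while level.len() > 1 {
        proof.push(level[i ^ 1]); // sibling of the current node
        level = level.chunks(2).map(|p| h2(p[0], p[1])).collect();
        i /= 2;
    }
    proof
}

/// Recompute the root from a leaf and its Merkle proof.
fn verify_proof(root: u64, mut lf: u64, mut i: usize, proof: &[u64]) -> bool {
    for &sib in proof {
        lf = if i % 2 == 0 { h2(lf, sib) } else { h2(sib, lf) };
        i /= 2;
    }
    lf == root
}

fn main() {
    let leaves: Vec<u64> = (0..8u64).collect(); // stand-ins for 8 one-time public keys
    let root = merkle_root(&leaves);
    let proof = merkle_proof(&leaves, 5); // prove membership of pk₅
    assert!(verify_proof(root, leaves[5], 5, &proof));
    assert!(!verify_proof(root, leaves[4], 5, &proof)); // wrong leaf fails
}
```

For 8 leaves the proof is only 3 hashes; in general it grows logarithmically with the number of leaves, which is what makes a single-root public key practical.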

Reducing the number of keys that need to be generated

Applying the above idea to all 2¹²⁸ key pairs would mean that we’d have to actually generate 2¹²⁸ key pairs in order to compute the root hash. This would be way too expensive, of course.

Instead, assume we use the Merkle tree construction described above to just compress 16 one-time keys. When signing a message, we can pick one at random, pseudo-randomly generate another set of 16 one-time key pairs, compute another Merkle root, and sign this Merkle root using the first key pair. Iterating this approach, we build a chain of “intermediate” one-time key pairs. The last one in the chain is used to sign the actual message.

You can think of it as selecting a random leaf in a tree where each node has 16 children. This figure illustrates a chain with just one intermediate key:

This tree can quickly become very large: For example, if we make it 32 levels deep, the number of leaves is 16³² = 2¹²⁸! But in the process, we only had to generate 16 × 32 = 512 one-time key pairs, which can be done in a fraction of a second.

I want to stress that when generating another set of one-time key pairs, it is important that this happens in a pseudo-random way: If we happen to choose an intermediate node multiple times, we must generate the same child key pairs (and hence obtain the same Merkle root). Otherwise, we would sign two different messages using the one-time key pair, which would break security!
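One simple way to get this determinism is to derive every child seed from the node's own seed, e.g. by hashing the seed together with the child index. A minimal sketch of that idea, with the usual toy-hash caveat and a made-up `child_seeds` name:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy two-input hash stand-in (NOT cryptographically secure).
fn h2(a: u64, b: u64) -> u64 {
    let mut s = DefaultHasher::new();
    (a, b).hash(&mut s);
    s.finish()
}

/// Derive the 16 child seeds of a node deterministically from the node's
/// own seed. Revisiting the node later regenerates the exact same seeds,
/// and hence the same one-time key pairs and the same Merkle root.
fn child_seeds(node_seed: u64) -> Vec<u64> {
    (0..16u64).map(|i| h2(node_seed, i)).collect()
}

fn main() {
    let first_visit = child_seeds(0xabcd);
    let second_visit = child_seeds(0xabcd);
    assert_eq!(first_visit, second_visit); // same node, same child keys
    assert_ne!(first_visit, child_seeds(0xabce)); // different node, different keys
}
```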

The full algorithm

In summary, key generation, signing, and verification work as follows:

Key Generation:

  1. Generate 16 one-time signature key pairs
  2. Compute a Merkle root hash of the 16 one-time public keys
  3. Return the Merkle root hash as the public key, and the 16 one-time secret keys as the secret key (or a seed that generates them)

Signing:

  1. Pick a random one-time key pair, and compute a Merkle proof that convinces the verifier that the chosen public key was actually included in the given Merkle root
  2. Repeat the key generation algorithm to obtain another set of 16 one-time key pairs and the respective Merkle root
  3. Sign the Merkle root using the one-time key pair picked in step 1
  4. Repeat steps 1–3 32 times, obtaining 32 one-time public keys, signatures, and Merkle roots
  5. Finally, use the last sampled one-time key pair to sign the actual message

The following illustrates the signing process:

Verification:

  1. Validate all Merkle proofs, using the public key as the first Merkle root
  2. Validate all signatures
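The whole algorithm can be condensed into a miniature end-to-end sketch. Everything here is a toy: the hash is Rust's non-cryptographic `DefaultHasher`, the Lamport one-time scheme signs 64-bit values instead of 256-bit hashes, each node has 4 children instead of 16, the path is fixed instead of random, and all names and the seed-derivation scheme are invented for this sketch.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy hash stand-ins (NOT cryptographically secure).
fn h(x: u64) -> u64 { let mut s = DefaultHasher::new(); x.hash(&mut s); s.finish() }
fn h2(a: u64, b: u64) -> u64 { let mut s = DefaultHasher::new(); (a, b).hash(&mut s); s.finish() }

// ---- Lamport one-time signatures over 64-bit values ----
fn ots_sk(seed: u64) -> Vec<(u64, u64)> {
    (0..64u64).map(|i| (h2(seed, 2 * i), h2(seed, 2 * i + 1))).collect()
}
fn ots_pk(sk: &[(u64, u64)]) -> Vec<(u64, u64)> {
    sk.iter().map(|&(a, b)| (h(a), h(b))).collect()
}
fn ots_sign(sk: &[(u64, u64)], msg: u64) -> Vec<u64> {
    (0..64).map(|i| if (msg >> i) & 1 == 1 { sk[i].1 } else { sk[i].0 }).collect()
}
fn ots_verify(pk: &[(u64, u64)], msg: u64, sig: &[u64]) -> bool {
    (0..64).all(|i| h(sig[i]) == if (msg >> i) & 1 == 1 { pk[i].1 } else { pk[i].0 })
}
/// Compress a one-time public key into a single Merkle leaf.
fn leaf(pk: &[(u64, u64)]) -> u64 {
    pk.iter().fold(0, |acc, &(a, b)| h2(acc, h2(a, b)))
}

// ---- A node: 4 one-time key pairs compressed under one Merkle root ----
fn ots_seeds(node_seed: u64) -> Vec<u64> { (0..4u64).map(|i| h2(node_seed, i)).collect() }
fn child_seed(node_seed: u64, i: usize) -> u64 { h2(node_seed, 1000 + i as u64) } // crude domain separation
fn node_leaves(node_seed: u64) -> Vec<u64> {
    ots_seeds(node_seed).iter().map(|&s| leaf(&ots_pk(&ots_sk(s)))).collect()
}
fn node_root(node_seed: u64) -> u64 {
    let l = node_leaves(node_seed);
    h2(h2(l[0], l[1]), h2(l[2], l[3]))
}
fn merkle_proof(l: &[u64], i: usize) -> (u64, u64) {
    let other = if i < 2 { h2(l[2], l[3]) } else { h2(l[0], l[1]) };
    (l[i ^ 1], other)
}
fn verify_leaf(root: u64, lf: u64, i: usize, proof: (u64, u64)) -> bool {
    let pair = if i % 2 == 0 { h2(lf, proof.0) } else { h2(proof.0, lf) };
    root == if i < 2 { h2(pair, proof.1) } else { h2(proof.1, pair) }
}

// ---- The chain: each level's chosen key signs the next level's root ----
struct LevelSig { pk: Vec<(u64, u64)>, index: usize, proof: (u64, u64), next_root: u64, sig: Vec<u64> }

fn sign_chain(root_seed: u64, path: &[usize], msg: u64) -> Vec<LevelSig> {
    let mut seed = root_seed;
    path.iter().enumerate().map(|(depth, &i)| {
        let sk = ots_sk(ots_seeds(seed)[i]);
        let leaves = node_leaves(seed);
        let next = child_seed(seed, i);
        let last = depth + 1 == path.len();
        let next_root = if last { 0 } else { node_root(next) };
        let target = if last { msg } else { next_root }; // last key signs the message itself
        let ls = LevelSig {
            pk: ots_pk(&sk), index: i, proof: merkle_proof(&leaves, i),
            next_root, sig: ots_sign(&sk, target),
        };
        seed = next;
        ls
    }).collect()
}

fn verify_chain(public_key: u64, sigs: &[LevelSig], msg: u64) -> bool {
    let mut root = public_key;
    for (k, ls) in sigs.iter().enumerate() {
        let target = if k + 1 == sigs.len() { msg } else { ls.next_root };
        // 1. The one-time pk must belong to the current Merkle root.
        // 2. It must correctly sign the next root (or, at the end, the message).
        if !verify_leaf(root, leaf(&ls.pk), ls.index, ls.proof)
            || !ots_verify(&ls.pk, target, &ls.sig) { return false; }
        root = ls.next_root;
    }
    true
}

fn main() {
    let root_seed = 0xfeed_beef; // the secret key (a seed is enough)
    let public_key = node_root(root_seed); // the published Merkle root
    let msg = 0xdead_c0de;
    let sigs = sign_chain(root_seed, &[2, 0, 3], msg); // path would be random in practice
    assert!(verify_chain(public_key, &sigs, msg));
    assert!(!verify_chain(public_key, &sigs, msg + 1));
}
```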

A Rust implementation

Having learned Rust recently, I decided to put the theory into practice and hack together a command-line tool plus a web app (for signature verification) that implement a signature scheme based solely on hash functions.

Check out my GitHub project:

Now, I want to stress that this is a toy project. The code has not even been reviewed, and even for teams writing security software, it is easy to introduce bugs that completely break security.

Still, it’s fun to play around with it! In the repository, you can find an example where I signed this file, obtaining this signature. You can verify yourself that this was done using the following public key:

9e2543961faafa9a021752ad7598170472e688988ad1fa66a33dc65945385194

The easiest way to do that is via the signature verification web app, which is a thin wrapper around a WebAssembly build of the code!

Room for improvement

The ideas presented in this post are not just of academic interest: just this year, NIST selected SPHINCS+ to be part of its post-quantum cryptographic standard. SPHINCS+ is, as the name suggests, the next iteration of SPHINCS, which builds on all of the ideas presented in this blog post. I highly recommend reading the paper!

The main challenge when designing hash-based signatures is reducing the signature size: the signature above is 119 KB — too large for many applications. SPHINCS replaces the one-time signature scheme with a few-times signature scheme. As a result, it’s fine if leaves in our tree are sampled more than once, and we can reduce the size of the tree to make the signatures smaller!

Another thing to optimize is speed. On my machine, signing a file takes around 300 ms, and that is already with a multi-threaded implementation. In the SPHINCS paper, the authors go several steps further and use Intel’s AVX instruction set to speed up computation.

Summary

This post hopefully gave you some intuition about how hash-based signatures work: We first construct a one-time signature scheme, relying on the one-way property of the hash function. Then, we essentially build a massive tree of one-time signature key pairs and sample a random leaf to sign a message.

The linked Rust implementation shows that using these simple ideas, we can build a practical signing tool. I hope you enjoy playing around with it! Can you forge a signature?

Acknowledgments

I’d like to thank Valentin Pinkau for starting off this project together and for his helpful feedback on this post :)
