Mnemonics 101

A simple guide to mnemonics

Onbloc
Apr 26, 2022

TL;DR:

1. A mnemonic (a.k.a. seed phrase) is a randomly generated set of words used to derive private keys for crypto wallets.

2. Handcrafting a mnemonic is possible (find out how below), but it’s bad practice.

3. Software-generated 12-word mnemonics have 2¹²⁸ possible combinations, which would take a supercomputer approximately 790 trillion years to brute force completely.

Intro

A crypto wallet serves as a gateway to the world of blockchain. Every time a wallet is created, a 12- or 24-word mnemonic is generated, and the owner is encouraged to store it somewhere safe and secure. Forgetfulness can cost you dearly, as losing the seed phrase can lock you out of your wallet forever.

You might be wondering what exactly a mnemonic is, and why it has to be random. Well, read on and you’ll find out what secrets your mnemonic actually contains, how it’s generated, and a trick to keep it secure.

What are Mnemonics?

A common misconception about mnemonics is that they’re created by converting private keys into English words. In reality, it’s the other way around. A mnemonic encodes random data (“randomness”) generated by your wallet software, and that data is used to derive the private keys that hold your crypto. The key point here is that the derivation must be deterministic, meaning that any compatible wallet should be able to reconstruct the same private keys from a mnemonic generated by any other wallet. Hence, the mnemonic you generate on MetaMask can be used on Coinbase Wallet to recover your private key, and vice versa.
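To make “deterministic” concrete, here is a minimal sketch of the first derivation step that BIP-39 specifies: the mnemonic is stretched into a 64-byte seed with PBKDF2 (HMAC-SHA512, 2048 iterations, salt "mnemonic" plus an optional passphrase). The function name `mnemonic_to_seed` and the sample phrase are our own choices for illustration, and the later steps that turn the seed into actual private keys are not shown.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic into a 64-byte seed, following the BIP-39 recipe."""
    # BIP-39 normalizes both strings with Unicode NFKD before hashing.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # PBKDF2 with HMAC-SHA512, 2048 rounds, 64-byte output.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# Sample phrase used purely for illustration; never use it for real funds.
phrase = "abandon zoo abandon zoo abandon zoo abandon zoo abandon zoo abandon wrestle"
seed_a = mnemonic_to_seed(phrase)
seed_b = mnemonic_to_seed(phrase)
print(seed_a == seed_b)  # True: the same phrase always gives the same seed
print(len(seed_a))       # 64
```

Because nothing in this step depends on the device or the wallet brand, two wallets that follow the same standard start from an identical seed.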

How are Mnemonics Created?

Let’s take a look at how mnemonics are created.

*A few terms to learn before diving in*

Entropy: The randomness collected by an application for use in cryptography that requires random data.

SHA-256: The abbreviation for Secure Hash Algorithm 256-bit. SHA-256 converts data of any size into a 256-bit hash value that is irreversible and, for all practical purposes, unique. 5 traits of the SHA-256 algorithm are:

1. One-way: Data can’t be restored from its hash value.

2. Deterministic: Same input must yield the same hash value.

3. Fast Computation: Hashing must be done promptly.

4. Avalanche Effect: A small change in data must yield a completely different hash value.

5. Low Collision Probability: The probability of different inputs yielding the same hash value must be astronomically low.
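A couple of these traits are easy to observe with Python’s standard `hashlib` module. The sketch below hashes the same input twice (deterministic) and then hashes an input whose last byte differs by a single bit (avalanche effect). The helper `differing_bits` and the sample inputs are our own, chosen only for illustration.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


first = hashlib.sha256(b"seed phrase").digest()
second = hashlib.sha256(b"seed phrase").digest()
# "e" (0x65) and "d" (0x64) differ by exactly one bit.
tweaked = hashlib.sha256(b"seed phrasd").digest()

print(len(first) * 8)                  # 256: the output is always 256 bits
print(first == second)                 # True: same input, same hash
print(differing_bits(first, tweaked))  # typically close to half of the 256 bits
```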

BIP-39: A Bitcoin Improvement Proposal that introduced mnemonic seed phrases.

Source: https://github.com/ethereumbook/ethereumbook
  1. Upon creating your wallet, the first thing your wallet software does is randomly generate 128 bits of entropy. This is basically a 128-digit string of 0s and 1s.
  2. Your 128-bit entropy is hashed using the SHA-256 hash function.
  3. The first 4 bits of the resulting hash are appended to the original entropy. These 4 bits are called the checksum. The data is now 132 bits long.
  4. The 132-bit data is split into 12 segments of 11 bits each.
  5. Each segment is mapped to a word in the English word list in the BIP-39 GitHub repository.
  6. You end up with 12 words: this is your mnemonic!
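The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a wallet implementation: the function name `entropy_to_indices` is our own, it only handles the 128-bit case, and it stops at the 12 word-list indices instead of embedding the 2048-word list.

```python
import hashlib
import secrets


def entropy_to_indices(entropy: bytes) -> list[int]:
    """Turn 128 bits of entropy into twelve 11-bit word-list indices."""
    if len(entropy) != 16:
        raise ValueError("this sketch only handles 128-bit (16-byte) entropy")
    # Steps 2-3: hash the entropy and keep the first 4 bits as the checksum.
    checksum = hashlib.sha256(entropy).digest()[0] >> 4
    # Append the checksum to get a 132-bit number.
    combined = (int.from_bytes(entropy, "big") << 4) | checksum
    # Step 4: split into 12 segments of 11 bits, most significant first.
    return [(combined >> (11 * (11 - i))) & 0x7FF for i in range(12)]


# Step 1: let the operating system's secure random generator supply the entropy.
indices = entropy_to_indices(secrets.token_bytes(16))
print(indices)  # 12 numbers between 0 and 2047; step 5 looks each one up in the word list
```

Each index selects one of the 2048 words, which is why 11 bits per word is exactly enough (2¹¹ = 2048).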

Seed Phrase DIY

You might be thinking: wait, does that mean I can generate my own seed phrase by handcrafting 128 bits of 0s and 1s? You’re right! It’s possible to generate your own mnemonic.

Let’s try building one ourselves:

1. First, let’s manually create 128 bits of entropy. For the sake of convenience, the first 121 bits will alternate between blocks of eleven 0s and eleven 1s, and the last 7 bits will be 1s.

2. Next, we’ll hash our entropy using a hashing tool (SHA-256 is a hash function, not encryption). Be sure to set both the input and the output to binary, and the hash function to SHA-256.

3. Let’s append the first 4 bits of the output (the checksum) to the original entropy.

4. Time to map our 132-bit data using the BIP-39 word list. Let’s split our data into 12 groups of 11 bits each, and convert each group into a decimal number.

5. The groups come out as 0, 2047, and 2034, so we need the 1st (index 0), the 2048th (index 2047), and the 2035th (index 2034) words to convert our data into a seed phrase.

6. There we have it! Our mnemonic is:

abandon zoo abandon zoo abandon zoo abandon zoo abandon zoo abandon wrestle

7. Let’s try importing our seed phrase into MetaMask to check if it works.

8. Done! We’ve successfully handcrafted our own mnemonic!
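If you’d rather not rely on a web tool, the same walkthrough can be reproduced with a short script. This is a sketch under the assumptions of the steps above: the bit pattern is the one described in step 1, and the hash is taken over the 16 raw bytes rather than over the text of 0s and 1s. The variable names are our own. If the script matches the walkthrough, the list it prints should alternate between 0 and 2047 and end with 2034.

```python
import hashlib

# Step 1: five repetitions of (eleven 0s + eleven 1s), eleven more 0s, then seven 1s.
entropy_bits = ("0" * 11 + "1" * 11) * 5 + "0" * 11 + "1" * 7
assert len(entropy_bits) == 128

# Step 2: hash the 16 raw bytes (binary input), not the text of 0s and 1s.
entropy_bytes = int(entropy_bits, 2).to_bytes(16, "big")
digest = hashlib.sha256(entropy_bytes).digest()

# Step 3: the checksum is the first 4 bits of the digest.
checksum_bits = format(digest[0] >> 4, "04b")
full_bits = entropy_bits + checksum_bits

# Step 4: twelve 11-bit groups, each read as a decimal index.
indices = [int(full_bits[i:i + 11], 2) for i in range(0, 132, 11)]
print(indices)
```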

Why You Shouldn’t Be Building Your Own Mnemonic

Okay, now that we’ve learned how to create our own mnemonic, let’s choose some words that we can actually remember, convert them into binary, get the checksum, and generate a memorable seed phrase!

No. This is a terrible idea, as a mnemonic you handcraft is likely to be “not random enough”. The security of crypto wallets relies on the astronomically low probability of two randomly generated entropies being the same.

As you might have guessed, brute forcing a crypto wallet is “theoretically” possible. In cryptography, brute forcing means that an attacker repeatedly feeds candidate inputs into the hash function until one produces the desired hash value.

So, how secure is your “randomly” generated wallet?

Let’s assume that you’re trying to find the exact entropy that leads to a specific wallet. Since the entropy is 128 bits long, there are 2¹²⁸ (= 340,282,366,920,938,463,463,374,607,431,768,211,456) possible combinations. According to Steven Alexander, a forensic expert, the supercomputer of the U.S. National Security Agency can brute force approximately 2⁷⁰ keys per day. At that rate, it would take 2¹²⁸ / 2⁷⁰ = 2⁵⁸ days, or around 790 trillion years (note that the Sun is 4.6 billion years old), to try every combination and be 100% sure that you’ve gained access to the exact wallet that you’re looking for.
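The arithmetic behind that estimate is easy to check. The sketch below simply divides the number of combinations by the quoted rate of 2⁷⁰ keys per day; the constant names are our own, and a year is taken as 365.25 days.

```python
KEYSPACE = 2 ** 128      # possible 128-bit entropies
KEYS_PER_DAY = 2 ** 70   # brute-force rate quoted above
DAYS_PER_YEAR = 365.25
SUN_AGE_YEARS = 4.6e9

days = KEYSPACE // KEYS_PER_DAY   # exactly 2**58 days
years = days / DAYS_PER_YEAR      # roughly 7.9e14, i.e. about 790 trillion

print(f"{years:.2e} years")
print(f"about {years / SUN_AGE_YEARS:,.0f} times the age of the Sun")
```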

A Small Tip on Storing Your Mnemonic

Although the best practice is to store your mnemonic somewhere offline, many people store it in cloud storage. Here’s a simple trick that could keep your funds safe even if your cloud storage gets compromised: mix up a few words of your mnemonic!

For example, let’s say that your seed phrase is:

crime festival orange sort host innocent bright angle release fan valley remain

Instead of saving it as it is, switch the order of the first two words and the last two words. The result would be:

festival crime orange sort host innocent bright angle release fan remain valley

Unfortunately for the hacker, the mnemonic above would result in an invalid secret recovery phrase. This adds an additional layer of security to your seed phrase. Just be sure to remember how you switched them!
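The swap described above is simple enough to express as a tiny helper, shown here as a sketch with a function name (`swap_ends`) of our own choosing. One caveat worth keeping in mind: a 12-word phrase carries only a 4-bit checksum, so a rearranged phrase will usually, but not always, be rejected as invalid. When it does pass, it opens a different wallet rather than yours.

```python
def swap_ends(phrase: str) -> str:
    """Swap the first two words and the last two words of a phrase."""
    words = phrase.split()
    if len(words) < 4:
        raise ValueError("need at least four words")
    words[0], words[1] = words[1], words[0]
    words[-1], words[-2] = words[-2], words[-1]
    return " ".join(words)


original = "crime festival orange sort host innocent bright angle release fan valley remain"
stored = swap_ends(original)
print(stored)
print(swap_ends(stored) == original)  # True: applying the same swap again restores the phrase
```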

Conclusion

The Motivation section of BIP-39 clearly states: “This guide is meant to be a way to transport computer-generated randomness with a human-readable transcription. It’s not a way to process user-created sentences (also known as brainwallets) into a wallet seed.” Despite the convenience, a handcrafted mnemonic built from data with patterns or sequences, such as the one we created in our DIY above, should never be used, as it’s likely to get exposed.

Rest assured, as long as you leave the entropy generation to the software, your wallet is safe, thanks to the art of cryptography.

We hope this guide gave you a better understanding of the safety and the logic behind seed phrases.

Onbloc

A blockchain software development firm based in Seoul