An extended public key isn’t just any old public key. It needs to be handled with care. It is safer to assume that this sequence of bytes has been leaked. If it has, how can we continue generating private keys in a safer way? Hardened derivation to the rescue.
Hierarchical Deterministic (HD) Wallets standardise the way we can think about a set of ECDSA keys. Striving for a more user-centric experience, BIP32 tries to change the way software wallets handle key generation, thereby eliminating the need for backing up pools of keys that are created over time.
From a user’s perspective, they should only have to remember one 12-word mnemonic, often referred to as a seed phrase (BIP39). All BIP39-compliant wallets should know how to process this seed phrase, a feature intended to encourage interoperability between wallet implementations.
The aim of this article is not to introduce the basics of HD Wallets, but to dive right in and explain why hardened derivation matters. Before we can explore this question, we need to understand derivation at a higher level.
What is derivation?
For simplicity, we can picture it as a mechanism used to generate a specific ECDSA key-pair within a tree of keys. Given a parent extended key and an array of indexes, you will be able to deterministically regenerate keys at the specified location in the tree.
Each node in the tree can have up to 4,294,967,296 child nodes (2³², the range of a 32-bit unsigned integer index). The derivation scheme itself places no limit on the depth of the tree, although the BIP32 serialisation format stores the depth in a single byte. The result is a desirable ability to generate a practically unbounded number of keys from a single master seed (this is just an array of random bytes of a predefined length, see master key generation).
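To make the "array of indexes" idea concrete, here is a minimal sketch of how a derivation path string could be turned into that array. It assumes the commonly used path notation (for example `m/44'/0'/0'/0/5`) and the BIP32 convention that indexes from 2³¹ upwards denote hardened children, so the 2³² index range splits into 2³¹ normal and 2³¹ hardened children. The names `parse_path` and `HARDENED_OFFSET` are made up for this illustration.

```python
HARDENED_OFFSET = 2**31  # BIP32: indexes >= 2^31 denote hardened children


def parse_path(path):
    """Turn a path such as m/44'/0'/0'/0/5 into a list of 32-bit indexes."""
    parts = path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start at the master key 'm'")
    indexes = []
    for part in parts[1:]:
        # A trailing ', h or H marks a hardened child.
        hardened = part.endswith(("'", "h", "H"))
        number = int(part.rstrip("'hH"))
        if not 0 <= number < HARDENED_OFFSET:
            raise ValueError("index out of range: " + part)
        indexes.append(number + HARDENED_OFFSET if hardened else number)
    return indexes


print(parse_path("m/44'/0'/0'/0/5"))
# [2147483692, 2147483648, 2147483648, 0, 5]
```

Each entry in the returned list is the index passed to one round of child key derivation, starting from the master key.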
As a primer, let’s introduce the creation of an extended private key, as it is an important input to each derivation function. This test vector (Test Vector 1) is taken from the original BIP32 proposal. We’re only going to focus on two fields that make up an extended key: the chain code and the key data (the actual public or private key bytes). If you’re interested in the full serialisation format, please see BIP32 serialisation format.
In the meantime, I’ve cobbled together some code that demonstrates the creation of the BIP32 test vector 1 extended keys:
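A minimal sketch of that step in Python might look like the following. It is not the author’s original listing. It follows the BIP32 master key generation rule (HMAC-SHA512 keyed with the string "Bitcoin seed") and uses the Test Vector 1 seed; the function name `master_key_from_seed` is an assumption for this example, and only the private key and chain code are produced, not the serialised `xprv` string.

```python
import hashlib
import hmac

# Order of the secp256k1 group; a valid private key lies in [1, N - 1].
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def master_key_from_seed(seed):
    """Split HMAC-SHA512(key="Bitcoin seed", data=seed) into key and chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    private_key, chain_code = digest[:32], digest[32:]
    k = int.from_bytes(private_key, "big")
    if k == 0 or k >= N:
        raise ValueError("seed produces an invalid master key; use another seed")
    return private_key, chain_code


# Seed from BIP32 Test Vector 1.
seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
master_private_key, master_chain_code = master_key_from_seed(seed)
print("private key:", master_private_key.hex())
print("chain code: ", master_chain_code.hex())
```

The left 32 bytes become the master private key and the right 32 bytes become the master chain code, which are the two fields discussed above.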
Now that we know how a seed is transformed into an extended key, we can ask the question: how do we derive child keys given our master extended keys?
The answer lies in the conveniently named Child Key Derivation Function.
How does the Child Key Derivation (CKD) Function work?
A concise and heavily notated description of this function has already been defined on the related BIP. However, instead of pulling your hair out, I’ve saved you the headache and extracted the important parts (without compromising on detail, see Figure 2).
There are three valid permutations for the child key derivation function. The fourth is impossible. This is because it is assumed that finding the discrete logarithm of a random elliptic curve element with respect to a publicly known base point is infeasible.
- Private parent key → private child key (blue and yellow lines, Figure 2)
- Public parent key → public child key (green line, Figure 2)
- Private parent key → public child key (green line, Figure 2)
- Public parent key → private child key (impossible, red line, Figure 2)
Every key that is created through the CKD function will still need to be correctly serialised to become an extended key. A private key and its corresponding public key, at the same position in the tree, share the same chain code.
This is it, the child key derivation function. Exactly how the original BIP described it.
It’s worth noting that a single iteration of this flow chart constitutes one level of depth in the tree of keys that we mentioned earlier. When we hit any “end” point of the chart, we can recursively use the resultant values to go deeper by beginning again at the “start” point. Please take a minute to study the flow chart before moving on. It’s used as a point of reference in the next section.
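For readers who prefer code to flow charts, one round of the child key derivation function can be sketched as follows. This is an illustrative, unoptimised Python version written against the BIP32 definitions, not the author’s code: the elliptic curve arithmetic is plain (and not constant-time), and the names `point_add`, `point_mul`, `ser_p`, `ckd_priv` and `ckd_pub` are assumptions for this example.

```python
import hashlib
import hmac

# secp256k1 parameters: field prime, group order and base point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)
HARDENED = 2**31


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(k, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def ser_p(pt):
    """Compressed SEC encoding of a public key point (33 bytes)."""
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")


def ckd_priv(k_par, c_par, i):
    """Private parent -> private child (hardened or non-hardened)."""
    if i >= HARDENED:  # hardened: the HMAC input contains the private key
        data = b"\x00" + k_par.to_bytes(32, "big") + i.to_bytes(4, "big")
    else:              # non-hardened: the HMAC input contains the public key
        data = ser_p(point_mul(k_par, G)) + i.to_bytes(4, "big")
    digest = hmac.new(c_par, data, hashlib.sha512).digest()
    left = int.from_bytes(digest[:32], "big")
    k_child = (left + k_par) % N
    if left >= N or k_child == 0:
        raise ValueError("invalid child, try the next index")
    return k_child, digest[32:]


def ckd_pub(K_par, c_par, i):
    """Public parent -> public child (non-hardened only)."""
    if i >= HARDENED:
        raise ValueError("hardened children need the parent private key")
    digest = hmac.new(c_par, ser_p(K_par) + i.to_bytes(4, "big"), hashlib.sha512).digest()
    left = int.from_bytes(digest[:32], "big")
    K_child = point_add(point_mul(left, G), K_par)
    if left >= N or K_child is None:
        raise ValueError("invalid child, try the next index")
    return K_child, digest[32:]


# Master key from the BIP32 Test Vector 1 seed.
seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
k_m, c_m = int.from_bytes(digest[:32], "big"), digest[32:]

k_child, c_child = ckd_priv(k_m, c_m, 0)
K_child, C_child = ckd_pub(point_mul(k_m, G), c_m, 0)
# Both routes land on the same node: same public key, same chain code.
print(point_mul(k_child, G) == K_child and c_child == C_child)
```

To go one level deeper, feed the returned key and chain code back into the same function with the next index.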
Why Does Hardened Derivation Matter?
Alice is a new Bitcoin user and she’s opted for the unconventional approach of managing her own HD wallet. Note, this is not recommended. It’s error-prone, very tedious and… error-prone. Do not do it!
She starts out by flipping a coin 512 times. 256 times to get a 32-byte private key and the other 256 times for her 32-byte chain code. She serialises it and this becomes her extended parent private key.
Siobhan, her friend from university, says that she’s opened an online book store that accepts Bitcoin as payment. Alice and Siobhan come to an agreement that if Alice helps her understand Bitcoin, then Siobhan will let her list any number of books on the website, free of charge. Alice will get 100% of the profit.
Alice, being Alice, wants to make sure that she receives all the Bitcoin from her book sales. She could rely on Siobhan to send her the Bitcoin for every purchase but she doesn’t want this. Alice wants the Bitcoin sent directly to her wallet.
Alice has a few options at this point. All of them rely on her exposing some data to Siobhan about her wallet.
To narrow down the options, Alice has another requirement: increased privacy. She wants to make sure that every payment she receives for each book goes to a new Bitcoin address.
She decides to follow the yellow arrow on the above flow chart. She uses her parent extended private key to create a non-hardened child private key. In doing so, she has to perform some arithmetic with her parent private key. Keep this in mind, as it is a subtle but very important point: child private key = (left 32 bytes + parent private key) % n, where the left 32 bytes are the first half of the HMAC-SHA512 output computed during derivation and n is the order of the curve.
Naively, she hands Siobhan her child extended private key. At this point, Alice is unaware of the dangers that she has now exposed her entire HD wallet to.
As we know, Siobhan is not very well versed in the world of Bitcoin. However, Alice did teach her how to create new addresses for every book that she’s listed. Siobhan creates a new address for each book by following the green path on the flow chart above, each time incrementing the index to get a different public key which can later be parsed into a Bitcoin address (preferably Bech32). She does this using the non-hardened child private key that Alice shared with her.
Time goes by and Alice sells a lot of books, making a gross total of ₿2.0. Meanwhile, Siobhan’s mate Bob says that he can help her run the website. Siobhan hasn’t known Bob for that long but knows that he’s something of a cryptocurrency enthusiast. Siobhan thinks Bob will be able to help a lot with the smooth operation of the website.
A few weeks go by when Alice tries to spend some of her Bitcoin in a local coffee shop. Since then she’s used her extended parent private key to create a lot more child keys. She tries to spend a UTXO that she thought was stored at an address generated from an entirely separate child public key to the one that Siobhan has. Alice can’t buy her coffee. In fact, all the Bitcoin in her entire HD wallet is gone, including the Bitcoin residing at the addresses that Siobhan generated. What has happened? Her parent private key has been compromised, but how?
Let’s rewind and explain how this happened. Alice’s first mistake was trying to do all of this herself. Her second, and most important, mistake was giving Siobhan a non-hardened extended child private key.
On Bob’s first day working for Siobhan, he finds this key embedded in the source code (please don’t ever store private keys this way). Bob knows that he could make a quick buck by taking all of Alice’s profits. Instead, he plays the long game in the hope of a bigger payoff.
He takes this opportunity to befriend Alice and asks her for her extended parent public key. Naturally, Alice gives it to him, thinking it’s completely safe. Unbeknownst to Alice, Bob is actually asking for the last piece of the puzzle. It will allow him to wipe out Alice’s entire wallet instead of just her book sales.
Bob’s goal is to retrieve the extended parent private key. He has all the data at his disposal to perform this attack. At this stage, it’s just a matter of employing some simple algebra to solve for the parent private key instead of the original child private key i.e.
child private key = (left 32 bytes + parent private key) % n
Bob solves for parent private key:
parent private key = (child private key - left 32 bytes) % n
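He’s already got the child private key from the source code. He obtains the left 32 bytes by performing the same operation Alice originally did to create Siobhan’s extended child private key: for a non-hardened child, that operation only takes the parent public key, the parent chain code and the child index as inputs, and the first two are contained in the extended parent public key Alice gave him. He can now perform the calculation and capture the root private key, controlling the whole wallet.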
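Bob’s calculation can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the author’s program: Alice’s parent key is the BIP32 Test Vector 1 master key, the leaked child sits at index 0, Bob does not know the index and simply tries small values, and the helper names (`left_32_bytes`, `recover_parent`) are made up for this example. The curve arithmetic is deliberately simple and not constant-time.

```python
import hashlib
import hmac

# secp256k1 parameters: field prime, group order and base point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(k, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def left_32_bytes(K_par, c_par, i):
    """Left half of the non-hardened HMAC; needs only public parent data."""
    data = bytes([2 + (K_par[1] & 1)]) + K_par[0].to_bytes(32, "big") + i.to_bytes(4, "big")
    return int.from_bytes(hmac.new(c_par, data, hashlib.sha512).digest()[:32], "big")


# Alice's side: parent key from the Test Vector 1 seed, child at index 0.
seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
k_par, c_par = int.from_bytes(digest[:32], "big"), digest[32:]
K_par = point_mul(k_par, G)
k_child = (left_32_bytes(K_par, c_par, 0) + k_par) % N  # what Siobhan stored


def recover_parent(k_child, K_par, c_par, max_index=100):
    """Bob's side: uses only the leaked child key and the parent xpub data."""
    for i in range(max_index):
        candidate = (k_child - left_32_bytes(K_par, c_par, i)) % N
        if point_mul(candidate, G) == K_par:  # candidate matches the xpub
            return candidate
    return None


recovered = recover_parent(k_child, K_par, c_par)
print(recovered == k_par)  # True
```

Note that `recover_parent` never touches `k_par`: everything it needs comes from the leaked child private key and the extended parent public key.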
How could Alice have avoided this attack vector? The answer lies in hardened derivation. If Alice had given Siobhan an extended private key that was generated using the blue arrow on the flow chart, then Bob would never have been able to compute the left 32-byte operand needed to solve the equation, because hardened derivation feeds the parent private key, rather than the parent public key, into that operation. This is assuming Alice keeps the extended parent private key safe! In reality, giving Siobhan only the extended child public key would’ve sufficed. There is no good reason for Siobhan to have any private keys held on her server. She never needed to sign any transactions.
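The difference can be seen in a short Python sketch, again using the Test Vector 1 master key as Alice’s parent key. It is an illustration under assumptions rather than a proof: `hardened_child` is a hypothetical helper, `K_PAR_SER` is the compressed public key corresponding to that master key (hard-coded here to keep the example short), and Bob’s "attempt" is simply the non-hardened algebra applied to the data he holds.

```python
import hashlib
import hmac

# Order of the secp256k1 group.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED = 2**31

# Alice's parent key material, derived from the Test Vector 1 seed.
seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
k_par, c_par = int.from_bytes(digest[:32], "big"), digest[32:]

# Compressed parent public key, i.e. what Bob sees inside the parent xpub.
K_PAR_SER = bytes.fromhex(
    "0339a36013301597daef41fbe593a02cc513d0b55527ec2df1050e2e8ff49c85c2")


def hardened_child(k_par, c_par, i):
    """Hardened CKD: the HMAC input is 0x00 || parent PRIVATE key || index."""
    data = b"\x00" + k_par.to_bytes(32, "big") + i.to_bytes(4, "big")
    out = hmac.new(c_par, data, hashlib.sha512).digest()
    return (int.from_bytes(out[:32], "big") + k_par) % N, out[32:]


index = HARDENED + 0
k_child, c_child = hardened_child(k_par, c_par, index)

# Bob holds k_child, the parent chain code and the parent public key. Without
# the parent private key he cannot rebuild the HMAC input, so the best he can
# do is reuse the non-hardened formula, which yields the wrong left 32 bytes.
guess = hmac.new(c_par, K_PAR_SER + index.to_bytes(4, "big"), hashlib.sha512).digest()
candidate = (k_child - int.from_bytes(guess[:32], "big")) % N
attack_succeeded = candidate == k_par
print(attack_succeeded)  # False
```

The subtraction that worked in the non-hardened case no longer recovers the parent key, because the operand Bob needs is derived from the very secret he is trying to find.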
To summarise, hardened derivation is useful at a depth in an HD wallet where you anticipate scenarios that require sharing an extended public key. You then know that any child private keys leaked beneath that hardened level cannot be used to work back to the parent private key above it.
Following the tree structure defined in BIP44 and practising proper key management will greatly reduce the chance that you’re vulnerable to this attack.
Below you can see the output from a program that mimics the above scenario. I’ve used Test Vector 1 from BIP32 as Alice’s parent private key. You can see that the attacker does in fact retrieve Alice’s master private key. If there is any interest, I can publish this code to my GitHub account.
Any other questions, please leave a comment and I’ll try to get back to you.