Privacy in Blockchain Networks and Benefits of HD Wallets

Be Tech! with Santander
18 min read · Jan 10, 2024

By Juan Tavira.

When talking about Blockchain there is a common misunderstanding that Blockchain networks ensure privacy and anonymity. Let us look at a few tricks to improve privacy with this technology and at the role of hierarchical deterministic wallets (HD Wallets).

Blockchain networks are anonymous as long as you only operate inside the network. Any interaction outside the network breaks that anonymity. For example, you may have some cryptocurrencies you mined a long time ago. They are stored in your account and nothing connects them to you. But the moment you want to convert them into fiat currencies (USD 💵, EUR 💶, etc.), a link is established between your Blockchain account and your bank account, and therefore to you. This is due to regulations like KYC (Know Your Customer), which force banks and exchanges (the platforms that convert cryptocurrencies into fiat currencies and vice versa) to identify their customers.

Given this, is there any privacy? 🕵️

With Blockchain there are some grey areas in terms of privacy rights for personal information. Simplistically, one could say that there’s no privacy in Blockchain networks. Anyone who joins the network with a node can have a copy 🧑‍💻 of all the data in the Blockchain network and thus analyse it. Transactions can be exported to data tools that reveal all the details contained in them. Unless the data is encrypted before it is sent to the Blockchain network, the contents of the transactions are public. Privacy solutions typically require off-chain mechanisms.

It is also important to understand how data is written to the Blockchain network. In a “write” operation 📝 (a transaction like a payment, creating an NFT or sending a token to another address), one of the steps is to sign the transaction with your private key. Your address is in that signature. Since all the transactions you perform are linked to the same address, your activity cannot be considered private.

Privacy is of particular interest in systems that manage personal information. The digital identity paradigm talks about solutions where citizens are able to manage digital versions of their personal information 📇. However, as we have explained, every transaction a user makes on a Blockchain network leaves traces that can be tracked and analysed along with his behaviour and relationships. Even if the Blockchain network itself does not process personal data of the user (personal data cannot be stored in a Blockchain network as this would not be compatible with the General Data Protection Regulation, GDPR), it does expose his interactions, volume of activity and habits. These indicators are part of the citizen’s digital identity.

What are Hierarchical Deterministic Wallets? 👛

Hierarchical Deterministic Wallets or HD Wallets are digital wallets for cryptocurrencies that, in some cases, can solve the privacy problem. For simplicity let’s say that HD Wallets can generate new addresses from a root master key, in fact as many as you need. This way you do not have to re-use addresses, and therefore the transactions cannot easily be linked to each other or to you. Also, thanks to the cryptography used in HD Wallets, it is possible to prove that two different addresses are derived from the same master key 🗝️, which only you know and control. In this way the relationships between the addresses can be proved to a third party.

Of course, there are problems that HD Wallets cannot solve. If you use a network where you have to pay 💳 for usage (this is usually referred to as “it requires gas”), the execution cost of a transaction has to be paid from the address that executes it. There must be “gas” in this address, which must be sent from somewhere. If you do it manually from an exchange it is quite tedious. If you do it from a central account, the addresses will be linked to that one and to each other.

We can generalise that fungible assets (ERC20, ETH and similar) are not ideal for use with HD Wallets, as fungible assets are useful when consolidated and consolidating from different accounts links them. Non-fungible tokens (ERC721 and others), on the other hand 🖐️, are unique and useful on their own, so they do not need to be consolidated.

📍In summary, when using Blockchain networks it is important to know what you are doing and what you are sharing, just as you do with social networks. In addition, privacy should be one of your priorities, and it is a good idea to use different addresses for different activities and interactions, if possible.

Keep reading here👇 for more technical details

There’s a common misunderstanding 🤔 that Blockchain networks are anonymous and, by inference, that transactions in Blockchain networks are private.

Public Blockchain networks are anonymous as long as you only operate inside the network, but any interaction outside the network breaks the anonymity. For example, in the past you may have mined some ETH. It is stored in your account and nothing links it to you. But if you want to convert ETH into fiat money (Euro 💶, Dollar 💵, etc.) there’s a link between your Blockchain account and your bank account, which means a link to you. This is due to KYC (Know Your Customer), which is a requirement for Blockchain exchanges and banks.

But, what about privacy? Either in public or private networks?

There’s no privacy in public Blockchain networks. Anyone who can put a node in a network can get a copy 💾 of all the data and process it. This is a complex technical process but it can be done. Data in the transactions can be exported, unless you encrypt it before sending it to the Blockchain network; then you can exchange keys with others outside the Blockchain network.

And there’s the activity link 🔗. Each time you do a transaction one of the steps is cryptographically signing 🖊️ it with your private key, and your address is within that signature. All your transactions are linked to the same address. That is not exactly private, is it?

What is an HDWallet? 👛

HDWallets, defined by the BIP-32 standard, were originally designed to help users easily manage different Blockchain addresses.

Initially designed for Bitcoin (BIP means Bitcoin Improvement Proposal), the standard was later extended to other Blockchain platforms and nowadays it is widely used in hardware wallets such as Ledger or Trezor.

Behind the HDWallets there is some complex cryptography to guarantee what we all want: 👮 security in our transactions. But HDWallets have at least one other use that is a game-changing feature, something that we could use for privacy.

It all starts from a seed, usually a set of 12 to 24 words. From them a user creates a root private key 🔐. From that one we can create a first level of billions of new keys (each level offers 2^31 normal child indices), keys that only the user knows come from the same root. And from each one of those keys in the first level we can again create billions of derived keys, and so on, for up to 255 levels. It doesn’t matter how many keys you use, they won’t run out.
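The first two steps (words to seed, seed to root key) can be sketched with the Python standard library alone. This is a minimal sketch assuming the BIP-39 and BIP-32 conventions: PBKDF2-HMAC-SHA512 with 2048 rounds for the seed, and HMAC-SHA512 keyed with "Bitcoin seed" for the root. The function names and the example phrase are illustrative, and the sketch does not validate the word list or checksum.

```python
import hashlib
import hmac
import unicodedata


def mnemonic_to_seed(mnemonic, passphrase=""):
    """BIP-39: stretch the words into a 64-byte seed with PBKDF2-HMAC-SHA512."""
    words = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", words, salt, 2048)


def master_key(seed):
    """BIP-32: HMAC-SHA512 of the seed, split into root private key and chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]


# A widely published test phrase, used here only for illustration.
# Never use it (or any phrase you found online) for real funds.
EXAMPLE_WORDS = (
    "abandon abandon abandon abandon abandon abandon "
    "abandon abandon abandon abandon abandon about"
)

seed = mnemonic_to_seed(EXAMPLE_WORDS)
root_private_key, root_chain_code = master_key(seed)
print("root private key:", root_private_key.hex())
print("root chain code: ", root_chain_code.hex())
```

The chain code is the extra 32 bytes that BIP-32 carries next to every key. It is what makes child keys depend on more than the parent key alone.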

This means that you can use a different key for whatever reason you may want, keeping those accounts isolated. Not only users: companies, systems, automatic processes… anyone can use HDWallets, because it is just a way of generating new keys with a program; there’s no need for any special equipment.

But also, we will see later that a user can safely prove that he is the owner of two different keys that derive from a common branch of the tree 🌳.

Organizing the accounts in HDWallets 🗂️

So this way a user can have his digital assets using different keys/accounts, cool, isn’t it?

In order to keep things organized, the different accounts are generated via something called “derivations”, which are named like “m/44’/1”. Fortunately for us there is another standard, BIP-44, that helps organize the naming convention.

👀 Let’s see how it looks:

“m / purpose' / coin_type' / account' / change / address_index”

▪️ m: Fixed.

▪️ Purpose: A constant set to 44' when talking about cryptocoins.

▪️ Coin_type: This level creates a separate subtree for every cryptocoin, avoiding address reuse across cryptocoins and improving privacy.

▪️ Account: Users can use these accounts to organize the funds in the same fashion as bank accounts; for donation purposes (where all addresses are considered public), for saving purposes, for common expenses etc.

▪️ Change: Constant 0 is used for external chain and constant 1 for internal chain (also known as change addresses). External chain is used for addresses that are meant to be visible outside of the wallet (e.g. for receiving payments). Internal chain is used for addresses which are not meant to be visible outside of the wallet and is used for return transaction change.

▪️ Address_index: Addresses are numbered from index 0 in sequentially increasing manner.

BIP44 was designed for organizing cryptocoin accounts.
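To make the naming convention concrete, here is a small sketch that splits a derivation path into its levels and labels them with the BIP-44 names listed above. The function names are made up for this example. The apostrophe marks a hardened level, and coin type 60 is the value registered for Ethereum.

```python
HARDENED_OFFSET = 2**31  # indices below this value are "normal" child indices

BIP44_LEVELS = ("purpose", "coin_type", "account", "change", "address_index")


def parse_path(path):
    """Split a path such as m/44'/60'/0'/0/0 into (number, hardened) pairs."""
    parts = [part.strip() for part in path.split("/")]
    if parts[0] != "m":
        raise ValueError("a derivation path must start with 'm'")
    levels = []
    for part in parts[1:]:
        hardened = part.endswith("'")
        number = int(part.rstrip("'"))
        if not 0 <= number < HARDENED_OFFSET:
            raise ValueError(f"level {part!r} is out of range")
        levels.append((number, hardened))
    return levels


def describe_bip44(path):
    """Label the five levels of a BIP-44 path with their names."""
    levels = parse_path(path)
    if len(levels) != len(BIP44_LEVELS):
        raise ValueError("a BIP-44 path has exactly five levels after 'm'")
    return {name: number for name, (number, _) in zip(BIP44_LEVELS, levels)}


print(parse_path("m/44'/60'/0'/0/0"))
print(describe_bip44("m/44'/60'/0'/0/0"))
```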

Let’s organize other use cases! 🗃️

We’ll name just ✌️ of them: digital identity and transaction privacy, but similar patterns could be used for other uses.

The digital identity case: Self Sovereign Identity

Part of those digital assets could be our 🕵️ identity documents; Self Sovereign Identity relates to digital information of a user that can be verified. The user also is in control of that information allowing or disallowing when it is used and the purpose of that use.

This user verifiable information is usually presented in the form of Verifiable Credentials.

So you have those VCs issued to your “digital you” where you need to be identified. The “digital you” is represented with a DID, a Decentralized Identifier, which is like your national ID in the digital identity space. The DID has at least a public key (the one associated with your private key) that can be used to verify when you digitally sign 🧑‍💻 a document (for example when you send one of those VCs to a company). So we can use the public keys in your HDWallet accounts as your DID.

🚨 Wait! 🚨

There’s no mention of identity in any of the levels of the derivations. Does that mean we cannot use the derivations for identity? Au contraire. We can. We just need to extend the standard by choosing a new Purpose, for example 1037171. Derivations starting with “m/1037171” will then be used for identity purposes.

All our Verifiable Credentials could be used from the address in that derivation, or even more addresses from longer derivations.

Centralization or distribution?

Now we have to choose 🤷: should we centralize all our VCs in a single DID (equal to address, equal to derivation) or should we spread them between different addresses?

Centralization of all VCs under a single DID has the benefit of simplicity. When we have to share them with a third party, like a service provider, it is easy to check that all of them have been issued to the same DID, to the same digital identity.

On the other hand 🤚, centralization means that anyone who knows your DID (which should be public) and sees any credential can easily identify whether it belongs to you.

Distributing the VCs across multiple addresses has the benefit of privacy. Someone who knows one of your DIDs will only know whatever you have shared with him. Even if he sees another VC issued to you he won’t recognize it, because it would be issued to a different DID, initially un-linkable.

This is more than desirable 🙏, to the point that data protection agencies 🔒 may state that the centralization option, using a single DID, is not possible while using Blockchain because DIDs would be considered personal data.

In the case of using several DIDs the difficulty is cryptographically proving the link between them. It could be the case of your National ID issued to DID1, your driving license issued to DID2 and your 🚘 car insurance to DID3. How would you prove that you own the three DID1, DID2 and DID3 and that those are linked? Proving that would be necessary for you to rent a car.

Here comes the magic! HDWallets

HDWallets are not only useful for generating new addresses with a derivation path. The hidden 🕶️ benefit behind HDWallets is the cryptographic link between those generated addresses. The link is known to the user but not to the public until the user releases some information.

Starting with a seed, or the famous 12 words, we can create an HDWallet. This is basically a private key (privk0) from which we can obtain its public key (pubk0) and therefore an address (addr0) that will act as DID0.

Just remember that from private keys we can calculate public keys, but not the other way around: you can never calculate a private key from a public key. From a public key we can obtain an address, but from the address we can obtain neither the public key nor the private key.

If we apply to privk0 a derivation like “m/132” we will have a new set privK1, pubk1, addr1 and DID1.

But, wait for it 🧙, if we take pubk0 and we apply the derivation “m/132” we will obtain the same pubk1, addr1 and DID1, albeit never privk1!! (In BIP-32 this works for normal, non-hardened derivations.)
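This property can be checked in a few lines. The sketch below follows the BIP-32 rules for normal (non-hardened) children on the secp256k1 curve, written with plain integers so it runs without external libraries. Note one detail the text leaves implicit: BIP-32 derivation takes a chain code alongside each key. The demo key and chain code are throwaway values made up for this example, and the rare invalid-child cases of BIP-32 are not handled.

```python
import hashlib
import hmac

# secp256k1 domain parameters (the curve used by Bitcoin and Ethereum keys).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Double-and-add multiplication of a point by an integer."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def serialize(point):
    """Compressed 33-byte public key encoding."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


def tweak_for(parent_pub, chain_code, index):
    """Shared step: HMAC-SHA512 over the parent public key and the index."""
    if not 0 <= index < 2**31:
        raise ValueError("this sketch only covers normal (non-hardened) indices")
    data = serialize(parent_pub) + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]


def ckd_priv(parent_priv, chain_code, index):
    """Child private key from the parent private key."""
    tweak, child_chain = tweak_for(scalar_mult(parent_priv, G), chain_code, index)
    return (tweak + parent_priv) % N, child_chain


def ckd_pub(parent_pub, chain_code, index):
    """Child public key from the parent public key only."""
    tweak, child_chain = tweak_for(parent_pub, chain_code, index)
    return point_add(scalar_mult(tweak, G), parent_pub), child_chain


# Throwaway demo values, NOT a real wallet.
privk0 = int.from_bytes(hashlib.sha256(b"demo root key").digest(), "big") % N
chain0 = hashlib.sha256(b"demo chain code").digest()
pubk0 = scalar_mult(privk0, G)

privk1, chain1_priv = ckd_priv(privk0, chain0, 132)  # the owner's side
pubk1, chain1_pub = ckd_pub(pubk0, chain0, 132)      # anyone holding pubk0

print(scalar_mult(privk1, G) == pubk1)
```

Both routes land on the same public key, but only the first one ever touches a private key.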

This way I can have:👇

✅ Privk0, using derivation “m/132” to have privk1, pubk1, addr1, DID1

✅ Privk0, using derivation “m/471” to have privk2, pubk2, addr2, DID2

✅ Privk0, using derivation “m/592” to have privk3, pubk3, addr3, DID3

✅ NationalID issued to DID1, Driving license issued to DID2 and Car Insurance issued to DID3.

Then we can pack 📦 the National ID, Driving license and Car Insurance for the Car Rental Company, sign the whole thing with Privk0 (this will remain secret) and also send Pubk0, “m/132”, “m/471” and “m/592”.

Using Pubk0 we can verify the signature of the credentials package (called a “presentation”). This proves control of the associated Privk0 (but Privk0 is not known to the 🚙 car rental company).

Then, using Pubk0 and “m/132”, the car rental company can calculate pubk1, addr1 and DID1, and do the same for DID2 and DID3. This proves that DID1, DID2 and DID3 come from the same original Privk0 and are therefore part of the same distributed identity. The car rental company now has all the necessary documents and proofs for renting me a car.

This is what we call SSI3: Self Sovereign Identity and Identity Isolation, because all the digital identity VCs are isolated and only the user who controls the identity can release the information that links them together.

And this is only the tip of the iceberg 🏔️, we can use much longer, deeper derivations for organizing identity: including a derivation level for the company you are interacting with, another for the purpose of the interaction (login, credential issuance, delegation…), etc. This would require a whole new paper as it would be a functional discussion not a technical one.

The benefit of identity privacy in Blockchain

We have talked about credentials, keys and address generation and the cryptographic principles. What is the role of Blockchain? What are the benefits?

SSI solutions typically use Blockchain as a registry of the activity, not of the identity documents. When an issuer creates a credential for a user he does several things: among them, he signs the credential and registers two things in the Blockchain network: that he has issued the credential and that the credential status is valid.

👉 When a user receives the credential he registers the reception.

👉 When a user presents a set of credentials to a service provider he registers the presentation.

👉 When a service provider receives the set of credentials, it registers that activity.

👉 When an issuer revokes a credential he updates the status.

👉 When a user wants to revoke the use of a credential by a service provider, he registers that in the presentation status.

All this activity is stored in the Blockchain network 🥅, using hashes of the credentials.

For companies there’s no privacy problem. But if a user’s credentials leak, a malicious third party can calculate their hashes and gain knowledge of the user’s activities. This is because all of his operations in the Blockchain are signed by him.
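A short sketch shows why a leaked credential is enough to trace activity. It assumes, purely for illustration, that the registry stores a SHA-256 hash of a canonical JSON form of each credential; the field names, events and account labels are hypothetical and do not describe any specific SSI implementation.

```python
import hashlib
import json


def credential_hash(credential):
    """Hash a canonical JSON form of the credential (an assumed scheme)."""
    canonical = json.dumps(credential, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical credential and on-chain activity records.
credential = {
    "type": "DrivingLicense",
    "holder": "did:example:alice",
    "valid_until": "2030-01-01",
}
other_credential = {"type": "CarInsurance", "holder": "did:example:bob"}

on_chain_registry = [
    {"event": "issued", "hash": credential_hash(credential), "signed_by": "0xIssuer"},
    {"event": "presented", "hash": credential_hash(credential), "signed_by": "0xHolderAccount"},
    {"event": "issued", "hash": credential_hash(other_credential), "signed_by": "0xIssuer"},
]


def trace_activity(leaked_credential, registry):
    """What anyone holding the leaked credential can look up on-chain."""
    target = credential_hash(leaked_credential)
    return [entry for entry in registry if entry["hash"] == target]


matches = trace_activity(credential, on_chain_registry)
for entry in matches:
    print(entry["event"], "signed by", entry["signed_by"])
```

The hash hides the content from people who never saw the credential, but it offers no protection once the credential itself is out.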

That’s why having multiple DIDs is also useful for privacy, because each transaction would be signed by a different account. And also because data protection agencies may not consider the DID as personal data.

The transaction privacy case ⌨️

Here we will look at a company participating in a Blockchain network, either public or private.

This company is using a Blockchain based application along with some other companies. They share some information 📇 with each other for the common benefit, but none of them wants to share more than the minimum necessary.

In this case, if one of the companies does many transactions (and this means writing something, not queries), the other companies will learn about its market share and volume of operations. And even if there is no public link between the company and its account, in the end its activity, the operation data, the volume, the market share or a data leak may disclose which company it is, and the application will not respect privacy anymore.

🔔 This is easy to solve: the companies using this application, instead of always using the same account, could change the account from time to time. Or simply use HDWallets to organize different keys and use a different one for each transaction.
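The difference is easy to see from the observer’s side. In this sketch another participant simply counts write transactions per signing account; the account labels and the transaction format are invented for the example.

```python
from collections import Counter


def activity_per_account(transactions):
    """What any participant can compute: number of writes per signing account."""
    return Counter(tx["from"] for tx in transactions)


# Same five invoices, signed with one account or with a fresh account each time.
single_account = [{"from": "0xCompanyA", "invoice": n} for n in range(5)]
fresh_accounts = [{"from": f"0xFresh{n}", "invoice": n} for n in range(5)]

print(activity_per_account(single_account))   # the company's volume is visible
print(activity_per_account(fresh_accounts))   # five accounts with one write each
```

With a single account the volume of operations is exposed directly, while with one account per transaction the counts say nothing about who is behind them.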

The same way we selected “m/1037171” for the identity case, now we can choose “m/15014710” for transaction isolation purposes, and then use the following levels of derivation for the use case (invoices, digital asset delivery, document notarization…), the item itself and so on. For example “m/15014710/1/3975925” could mean:

🔹 15014710: Purpose: transaction isolation

🔹 1: Use case: invoices

🔹 3975925: a number associated with each single invoice

This will generate a unique key for signing 🖋️ the transaction related to invoice 3975925, and will be used just for that invoice.

This derivation will be known only to each company, and the levels and numbering can be freely chosen: there are no restrictions and no need for the numbers to be consecutive (except that each number must be at most 2^31 - 1).

So that’s it: using a different key/account per transaction (or maybe occasionally twice, if you need to amend a previous transaction) will give you privacy by isolation.
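Building such a path is plain string work. The sketch below uses the purpose number and the invoice use case from the example above, and assumes the BIP-32 bound for normal indices (at most 2^31 - 1). The constant and function names are made up for illustration.

```python
MAX_INDEX = 2**31 - 1             # largest normal (non-hardened) index
TRANSACTION_ISOLATION = 15014710  # the purpose chosen in the example above
USE_CASES = {"invoices": 1}       # further use cases would get their own number


def is_valid_index(number):
    """A level is usable if it fits in the normal index range."""
    return 0 <= number <= MAX_INDEX


def isolation_path(use_case, item_number):
    """Build the derivation path for one item of one use case."""
    levels = [TRANSACTION_ISOLATION, USE_CASES[use_case], item_number]
    for level in levels:
        if not is_valid_index(level):
            raise ValueError(f"level {level} does not fit in a derivation index")
    return "m/" + "/".join(str(level) for level in levels)


print(isolation_path("invoices", 3975925))
```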

Two risks to have in mind

There are two risks to have in mind when dealing with privacy: one is the source of the token used for paying gas and the other is leaking a private key.

1️⃣ Where did the gas come from? Gas, ETH and the ERC20 case.

If the network you are using requires gas and it is not free, you can face a privacy leak.

Whenever you use a different account from an HDWallet for signing a transaction, the gas fee has to be paid from the balance of that account. So you have to send gas to that address in advance.

If you do it from your funds, the transaction is recorded and then all your brand new accounts will be linked to you via your main gas account. This can be avoided by using a gas station service, where your company pays in advance for gas using off-chain methods and, prior to the transaction, you order gas to be served to that new address. Or you may use a relayer that will execute the transaction in your name. In both cases the gas station or the relayer will know the relationship between the company and the accounts, but at least the information won’t be in the Blockchain network.

This is not a trivial discussion; you should address it in your project development IF you use gas dependent networks and if the gas is not coming from a free faucet.

The same applies to ERC20 or any other kind of fungible assets that you want in a consolidated position (gas = Eth = balance in your account, is a fungible asset). If you want to consolidate a balance of whatever asset, you have basically three options:

✔️ Not using HDWallets or using a single wallet for that asset.

✔️ Using HDWallets but at a given point transfer all the assets to a single account. This, due to KYT (Know your transaction) techniques will group all those accounts under a single owner and therefore privacy would be diminished.

✔️ Use a third party aggregator, sending all the assets there and then requesting a withdrawal. This has a trust requirement on that third party that will know the link between those accounts (but not others we may generate).
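The linking effect of funding or consolidating can be sketched with a deliberately naive heuristic: treat any two accounts that exchange value as belonging to the same owner. Real KYT tools are far more refined; this is only an assumption made to illustrate the risk, and all account labels are invented.

```python
def cluster_accounts(transfers):
    """Group accounts that are connected by at least one transfer (union-find)."""
    parent = {}

    def find(account):
        parent.setdefault(account, account)
        while parent[account] != account:
            parent[account] = parent[parent[account]]  # path halving
            account = parent[account]
        return account

    for sender, receiver in transfers:
        parent[find(sender)] = find(receiver)

    clusters = {}
    for account in list(parent):
        clusters.setdefault(find(account), set()).add(account)
    return list(clusters.values())


# Gas sent from one main account to three fresh HDWallet accounts,
# plus an unrelated payment between two other accounts.
transfers = [
    ("0xMain", "0xFresh1"),
    ("0xMain", "0xFresh2"),
    ("0xMain", "0xFresh3"),
    ("0xOther", "0xShop"),
]

clusters = cluster_accounts(transfers)
for cluster in clusters:
    print(sorted(cluster))
```

The three fresh accounts end up in the same group as the main account, which is exactly the link the gas station or relayer options try to keep off-chain.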

🙌 Fortunately this does not apply so heavily to non-fungible tokens (NFTs like ERC721), because they are unlikely to be grouped together and can be handled with different accounts. Even if some of them are, for example, clothing 🧥 and accessories 🎀 for your metaverse avatar, there shouldn’t be an on-chain link between them. You would only need to prove their ownership to that metaverse (similar to what we did with VCs), and only the metaverse site would know that those belong to you… or were lent to you.

2️⃣ What happens if one of the private keys is leaked?

Hopefully it is not your root key, which should be kept under the strictest security 🔒. If it is, there’s no solution to the privacy problem for existing transactions. Just change the key, move the funds if you can and carry on.

If it is an intermediate key in the derivation tree 🌴, there could potentially be a privacy problem due to up-the-tree escalation. With a private key it is possible to mount a brute force attack to gain knowledge of an upper level in the derivation path. In this case what we suggest is the use of firewalls: some derivation levels that will never be used and that form a barrier to prevent reaching keys that impact you.

Each derivation level has 2^31 different keys, so an attacker who knows the key at level “n” will need to do some complex calculations 2^31 times to gain knowledge of level n-1. 🚨 Not impossible. But if we keep adding levels 📈, those 2^31 will become 2^62, 2^93 or greater. NIST states in its document at page 54, table 4 that the minimum acceptable security strength would be 128 bits for the near future, so it would require 5 levels of isolation (5*31 = 155 bits, since 4 levels would give only 124 bits).

Then a complete derivation path for transaction isolation, from the root, would look like:

“m/15014710/1/f/f/f/f/f/3975925”

Where the “f” are the firewall levels, some random derivations that will never be used (so the private keys will never leak 📢) and from “3975925” there would be the levels where the user would organize the different transactions.
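The number of firewall levels follows from simple arithmetic, sketched below together with a helper that fills the “f” positions with random indices. The 31 bits per level and the 128-bit target come from the reasoning above; the function names are illustrative, and using `secrets` to pick the indices is just one possible choice.

```python
import math
import secrets

BITS_PER_LEVEL = 31  # each level offers 2^31 different keys


def firewall_levels(target_bits=128):
    """Smallest number of levels whose combined size reaches the target."""
    return math.ceil(target_bits / BITS_PER_LEVEL)


def firewalled_path(prefix, item, target_bits=128):
    """Insert random, never-used firewall levels between the prefix and the item."""
    walls = [
        secrets.randbelow(2**BITS_PER_LEVEL)
        for _ in range(firewall_levels(target_bits))
    ]
    return "/".join([prefix] + [str(wall) for wall in walls] + [str(item)])


levels = firewall_levels(128)
print(levels, "levels give", levels * BITS_PER_LEVEL, "bits")

path = firewalled_path("m/15014710/1", 3975925)
print(path)
```

The firewall indices are random but not disposable: the owner has to store them, since the same path is needed to derive the same keys again.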

The benefit of security: the login example 🧑‍💻

Apart from using HDWallets in DIDs for credentials or transaction isolation, they can be used for other purposes. One of the main examples is login/mutual authentication, for which they provide a great option.

Currently most of the login processes work the same: enter user & password over a secure connection. The server on the other side checks them and allows you to enter the site. This is sometimes enhanced using Multi Factor Authentication (like sending a notification or a text message to your phone).

The problem is well known: there’s even a site to check whether the passwords of my accounts, stored by a website, have ever been stolen. This is the problem with classical authentication methods.

The benefit of using private-public key schemes for login is clear: my private key never leaves my device 📱. Instead, the server sends me a document 📄, a challenge, that I have to sign with my private key 🔏 and that the server can check against my public key. This way I will be identified.
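The challenge-response exchange can be sketched with textbook ECDSA on secp256k1, written with plain integers so that it runs with the standard library only. This is an illustration of the idea, not production cryptography (no constant-time arithmetic, no deterministic nonces), and the random key below stands in for one that would come from an HDWallet derivation.

```python
import hashlib
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Double-and-add multiplication of a point by an integer."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(message):
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key, message):
    """ECDSA signature (r, s) over the SHA-256 hash of the message."""
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message, signature):
    """Check the signature using only the public key."""
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    u1, u2 = message_hash(message) * w % N, r * w % N
    point = point_add(scalar_mult(u1, G), scalar_mult(u2, public_key))
    return point is not None and point[0] % N == r


# User side: a key pair for this site (random here, derived in an HDWallet).
site_private_key = secrets.randbelow(N - 1) + 1
site_public_key = scalar_mult(site_private_key, G)  # registered with the server

# Server side: send a fresh random challenge, then verify the answer.
challenge = secrets.token_bytes(32)
signature = sign(site_private_key, challenge)       # happens on the user's device
print("login accepted:", verify(site_public_key, challenge, signature))
```

The server only ever stores the public key, so a breach on its side reveals nothing that lets an attacker log in as the user.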

Do you remember that we mentioned the benefits of credential distribution?

We can use the same approach for a safer, more private login. We can create a different private/public key pair for each site we want to log in to, so if that site is hacked I will be completely safe 😁. Not only will the hackers not know my private key, but they cannot guess whether I am a customer of another site, because I will be using a different set of keys for each site.

This has already been proposed in other initiatives such as SIWE (Sign-In with Ethereum) or the FIDO Alliance, backed by Apple, Google and Microsoft.

Finally 🏁

So it seems that HDWallets have great benefits for keeping your credentials well organized, enhancing the privacy of your company transactions and supporting authentication. Indeed, they are the base for Alastria EPIC, the upcoming version of SSI from Alastria.

Acknowledgements

I’d really like to thank Coty de Monteverde (Head of Blockchain CoE) and Roberto García (Head of engineering at Blockchain CoE) for their support for this R+D effort.

Also Przemyslaw Siemion for his help with the mathematics and cryptography validating the model.
