Privacy surrounding the Blockchain
The Network Layer
In an earlier post, I discussed what it means to have privacy on the blockchain. Obscuring the sender, recipient, and amounts on the chain with cryptography goes a long way toward providing much-needed privacy to users.
The blockchain is a very public and permanent record of transactions and events. It is important to make sure that sensitive or revealing information is not stored on the blockchain. It is better to store only pseudonymous information, and best to store only anonymous information that cannot be traced back to the user. This is a well-studied problem, and cryptocurrencies like Monero, Zcash and Grin aim to preserve user privacy at the blockchain layer.
What is less obvious, however, is what information exists and is transmitted before transactions are etched into the blockchain forever. The network layer can reduce privacy substantially if precautions are not taken. Information comes in many forms and is necessarily transmitted between computers before it reaches the blockchain. This information may include signed or unsigned transactions, unmixed transactions, transaction metadata, requests for payment, and requests for balance information, along with many other types of data that are inherent in blockchain protocols.
To have privacy is to have control over who has access to the data or information we wish to protect. Real privacy is a difficult undertaking because user activity is potentially observable in several places: on the client, on the server, and on the networks in between.
The client is the origin of your network traffic. The client is a device that has an IP address associated with it so that it may communicate with other devices on the internet. Associating your activity with your IP address reduces your anonymity set to a relatively small collection of possible users, and could even identify you as an individual if it is not a shared IP address.
Safeguarding your client devices requires knowing your device’s origins and avoiding the installation of malware or any software that may contain viruses or backdoors. Brand new devices or devices that are from a trusted source and wiped clean are your best bet.
The server is the destination for communication. Blockchains are usually operated as peer-to-peer networks, and there are no true servers in these networks. Nevertheless, the client on the other end may be thought of as a server since it is the counterparty for all protocol communication.
Servers have access to all data communicated to them, so it is important to design protocols so that servers receive only the information they need and nothing extra. Ideally, sensitive information is never communicated or needed when using a cryptoasset. The administrator of the server may be logging all kinds of information, so it is best if they know little or nothing about you. That limits their ability to tie your activity to your identity and compromise your privacy.
Many people think of the internet as one large network. This is subtly misleading, as the internet is really a vast collection of networks that are all linked together via routers at peering exchanges. Your computer’s communication will hop through many networks on the way to its destination, and the administrators and operators of each one will get a chance to observe your activity.
The client and the server are each likely a part of a Local Area Network or LAN. From the LAN they connect to an Internet Service Provider or ISP. The ISP will typically have peering relationships with other ISPs and traffic may flow through several ISPs, each getting a chance to profile the activity.
Protecting content with encryption
One of the most powerful precautions for protecting sensitive data is to encrypt it. Encryption takes plaintext data along with a key and produces ciphertext. The ciphertext can only be turned back into plaintext with the correct key. The key used to encrypt may be the same as the key used to decrypt. This is known as symmetric encryption. The alternative, asymmetric encryption, uses separate public and private keys. The public key is used to encrypt data and the private key reverses the encryption. This is more expensive computationally, but has the advantage that no secret key needs to be shared in advance.
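To make the symmetric case concrete, here is a minimal sketch in Python. It is a toy stream cipher built from SHAKE-256 in the standard library, chosen only so the example runs without extra packages. It is not a vetted cipher and should not be used to protect real data. The function names are made up for illustration.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and nonce (toy construction)."""
    return hashlib.shake_256(key + nonce).digest(length)


def xor_bytes(data: bytes, stream: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return a fresh random nonce followed by the ciphertext."""
    nonce = secrets.token_bytes(16)
    return nonce + xor_bytes(plaintext, keystream(key, nonce, len(plaintext)))


def decrypt(key: bytes, message: bytes) -> bytes:
    """Split off the nonce and reverse the XOR with the same key."""
    nonce, ciphertext = message[:16], message[16:]
    return xor_bytes(ciphertext, keystream(key, nonce, len(ciphertext)))


key = secrets.token_bytes(32)            # shared secret known to both parties
message = encrypt(key, b"pay 5 coins to Alice")
recovered = decrypt(key, message)        # same key, so the plaintext comes back
wrong = decrypt(secrets.token_bytes(32), message)  # a different key yields noise
```

The same key both encrypts and decrypts, which is what makes the scheme symmetric. Note that the ciphertext is exactly as long as the plaintext plus the nonce, so message size is still visible to an observer.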
Many forms of encryption are available, but the most common standard for protecting internet traffic today is Transport Layer Security, or TLS. TLS superseded Secure Sockets Layer (SSL), which is now deprecated. The latest version, TLS 1.3, includes many security improvements and has garnered praise from the EFF. The main advantages of TLS over SSL are a number of vulnerability fixes and support for forward secrecy. One of the main vulnerabilities in SSL is the downgrade attack, in which an attacker tricks the two parties into abandoning the newer protocol version in favor of an older, less secure one. SSL allowed this for backwards compatibility, but it is a known flaw. Forward secrecy uses unique session keys so that past communication remains private even if the server’s long-term keys are compromised at a later date.
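As a small sketch of what refusing a downgrade looks like in practice, Python’s standard ssl module lets a client set a floor on the protocol version it will accept. The choice of TLS 1.2 as the minimum here is an assumption for illustration, and the helper name is made up.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and rejects old protocols."""
    context = ssl.create_default_context()
    # Refuse to negotiate anything older than TLS 1.2, so a peer or an
    # attacker in the middle cannot push the connection down to SSL.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
```

A socket can then be wrapped with context.wrap_socket(sock, server_hostname=...), which performs the handshake under these constraints.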
Encryption limits what intermediaries can learn from handling your internet traffic. When communication is encrypted, observers do not have access to the actual data being transmitted. Instead, they have access to metadata such as timestamps and the size and pattern of communication. If enough metadata and evidence are collected, privacy is reduced and may be compromised.
Defence against metadata and pattern analysis
As mentioned, your IP address is a sensitive piece of information when associated with your online communication. For many of us, this IP address does not change very often. It is either explicitly static or infrequently changed. VPNs are a popular solution for decoupling activity from an IP address.
A VPN is a service provided by a special kind of server. It is used to relay communication on behalf of the user. The user redirects all communication through the VPN, replacing their true client IP with the IP of the VPN server. To all of the user’s counterparties, it looks as if the VPN is the originator of the traffic. This removes the client’s IP address from the equation, which is a boon for privacy. Furthermore, a well-trafficked VPN server may have many users, which further obscures who performed the activity and improves privacy.
It is important to note that the VPN does have access to all activity happening on its servers and may be logging this activity. Depending on the privacy needs, it may be best to rotate VPNs frequently, or to use a more advanced solution altogether.
Tor, short for The Onion Router, grew out of onion routing research at the U.S. Naval Research Laboratory, and its first version was released in 2002. The protocol is dedicated to protecting individuals’ privacy. Tor’s goal is to prevent servers and onlookers from learning the user’s geographic location or performing network surveillance and traffic analysis. Tor consists of exit nodes, through which traffic leaves for the ordinary internet, as well as relayers, which act as intermediaries within the Tor protocol to route the traffic.
The onion in the name of the protocol is a metaphor alluding to the technique of wrapping up the data in multiple layers of encryption. This ball of data is then passed through several relayers. Each relayer decrypts the received data, peeling the onion, to find more encrypted data and the address of the next relayer. This process repeats several times until the data arrives at the destination in its original form.
By introducing a sequence of relayers, each with only knowledge of the immediate prior and immediate next node, we’ve eliminated the ability for relayers to interpret what is happening, and we’ve also removed the ability for the server to know precisely who it is communicating with.
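The layering idea can be sketched in a few lines of Python. This is a toy model only: the XOR cipher, the relayer names, and the wrap and peel helpers are all made up for illustration, and real Tor negotiates keys with each relayer and uses vetted cryptography.

```python
import hashlib
import secrets


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a keystream derived from key. Applying it twice undoes it."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(route: list[tuple[str, bytes]], destination: str, payload: bytes) -> bytes:
    """Build the onion from the inside out, adding one layer per relayer."""
    packet = payload
    next_hop = destination
    for name, key in reversed(route):
        # Each layer hides the address of the following hop plus the inner data.
        packet = xor_layer(key, next_hop.encode() + b"|" + packet)
        next_hop = name
    return packet


def peel(key: bytes, packet: bytes) -> tuple[str, bytes]:
    """Remove one layer, revealing only the next hop and more opaque data."""
    next_hop, _, inner = xor_layer(key, packet).partition(b"|")
    return next_hop.decode(), inner


# Hypothetical three-relayer route; the client shares one key with each relayer.
route = [(name, secrets.token_bytes(32)) for name in ("relay-a", "relay-b", "relay-c")]
packet = wrap(route, "server.example", b"hello")

hops = []
for name, key in route:
    next_hop, packet = peel(key, packet)
    hops.append(next_hop)
```

After the loop, hops holds what each relayer learned in turn (only the next address), and packet holds the original payload as it reaches the destination.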
While Tor’s privacy guarantees are strong, they are imperfect. Depending on the uniqueness and timing of the activity, it may be possible to deanonymize a user, especially if enough exit nodes are being surveilled.
Tor offers the ability to connect to the public internet privately, but also has the ability to launch websites that are only accessible through Tor via .onion addresses. Perhaps the most famous Tor site of all was the drug marketplace, Silk Road, founded by Ross Ulbricht and shut down in 2013.
I2P is quite similar to Tor with a couple of important differences. Tor has the user select a series of relayers that become a circuit. This circuit acts as the route for all traffic during the Tor session. In contrast, I2P does routing at the packet level. Network traffic is divided up into packets and each packet in the I2P protocol is routed independently.
It’s also important to note that I2P acts primarily as a self-contained network and is not designed for reaching the rest of the internet. For the most part, only services and applications specifically developed for I2P can be accessed through the I2P network. This is similar to .onion sites on Tor that are only accessible within the Tor network.
Kovri is a C++ implementation of I2P that is being built specifically for the cryptocurrency Monero. The goal is to integrate it into Monero so that all transactions are relayed through the Kovri network, removing the ability for observers to trace transactions back to their originating IP address.
Dandelion is a protocol specifically designed to mask the origin of a crypto transaction. Rather than broadcast to the entire network at once, which may allow spy nodes to see the originating IP address, Dandelion splits propagation into two phases known as Stem and Fluff (also called the Anonymity and Spreading phases).
During the stem phase, each transaction is passed forward to just one recipient at a time, for a random number of hops. Each node decides at random whether to pass the transaction forward, continuing the stem phase, or to enter the fluff phase. Given that the number of hops is not known, it is difficult for each node to discern whether the transaction actually originated at the previous hop or was passed along from one or more prior hops. Only after the transaction has been passed forward a number of times during the stem phase will the fluff phase begin. The fluff phase broadcasts the transaction to the wider network, leaving the true origin hidden in the mix of stem nodes.
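The two phases can be simulated with a short Python sketch. It makes several simplifying assumptions that are not from the protocol specification: the network is a small ring, every receiving node flips an independent coin with a fixed fluff probability, and the fluff phase is modeled as a plain flood. The function names are made up for illustration.

```python
import random
from collections import deque


def stem_phase(origin: int, peers: dict[int, list[int]],
               fluff_probability: float, rng: random.Random) -> list[int]:
    """Pass the transaction to one random peer per hop until a node decides to fluff.

    Returns the nodes visited in order; the last one starts the fluff phase.
    """
    path = [origin]
    while True:
        path.append(rng.choice(peers[path[-1]]))
        # Assumption: the receiving node flips a biased coin to end the stem.
        if rng.random() < fluff_probability:
            return path


def fluff_phase(start: int, peers: dict[int, list[int]]) -> set[int]:
    """Flood the transaction from `start` to every reachable node."""
    reached = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in peers[node]:
            if peer not in reached:
                reached.add(peer)
                queue.append(peer)
    return reached


# Hypothetical 8-node ring in which each node knows its two neighbours.
peers = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
rng = random.Random(7)
path = stem_phase(0, peers, fluff_probability=0.25, rng=rng)
reached = fluff_phase(path[-1], peers)
```

From the point of view of the wider network, the transaction appears to come from path[-1], the node that started the fluff phase, rather than from the true origin at path[0].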
The Dandelion protocol, which has been updated and is now called Dandelion++, is being incorporated into the MimbleWimble currencies Grin and Beam and may eventually be adapted for Bitcoin and other cryptocurrencies.
Path to the blockchain
Transactions destined for the blockchain have a long journey. First the user creates intent. The most basic intent would consist of an amount and a recipient address. More complex intent may involve smart contract logic or multiple recipients. Depending on the system, more complex transactions may be harder to achieve privately.
From intent, the user generates a signed transaction using the private keys corresponding to their funds. Depending on the wallet software, this may generate observable network traffic, potentially compromising user privacy. This is why web wallets and light clients, which must ask other machines for the details needed to build a transaction, are considered harmful to privacy by crypto experts. Full nodes are able to generate and sign transactions without reaching out to the world for those details.
Once a transaction is created, it needs to be broadcast to the world. If it is going to be joined with other transactions before making it to the blockchain, it is important to think about who has access to the component transactions that are used to generate the composite transaction. Grin is an example of a system where transactions are combined before being written to the blockchain. If it weren’t for Dandelion, nodes observing the original incoming transactions would have full access to those individual transactions and may be able to use that information to discern sender and recipient within transactions.
As more and more of our lives migrate to the digital realm, it is vitally important to have control over our own data to protect ourselves. As we have seen here, if privacy is to be preserved, doing so at the blockchain layer is necessary, but not sufficient. Steps must be taken to ensure privacy is not compromised on the path to the blockchain. In the coming years, we can expect breakthroughs in both surveillance and privacy in what is sure to remain a game of cat and mouse for the indefinite future.
Disclaimer: Jordan Clifford is a managing director of Scalar Capital which holds positions in some, but not all, of the aforementioned assets. This post is not investment or legal advice.