Mechanics of privacy coins: Part I

Published in Coinmonks · 7 min read · Dec 1, 2018

Quentin Massys. The Tax Collectors, ca. 1500

This is Part I of a series of articles taking a deeper look into the internals of the privacy protocols CoinJoin (Bitcoin), Monero, ZCash, and Mimblewimble. The goal is to give a conceptual, and somewhat more detailed, explanation of how money transfers stored in a public ledger can be made anonymous yet verifiable by any interested party, two seemingly contradictory requirements. This part explores CoinJoin, a privacy protocol that uses only plain vanilla Bitcoin transactions.

TL;DR

Bitcoin transactions transfer the right to spend a UTXO
Deanonymizing a few bitcoin addresses can lead to a severe privacy leak, since the majority of transactions can be traced back
According to Bitfury, at least 16% of all bitcoin addresses can be deanonymized
Bitcoin transactions may spend UTXOs belonging to different addresses in a single transaction
CoinJoin bundles output UTXOs from different parties into a single transaction, thus making the recipients indistinguishable
A major drawback of CoinJoin is that it requires another party to participate in the mixing transaction (albeit the parties may stay completely anonymous)

Back to basics: UTXOs

To begin with, let’s inspect in a bit more detail the inner workings of Bitcoin’s UTXO-based transactions. The transaction structure and signature scheme we describe are specific to Bitcoin (and, even more precisely, to its pre-SegWit version). Still, UTXO-based accounting is going to be our base model, as it’s adopted by all the coins we are going to consider in due course.

Instead of keeping a list of accounts with balances (like in Ethereum), the Bitcoin ledger contains information on which address owns each UTXO (unspent transaction output). When Alice transfers 1 BTC to Bob, she creates a transaction with some UTXOs she owns, say utxo_1 and utxo_2, as inputs and two new UTXOs, utxo_Bob and utxo_Alice, as outputs. The value of utxo_Bob is 1 BTC and utxo_Alice is the “change”.

When a mining node receives Alice’s transaction, it performs a straightforward validation check that

utxo_1 and utxo_2 have never been included in another transaction as inputs (i.e. they’re indeed Unspent TransaXion Outputs) and
the total value of the transaction inputs equals the total value of the transaction outputs (ignoring the miner fee).
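
The two checks above can be sketched in a few lines of Python. This is a toy model, not Bitcoin's actual data structures: the `UTXO` tuple, the `utxo_set` dictionary and the functions `validate` and `apply_tx` are names made up for illustration, amounts are in satoshis, and the miner fee is ignored.

```python
from typing import NamedTuple

class UTXO(NamedTuple):
    owner: str   # hypothetical label standing in for the lock on the output
    value: int   # amount in satoshis (1 BTC = 100_000_000 satoshis)

BTC = 100_000_000

# Toy ledger state: every output that has not been spent yet.
utxo_set = {
    "utxo_1": UTXO("Alice", 70_000_000),
    "utxo_2": UTXO("Alice", 50_000_000),
}

def validate(inputs, outputs, utxo_set):
    """Return True if the toy transaction passes both checks."""
    # Check 1: every input is listed once and is still unspent.
    if len(set(inputs)) != len(inputs):
        return False
    if any(name not in utxo_set for name in inputs):
        return False
    # Check 2: total input value equals total output value (fee ignored).
    total_in = sum(utxo_set[name].value for name in inputs)
    total_out = sum(out.value for out in outputs.values())
    return total_in == total_out

def apply_tx(inputs, outputs, utxo_set):
    """Spend the inputs and register the new outputs."""
    if not validate(inputs, outputs, utxo_set):
        raise ValueError("invalid transaction")
    for name in inputs:
        del utxo_set[name]
    utxo_set.update(outputs)

# Alice pays Bob 1 BTC and keeps 0.2 BTC as change.
tx_inputs = ["utxo_1", "utxo_2"]
tx_outputs = {
    "utxo_Bob": UTXO("Bob", 1 * BTC),
    "utxo_Alice": UTXO("Alice", 20_000_000),
}
apply_tx(tx_inputs, tx_outputs, utxo_set)

# Replaying the same transaction now fails: its inputs are already spent.
double_spend_ok = validate(tx_inputs, tx_outputs, utxo_set)
print("replay accepted:", double_spend_ok)
```

Note that nothing in this sketch checks who is allowed to spend an output; that is the job of the scripts discussed next.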

A trickier part is to ensure that only Bob has the right to spend utxo_Bob in the future. This is achieved by a clever mechanism of locking scripts attached to each UTXO upon creation. A script is a small program written in a special non-Turing-complete language with a small number of instructions. To include a UTXO as one of the inputs of a new transaction, the spender has to supply some data (the unlocking script) so that the UTXO’s locking script returns True. (Disclaimer: this is the pre-SegWit state of affairs. Modern Bitcoin transactions use a slightly different scheme. Old-school non-SegWit transactions are still valid, though.)

For example, the locking script of utxo_Bob (attached by Alice) can look like

<Bob's public key> OP_CHECKSIG

Note that it contains only public information, so Alice can attach this script to utxo_Bob since she knows Bob’s wallet address (which is, roughly speaking, his public key modulo hashing). In turn, Bob can verify that he is indeed going to obtain the right to spend utxo_Bob after the transaction is included in a block and the block is accepted. Omitting many technical details to be found e.g. here and here, the script above roughly dictates that in order to spend utxo_Bob as an input one should provide some additional data sig so that

return secp256k1.verify(sig, <Bob’s public key>, sha256(sha256(tx)))

evaluates to True. The point here is that only a person who knows Bob’s private key can produce sig. This is virtually guaranteed since sha256(sha256(tx)) is sufficiently random and ECDSA is sufficiently secure. For example, a hacker may try to find sig by combing through transactions previously signed by Bob. Such a naive attempt is doomed to fail: all hashes of the form sha256(sha256(tx)) are different. On the other hand, it’s a routine task for Bob to produce sig using his private key and any sequence of bytes, in particular sha256(sha256(tx)). So without any direct communication with Alice (without even being online!) Bob can safely accept bitcoins to his publicly announced address and rest assured that nobody but him can spend the funds.
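
To make the "all hashes are different" remark concrete, here is a minimal sketch of the double SHA-256 digest using only Python's standard `hashlib`. The byte strings are placeholders rather than real serialized transactions, the helper name `double_sha256` is an illustrative choice, and the ECDSA step itself is left out because it needs a third-party library.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Compute sha256(sha256(data)), the digest that gets signed."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

# Placeholder payloads; a real transaction has a specific binary serialization.
old_tx = b"Bob spends utxo_7 to pay Carol"
new_tx = b"Bob spends utxo_Bob to pay Mallory"

old_digest = double_sha256(old_tx)
new_digest = double_sha256(new_tx)

# A signature Bob made over old_digest is tied to that digest only,
# so it cannot be reused to authorize new_tx.
print(old_digest.hex())
print(new_digest.hex())
print("digests differ:", old_digest != new_digest)
```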

Bitcoin is secure but not anonymous

Let’s first describe the potential anonymity leaks a blockchain forensics analyst may exploit (in decreasing order of severity):

1) Match an address A with the real person/entity who controls the private key.
2) Provide evidence that the owner of A has transferred some money to B (perhaps through some intermediary transactions).
3) Provide evidence that the owner of an address A owns X amount of crypto (perhaps on multiple accounts).

Satoshi Nakamoto argued that since anyone can create a bitcoin address anonymously, 1) is not a serious privacy threat. This may have been true back when bitcoins were obtained mostly by mining. Nowadays, due to the ubiquitous KYC/AML requirements imposed on major crypto exchanges, many bitcoin addresses can be linked to real identities. Even worse, by merely analysing publicly available Bitcointalk and Twitter profiles it’s possible to extract quite a lot of information about address owners using data mining techniques. Combined with 2), this makes it possible to trace surprisingly many transactions. Bad news for those who thought that DarkNet means law enforcement remains in the dark about the transaction counterparties (sorry, Dread Pirate Roberts). Bitfury claims to have been able to deanonymize no less than 16% of all bitcoin addresses.

Note that while 1) requires at least some data scraping, the information needed for 2) and 3) is completely open (for usual Bitcoin transactions) and freely available (and this is often advertised as a feature of permissionless DLT).
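
As a rough illustration of how little effort tracing a transfer takes once the ledger is public, here is a sketch that searches for a chain of transfers between two addresses. The `ledger` list and the `trace` function are hypothetical; a real analysis would work on actual transactions with multiple inputs and outputs rather than on simple sender/receiver pairs.

```python
from collections import deque

# Hypothetical public ledger: (sending address, receiving address) pairs.
ledger = [
    ("A", "X1"),
    ("X1", "X2"),
    ("X2", "B"),
    ("C", "D"),
]

def trace(ledger, source, target):
    """Breadth-first search for a chain of transfers from source to target."""
    successors = {}
    for sender, receiver in ledger:
        successors.setdefault(sender, []).append(receiver)
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in successors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of transfers connects the two addresses

print(trace(ledger, "A", "B"))  # ['A', 'X1', 'X2', 'B']
print(trace(ledger, "A", "D"))  # None
```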

CoinJoin

At least for Bitcoin transactions 1) greatly depends on 2). Indeed, by simply moving funds from account A to some other account B (or multiple accounts, which is recommended), one should be fairly safe if it’s hard to prove that such a fund transfer has even been made. But if all Bitcoin transactions are stored in the open by design, how is it then possible to hide money transfers?

The underlying idea (CoinJoin) was suggested as early as 2013 by Greg Maxwell. There are quite a few commercial privacy solutions for Bitcoin; here is a comprehensive and up-to-date review. Most of them implement the idea of CoinJoin in some form, mainly because it does not require any changes at the Bitcoin protocol level.

So the idea is as follows. Recall that a bitcoin transaction is a list of UTXOs: inputs (being spent) and outputs. The sender provides an unlocking script for each input UTXO and puts a lock on each UTXO in the output. A transaction is valid if

[RULE 1] all inputs are successfully unlocked

[RULE 2] the total sum of inputs equals the total sum of outputs.

Note that these two rules say neither that all inputs should originate from the same address, nor that the outputs should be somehow connected. Thus, it is completely kosher to spend in a single transaction different UTXOs originating from unrelated addresses. Now assume that Alice wants to send 1 BTC to Bob and at the same time Carol wants to send 1 BTC to Dave. Both Alice and Carol would like to cover their tracks, so instead of having two traceable transactions

tx1: A->B; tx2: C->D

they can form a single transaction

tx: {
  inputs: [
    {1BTC; <Alice's Sig>},
    {1BTC; <Carol's Sig>}
  ],
  outputs: [
    {1BTC; <Bob's PK>},
    {1BTC; <Dave's PK>}
  ]
}

Now it’s impossible to say whether Alice transacts with Bob or Dave; same for Carol. The pseudo-code above is very far from being a valid Bitcoin transaction but conveys the trick of CoinJoin.
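
One way to see why the joint transaction hides the links is to enumerate every sender-to-recipient assignment that an outside observer cannot rule out. The sketch below is a toy model under loud assumptions: each sender pays exactly one recipient, amounts are whole BTC, and `rule_2` and `consistent_pairings` are illustrative names, not part of any Bitcoin library.

```python
from itertools import permutations

inputs = [("Alice", 1), ("Carol", 1)]   # (signer, amount in BTC)
outputs = [("Bob", 1), ("Dave", 1)]     # (recipient, amount in BTC)

def rule_2(inputs, outputs):
    """[RULE 2]: the total sum of inputs equals the total sum of outputs."""
    return sum(v for _, v in inputs) == sum(v for _, v in outputs)

def consistent_pairings(inputs, outputs):
    """All one-to-one sender -> recipient assignments with matching amounts."""
    result = []
    for perm in permutations(outputs):
        if all(i[1] == o[1] for i, o in zip(inputs, perm)):
            result.append({i[0]: o[0] for i, o in zip(inputs, perm)})
    return result

assert rule_2(inputs, outputs)
pairings = consistent_pairings(inputs, outputs)
for p in pairings:
    print(p)
# {'Alice': 'Bob', 'Carol': 'Dave'}
# {'Alice': 'Dave', 'Carol': 'Bob'}
```

Both assignments fit the public data equally well, so the ledger alone does not tell an observer which one actually happened.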

Greg Maxwell’s original explanation. Source

Drawbacks of CoinJoin

The major drawback of CoinJoin is the necessity to find a matching counterparty willing to participate in the “mixing” transaction. Without a sufficiently large number of participating addresses a clustering analysis may still reveal (or at least significantly narrow) the set of recipients.

Next, it is rarely the case that Alice sends Bob a round number of bitcoins. Making a small payment may require multiple mixing transactions with different counterparties, thus increasing the overall complexity while reducing the anonymity set (i.e. the set of transactions indistinguishable from Alice’s transaction).
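
A rough sketch of why unequal amounts hurt: when the payments bundled into one transaction differ in value, matching input values to output values singles out who paid whom. As before, this is a toy model that assumes each sender pays exactly one recipient; the amounts are in hypothetical units of 0.01 BTC and `consistent_pairings` is an illustrative helper.

```python
from itertools import permutations

def consistent_pairings(inputs, outputs):
    """All one-to-one sender -> recipient assignments with matching amounts."""
    result = []
    for perm in permutations(outputs):
        if all(i[1] == o[1] for i, o in zip(inputs, perm)):
            result.append({i[0]: o[0] for i, o in zip(inputs, perm)})
    return result

# Equal amounts: both assignments remain possible.
equal = consistent_pairings(
    [("Alice", 100), ("Carol", 100)],
    [("Bob", 100), ("Dave", 100)],
)

# Unequal amounts: the values give the links away.
unequal = consistent_pairings(
    [("Alice", 30), ("Carol", 170)],
    [("Bob", 30), ("Dave", 170)],
)

print(len(equal), "possible assignments with equal amounts")
print(len(unequal), "possible assignment with unequal amounts:", unequal)
```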

Both issues are addressed by Monero, ZCash and MimbleWimble, though in substantially different ways. Leaving the details for the next parts of the series, here is a small teaser:

Monero and ZCash (employing quite different methods) use the very unintuitive fact that, roughly speaking, it is possible to validate both [RULE 1] and [RULE 2] knowing neither the public keys nor the amounts being transferred. Monero is essentially CoinJoin on cryptographic steroids (with blinded UTXO values and decoy counterparties). ZCash uses zk-proofs thus generating a larger anonymity set.
MimbleWimble is a completely different beast, even though the underlying cryptographic primitives are similar to Monero’s. MimbleWimble does not even have private keys and addresses in the usual sense. Instead, the right to spend a concrete UTXO is realised in an intricate yet surprisingly efficient (storage- and processing-wise) way.

Disclaimer: all inaccuracies are solely mine (but I try hard to not bs).

If you liked the article follow me on Twitter: @dizhel where I occasionally post crypto stuff.