Tracing Cryptonote ring signatures using external metadata
Background reading — “Why metadata matters?” by EFF.
What is an attack?
As we are heading into more technical topics of Cryptonote tracking, I will be using term “attack” for anything that can be used to deanonymise ring signatures. It is important to know that, similar to most practical attacks against cryptography, those attacks don’t tackle cryptography directly. Instead they rely on how transactions and people making the transactions interact. In fact this attack is a simple application of textbook forensic techniques to ring signatures.
As no single attack that I’m aware of is an “I win” button, it is important to understand their relative strengths and weaknesses.
What are the general properties of metadata analysis?
A single expression that I would use to describe is “churn killer”. Since the anonymity set provided by a ring signature is fairly small, a very naive and stupid advice would be “just send money to yourself a couple times”. Metadata attack turns churning into incriminating evidence in a scenario where you are trying to prove beyond reasonable doubt that a transaction occurred between Alice and Bob.
Another interesting property of metadata analysis is that larger ring sizes are more incriminating. It can be only countered with smarter output selection. For one such idea, see section 6.2 here.
How does it work in detail?
Let’s say that Bob is running a shop. The state does not like what Bob is selling, so one beautiful morning Bob finds himself in his underwear in a police station. Bob decides to cooperate.
Alice doesn’t know Bob, and only transacted with him once. Alice thought she is smart. She knew that she would be in very hot water if she just sent money from an exchange to Bob due to Knacc attack, so she thought she will send the money a few times to herself to increase the anonymity set exponentially.
Unfortunately for Alice, Bob has given her delivery address as a part of his plea deal. This is not a court-grade evidence at this point. People make all sorts of stuff up if you give them an incentive (Google Rafid Ahmed Alwan al-Janabi). Next step the authorities take is to run Bob’s list against exchanges to see who has withdrawn money. This is still not evidence — Bob could be simply framing a person he doesn’t like. We need a proof that the money that Alice withdrew ended up with Bob.
Unfortunately for Alice, her ISP is obliged by law to log her activity. This makes it trivial to construct intervals when she was online.
Now we switch scenes to a courtroom. Alice is in the dock. Foxtrot, a computer forensics expert witness sits in the witness box.
F: As I explained in my report, presented to the Court as Evidence Exhibit A, Alice has used a protocol that allowed her to pick any number of fake inputs along with one real input to obscure her transaction.
QC: The number of possible fakes is in thousands. How can the Jury be sure that she is the one real actor and not 999 or more innocent bystanders?
F: As you can see in Appendix A to my report, Evidence Exhibit B, Alice picked two other innocent bystanders for each of her series of transactions. On level one, the most recent, there is only one output that was made while her ISP reported that she was online. On the next level there is also only one out of three — this means chance of both events occurring at the same time is much smaller. On the third level the chance that events “Alice is online” and “An output is generated” not correlated is very small. We very quickly arrive at a situation where it is a mathematical certainty that Alice sent money to Bob.
Evidence Exhibit B
What can be done to prevent it?
First of all let’s get one thing out of the way. No amount of real-time traffic obfuscation will put you in the clear here. It does not address the root issue — that your activity and transaction happening are temporally correlated.
In Monero you are double-screwed. It has a non-constant fee that will leak information on when you signed the transaction, even if you delay its broadcast.
Smarter output selection (see the paper I linked earlier) will help a lot. That favours very heavily outputs that are also temporally correlated to the real output.
Finally the real solution is to have protocol level way whereby the broadcast can be delayed while keeping the transaction anonymous.
We will be implementing both approaches in Ryo.