The Role of Dandelion Routing in “Breaking Mimblewimble’s Privacy Model”
--
Giulia Fanti (CMU), Andrew Miller (UIUC, IC3), Pramod Viswanath (UIUC, Trifecta)
Ivan Bogatyy recently wrote a post detailing a linkability attack on the Grin network (framed as a more general attack on the Mimblewimble privacy model), in which he was able to link the inputs and outputs of 96% of all transactions. This attack uses a supernode to connect to a large number of nodes (in this case, 200 out of 3,000 total nodes). This supernode collects relayed traffic from network nodes in order to determine the inputs and outputs of transactions before they are joined. In the attack, Bogatyy shows that Grin’s use of the Dandelion routing protocol does not significantly reduce his ability to link these keys.
This post is meant to clear up a few points about Dandelion.
What is Dandelion and how does it relate to this attack?
Dandelion is a routing protocol for cryptocurrency transactions designed to reduce the probability of a network observer linking a transaction to the sender’s IP address. Dandelion passes transactions along a randomized sequence of relay nodes (“stem phase”) before broadcasting (“fluff phase”), which hides the source IP address of a transaction’s sender.
Just to be very clear about Dandelion and its guarantees: the guarantees are not cryptographic but statistical, and they depend on the fraction of adversarial nodes in the network. In the case of a network adversary who corrupts a large fraction of nodes p, IP addresses can be linked to transactions with probability p. However, as we show in [1], no algorithm can do better (assuming the adversary can tell that two versions of the same transaction are indeed the same). So we can never expect a routing algorithm alone to provide perfect unlinkability between transactions and IP addresses. Notice that Bogatyy’s attack works in a slightly different adversarial model, where it sets up a single supernode. Dandelion is designed to be robust to this setting by forwarding messages in the stem phase only on outbound connections, or connections that the node itself originated. So even if a supernode connected to many nodes in the network, it would have relatively few nodes creating outbound connections to the supernode, thus reducing the likelihood of the supernode being on the randomized “stem” path.
Dandelion was not designed to prevent linkage of input and output transactions in Mimblewimble. Hence on its own, it cannot prevent Bogatyy or other attackers from inferring such pairs. This is an important distinction for practical reasons. IP addresses can be linked to physical location, and the goal of Dandelion is to protect these physical locations; not the virtual identities protected by public keys (or input-output transaction identifiers).
As we see in Bogatyy’s analysis, Dandelion does not prevent input-output linkability. Bogatyy states that Dandelion prevents a supernode from linking a transaction’s inputs and outputs if “two transactions both intersect in their Dandelion path before [the supernode sees] either of them”. However, Bogatyy rightly observes that this event is unlikely, because Dandelion relays do not wait for incoming transactions to aggregate them. Hence, it is often possible for the supernode to see a transaction before it is combined with other transactions, thus allowing it to determine the input-output relation of a transaction. It should be mentioned that this issue is well-known and was documented by the Grin developers.
What can we do about it?
Although sender-receiver linkability is not the problem Dandelion set out to solve, it is an important problem in its own right! We agree with Bogatyy that it is important to evaluate how we can solve this problem as well.
A first-order solution to this problem might require out-of-band aggregation of transactions prior to broadcasting them on the P2P network. This poses substantial logistical challenges and is unlikely to materialize.
That being said, it may be possible to extend Dandelion to address these linkability attacks by incorporating a mix-network-like [2] functionality into the relaying protocol. That is, Dandelion relays could wait to aggregate a critical mass of transactions before relaying. This would reduce the probability of the supernode observing transactions before they have been aggregated, at the expense of transaction broadcast latency. Grin currently does a version of this, but the degree of aggregation appears to be too mild at current traffic levels to provide large anonymity sets. This would likely improve over time as adoption grows. Of course, any such approach would still be susceptible to an adversary who manages to corrupt a substantial fraction of the network peers, which is generally a very difficult adversary to protect against.
Another possibility is to introduce decoy transactions as done in Beam. Roughly, their approach is to add fake decoy inputs and outputs during the stem phase so that if a supernode does receive the transaction, it is less obvious how to link inputs to outputs.
In both cases, understanding and analyzing the design space more formally is an interesting and relevant research question.
Concluding thoughts
Bogatyy’s post highlights an important point that is often overlooked: network privacy vulnerabilities can undermine privacy guarantees at the consensus layer. To the Grin (and Beam) developers’ credit, they acknowledged this point early on.
In addition, Bogatyy’s attack illustrates the benefits of co-designing networking protocols with associated consensus-layer protocols. In this case, Bogatyy’s attack was possible in part because Dandelion (which was originally designed to hide IP addresses) is also being used in part to hide input/output pairs. In other words, a system without Dandelion would be susceptible to the same attack, possibly with even higher accuracy rates. Adding Dandelion provided the intended protection against IP address inference, but limited protection against input-output linkability.
Does this mean that Mimblewimble is fundamentally flawed? Not really. First, Mimblewimble is still a useful tool for compression, and the attack and post only address its privacy properties. Second, even with regards to Mimblewimble’s privacy properties, it may be possible to co-design new P2P network architectures to protect against such attacks. Either way, more research is needed to answer this question.
Acknowledgments
The authors would like to acknowledge valuable feedback and suggestions from a number of individuals, including Ivan Bogatyy, Claire Le Goues, Daniel Lehnberg, Quentin Le Sceller, Heather Miller, Antioch Peverell, and Justine Sherry.
References
[1] S.B. Venkatakrishnan, G. Fanti, P. Viswanath, “Dandelion: Redesigning the Bitcoin Network for Anonymity”, ACM Sigmetrics 2017.
[2] D. Chaum, “Untraceable electronic mail, return addresses and digital pseudonyms”, Secure electronic voting, Springer, Boston, MA, 2003.