An empirical study of privacy, scalability, and latency of Nym mixnet
While working on Loopix, a mix network design that Nym adopts, we developed a discrete event mix network simulator to empirically measure the anonymity provided by our design. It is the first such simulator that allows evaluating the anonymity metrics (end to end entropy [2, 3] and sender-receiver third-party unlinkability ) given different network parameters, like network topology, number of users, sending rates, mixing delays, etc. This allows us to analyze how different parameters and setups impact network anonymity and performance.
In this blog post, we present some of the studies we performed to analyze the scalability, anonymity, and latency trends of the Nym mixnet, and compare it with other approaches like fixed cascade topologies or P2P networks.¹,² The source code of our simulator can be found here. This simulator is a part of on-going research and a full publication will be available soon!
Anonymity trilemma: Scalability, Anonymity, and Latency
In our analysis, we consider a 3-layered Nym mixnet, where each node has a capacity to process on average 1000 packets per second (this assumption is compatible with the current Nym testnet performance).
How do we measure anonymity?
Let’s start with explaining the metric we use to measure anonymity. The goal of anonymous communication networks is to hide the information who is communicating with whom by hiding the network metadata. Depending on the a priori knowledge and capabilities of the adversary we can use different metrics. In this blog post, we consider a global network eavesdropper who has no a priori knowledge. A common measure of anonymity in such a case is the anonymity set, which reflects the size of the set of other packets with which our message can be confused by the attacker. However, a global passive adversary eavesdropping on the network and all interactions within it can assign different probabilities for each outgoing packet being linked to the observed incoming packet. Different probabilities of different members of the anonymity set reveal a lot of information to the attacker. Therefore, we quantify the anonymity empirically by using the concept of Shannon entropy, an information-theoretic metric [2, 3]. Entropy is a measure of randomness, or in other words, the unpredictability of information in a random variable. The more likely an event is, the less information the variable contains. On the other hand, the increase in entropy is an increase in uncertainty. For example, the fact that I make a phone call every day at the same time contains very little information, thus the entropy is high. However, if I suddenly make several phone calls to a lawyer, that might signal that I need legal advice, thus that event would contain lots of information (and the entropy would be small). In terms of an anonymous network, the harder it is to link the packets to their source (i.e., the more confusing for the adversary are his observations), the higher the entropy.
Another metric to measure anonymity given that the adversary has some a priori knowledge is the sender-receiver unlinkability. We won’t cover it in this blog post, but you can read more about it here (see section 4.3).
Anonymity loves company, hence scalability is one of the key properties of a mix network.
Following the Loopix design, Nym arranges mix nodes into a Stratified topology, in which nodes are grouped into layers and each node in layer i is connected with each node in the previous and next layer. Such topology scales horizontally, thus the Nym mixnet scales easily by adding more nodes to support a large user base without a negative impact on the end-to-end latency. Critically, the privacy provided by Nym increases as the network grows, i.e., the more traffic flows into the network, the larger the anonymity:
Traditional Chaumian mixnets use single cascade topology and the batch-and-reorder mixing technique, thus they scale very poorly and impose large latency overhead when the volume of traffic grows. As our analysis shows, traditional Chaumian mixnets impose high latency overhead not only when the volume of traffic is high, but also when it is low since each node waits a long time to collect a full batch for processing. Thus, the users have no control or prediction on the end-to-end latency of their packets.
An alternative approach to single cascades is to use multiple parallel cascades (Chaum’s latest cmix design). Such an approach allows the network to scale, however, the anonymity is still always constant and limited to the fixed size of the batch. Thus, an increasing number of users does not increase the anonymity of the network. In contrast to multi cascades mixnets, Nym provides much better anonymity for large volumes of users, while providing orders of magnitude lower E2E latency.¹
In contrast, P2P networks, like HOPR, although scale easily, provide very poor anonymity despite an increasing number of users. As the empirical study shows, for the same number of users and end-to-end latency Nym provides much larger anonymity than networks with P2P topology.¹,²
As we saw, while the privacy offered by the Nym network increases when the user base grows, the P2P networks offer very poor (constant) anonymity even if the number of users increases.
One way to improve the level of anonymity is to deploy cover traffic. However, as our analysis shows the P2P networks like e.g., HOPR needs huge volumes of cover traffic to reach the level of anonymity offered by Nym:
For example, as the above plots show in a P2P network with 100K users, each user must produce 10 cover packets for each 1 real packet they send to achieve similar anonymity as the Nym mixnet achieves for 1.6K users without any cover traffic, for the same end to end latency! While the network scales, a P2P approach will need more and more cover traffic to provide any reasonable anonymity, requiring the peers to be deployed on powerful machines.
Nym mixnet deploys two types of cover traffic: loop cover traffic generated by users and loop cover traffic generated by mix nodes. You can read here why we need different types of cover traffic. Critically, in contrast to other designs, Nym uses tunable cover traffic, meaning that for small numbers of users, we can increase the number of cover packets to achieve desired anonymity. Once the number of users increases, and hence the volume of real packets flowing into the network, we can tune down the cover traffic as desired without sacrificing anonymity!
To illustrate that, we analyze how much cover traffic is needed in the Nym mixnet given changing numbers of users, assuming that our minimum desired entropy is approx 10.
As depicted, while for a small number of users we require high volumes of cover traffic, this amount quickly declines as the number of users increases. In our experiment for 5000 users, we already didn’t need any cover traffic to provide the desired anonymity (although we of course might want to use some cover traffic to boost the anonymity even more or hide our communication patterns). Imagine if the number of users grows to 1 million or 10 million! Therefore, the more users that join the Nym network, anonymity increases, and cover traffic becomes less necessary. This means the network scales privacy efficiently, unlike a simple-minded P2P design!
More traffic only speeds up the Nym mixnet
As we have shown earlier, thanks to the horizontal scalability Nym can support millions of users. As more users show up, you can scale up Nym by adding new mix nodes — just like you can scale up websites by adding new servers. This is crucial, as the increasing number of Nym users allows to tune down the mean parameter of the exponential distribution, and thus decrease the end-to-end latency, without sacrificing anonymity. As a result, the Nym mixnet becomes ‘faster’ as less per node mixing delay is required. In a nutshell, the more users join the network, not only is everyone more anonymous, but everyone benefits from higher speed!
Our simulator allows us to empirically measure the impact of various network parameters and user bases on the anonymity and performance of the Nym mixnet. As has been shown, Nym is the most scalable and anonymous mixnet design at the present moment. In order to get the concrete benefits of anonymity and speed produced by increasing usage, Nym provides a general-purpose mixnet that can support a large user base and a wide range of applications and services. P2P networks, although they scale well, offer very poor anonymity unless each peer generates large volumes of cover traffic or the mixing delays are high. However, this significantly limits their applicability to real-world networks. In addition to that, P2P networks are vulnerable to several attacks which significantly undermine users’ privacy.
This simulator code is a breakthrough as it allows researchers to test a wide variety of parameters across different mixnet designs, and produce empirical results that can validate or invalidate their intuitions as well as be used to create a “real world” mixnet with parameters that are acceptable to end-users. In the Nym testnet, we will be using this simulator to fine-tune mixing delays and cover traffic over the next few months, in order to get end-users the best possible anonymity and performance. Without parameters set correctly via simulation and real-world user traffic even mix nets based on the Loopix design, such as the codebase Katzenpost -as well as its fork - offer questionable privacy to users.
 “The Loopix Anonymity System“ by A.M. Piotrowska et al.
 “Towards an Information Theoretic Metric for Anonymity” by A. Serjantov and G. Danezis
 “Towards measuring anonymity” by C. Diaz et al.
¹ About the experiments: To make our analysis fair we assume the following:
I) In the simulated P2P networks we assume that each peer has a similar processing capacity as the Nym testnet nodes (i.e., single node processes ~1000 packets per second). Moreover, we assume that the simulated P2P network also uses the Sphinx cryptographic packet format for onion encryption and the same mixing technique. However, we stress that Nym mixnet is the only currently developed mix network design that combines the Sphinx packet format and the continuous mixing technique (as proposed by the Loopix design). Other currently developed systems use different onion encryption schemes and/or mixing techniques.
II) For the batch-and-reorder technique we assume the batch size of 1000 packets. We also note that the cmix/Elixir design does not use the Sphinx cryptographic packet format to onion encrypt the packets, but a different scheme.
² We assume a fully connected P2P network. In reality, such connectivity is infeasible and as the network scales up it becomes more sparse due to limits on the number of connections that each peer maintains simultaneously. Thus, the actual anonymity will be even weaker and various attacks can be deployed to trivially de-anonymize the users, even by much weaker adversaries than the global passive adversary.