Diogenes Octopus* : Playing Red Team for Eth2.0 VDF

Part 1

Omer Shlomovits

Published in

Zengo Wallet

10 min readJun 18, 2020

🐙Background🐙

Diogenes (paper) is an ambitious project to design and run a “ceremony” to generate an RSA modulus. The Ceremony is a multiparty computation (MPC) protocol of an unprecedented scale. Once completed, the generated modulus will be incorporated in a VDF protocol, which will be used as part of an unbiased random beacon in the Eth2.0 blockchain.

The Ligero Inc team leads the project. LigeroRSA is the code repository implementing Diogenes. It is 21,000 c++ lines of code, not including external libraries.

The VDF Alliance and the Ethereum Foundation have requested us to review Diogenes as a real-world cryptographic system. We have teamed up with prof. Claudio Orlandi and prof. Peter Scholl for this project. In this blog post, we illustrate one attack vector and explain how it could have been exploited to achieve a backdoor into Ethereum 2.0’s VDF. While the security proof of the core protocol is sound, the attack becomes possible when considering a set of optimizations we describe below.

EF’s Justin Drake slide from Devcon IV. We can estimate that the Diogenes project is in the works for at least 18 months.

The Review Process

Our team is only a small piece of the red team assembled by the EF/VDF-A. All of the reviewers contributed to the discovery described in this blog. The review team communicates daily, looking at the code, paper, and other supplementary material. We meet weekly with the Ligero team to discuss potential issues. There is no rock left unturned. The Diogenes paper and its respective LigeroRSA code details the core protocol with basic security guarantees. Our goal here is educational as we believe the attack we illustrate shows how delicate and complex this project is. It is also an excellent opportunity to appreciate the world-class teams working on this problem — both blue and red.

The Problem Statement

An RSA modulus N is the product of two primes, p,q. In classical RSA key generation, the primes p and q are basically the private key while N is the public key. The goal of Diogenes is to generate a modulus N such that no one knows its prime factors p,q.

Solution Requirements (high level)

A ceremony (one time run) protocol that allows n>1000 parties (participants) to generate N jointly. It’s required that even if an adversary controls n-1 parties, it would still be impossible for the adversary to learn the secrets p,q. The protocol must be “verifiable”, which means that its correctness can be tested by any outside observer/auditor with access to the ceremony transcript.

Worst-Case Scenario

An adversary with control over a small number of participants (<<1000) can learn the secret prime factors p,q of N without raising any alarms with all other parties/observers. This kind of backdoor is what we illustrate here.

🐙The Ligero Proposed System and Model🐙

Computing RSA moduli in a distributed way is a problem studied for at least 20 years in the academic literature. However, prior solutions are not practical at a large scale and with malicious security. Needless to say that none of the previous protocols run in a production-grade environment. Ligero’s Diogenes provides multiple breakthroughs. We refer interested readers to the introduction of the paper and\or the excellent SBC20 talk: https://youtu.be/BXLcKQ6fLsU?t=1490

Here, we will just mention the relevant ideas:

Coordinator: To save on the heavy communication between low-resource parties, Diogenes works in a star topology with a Coordinator party at the center. The coordinator communicates with each party and only makes public computations (can be done by anyone) — for example, aggregating values from all parties and sending back the sum.
Homomorphic encryption: Homomorphic encryption provides a way to compute over encrypted data. In Diogenes, this is useful to let parties send encrypted values to the coordinator such that there is no risk of data leaking. Yet, the coordinator can aggregate and compute over the data.
Commit and Prove: To make sure cheating parties get caught. A zero-knowledge proof is used to prove to some verifier that all parties followed the protocol without exposing any of the parties’ secret data. It’s called Commit & Prove because, at the beginning of the protocol, the parties cryptographically commit to all the randomness they plan to use. In the end, they prove that the committed randomness was indeed used (without exposing the randomness).

The Protocol in a Nutshell

We give a birds-eye view of the actual protocol. The protocol progresses through a total of twelve rounds between each party and the coordinator.

The parties jointly generate an encryption key for the homomorphic encryption scheme.
Using the encryption scheme for a multiplication protocol, the parties generate candidates for N.
The parties run tests on the candidates to filter the candidates that are not a product of two primes.
The parties take the first candidate from what’s left after filtering, and with a zero-knowledge proof demonstrate all tests passed according to the protocol.

Cryptographic Commitment

In our context, a commitment is a cryptographic tool. We require it to be “binding” such that once a value is committed it cannot be changed, and “hiding”, such that it is impossible to distinguish between two commitments.

Coin Toss Sub-Protocol

This is a simple protocol, yet it’s crucial to the understanding of our attack. A coin toss protocol is a distributed protocol to generate a random string. In its simplest form, it can be done by letting each party sample a random string, sending it to the coordinator, and then having the coordinator output a sum of all random strings. However, In a star topology as we have in our case, a “rushing” adversary controlling a single party and the coordinator, can decide to wait until n-1 random strings get sent to the coordinator. The adversary can then choose a random-looking string, dependent on all other parties’ random strings, such that the result looks completely random and cannot be distinguished from an actual random string.

To overcome this, we require the parties to cryptographically commit to their random strings and later send them. It can be verified that each party’s random string is indeed in accordance with the previously submitted commitment.

Optimizations

There are many tricks at the protocol and the code level to make this process more efficient. In our attack, we take advantage of some of these optimizations. Here we describe only the relevant ones.

When implementing the Commit & Prove, the parties don’t need to commit to all their randomness. As shown in the paper, it’s enough to commit to only a portion and then prove constraints on how the computation was performed.
All required commitments for coin tosses in the protocol get committed at the first step of the protocol. Each time a coin toss is needed, each party sends the designated random string, which will get verified against the commitment from the first round. Since this counts as a verification for the coin toss, the randomness used is not part of the Commit & Prove.

The Attack

We now have all the ingredients to describe a potential attack.

The first coin toss used in the protocol is to generate a random number, which is then used as a part of the public key in the homomorphic encryption scheme. As all protocol operations are performed on ciphertexts, the key generation is the first step. To be specific, we are dealing with “threshold” key generation. Its purpose is to generate a global public key that can be used by all parties. However, for decryption, all parties must collaborate. If we sketch the relevant parts in the protocol up until this point:

Registration: As part of registration, each party sends a hash-based commitment for its local partial public key aᵢ.
Coordinator Ack: The coordinator saves hash(aᵢ) and sends back ACK to the party.
Keygen start: The next message from the party is part of the key generation and specifically sending (and de-committing) to aᵢ.
Coordinator verify and compute key: The coordinator aggregates all aᵢ’s. Checking correctness of de-commitment from each party. The coordinator replies with the aggregated value “a”.

The attack assumes a malicious party j that controls the coordinator as well.

The attacker:

Waits for all n-1 honest parties to commit to aᵢ (message 1 above).
Sends the ACK message (message 2 above).
Receives from n-1 parties the de-commit to aᵢ (message 3 above).
Picks special aⱼ (we discuss later how to pick).
Registers hash(aⱼ). Use aⱼ as de-commitment from party j.

The attacker can now control what will be the encryption key that is going to be used for the entire protocol. Here’s how this can be used to place a backdoor.

🐙Backdoor!🐙

One of the hardest parts of attacking real-world systems is finding an exploitation. In our case, the public key can be manipulated, but then what? We must first understand the details of the encryption scheme used in the protocol.

This scheme is called Ring-LWE (RLWE) where the public key is a pair (a,b) and the private key is a secret s such that: b= as + e, where e is a small noise term. To encrypt a message m, we sample small noise elements x,y,z and compute: (c₁,c₂) = (a·x+y, b·x+z+m)

Decryption is then carried out given the secret s by computing: c₂ — s·c₁ and recovering the message (leaving only a small noise that can be rounded). Now, for simplicity, let’s see what happens if the attacker picks a=0. In this case: (c₁,c₂) = (y, e + z + m). Therefore the message could be extracted directly from c₂ without knowing the secret key. The problem with picking a=0 is that “a” is a public value. Any party/observer that will look at the transcript will catch this issue and probably never use the resulting modulus N.

To make this go undetected, we need to make “a” look random. Luckily, someone found a way to do just that. A 2015 paper described how to engineer “a” (see section 3 there) with a technique based on the NTRU cryptosystem. It’s complicated, so we’ll skip the exact details. However, for completeness, we will say that “a” should be picked as a multiplication of two small numbers: a= f· g^{-1}. Such an “a” will be random-looking to anyone else but the party who generated it.

🐙Why This is Possible🐙

It is essential to understand why this attack vector is made possible. As we mentioned above, this is a real-world protocol on a scale of >1000 participants. Therefore, many design choices were inspired by performance bottlenecks. One of which is to not include the randomness for the coin toss of the public key in the Commit & Prove. Another one is to assume the coordinator is playing honestly as it can only operate on public values. Finally, no bulletin-board (blockchain) was used to record the messages.

Our attack would be possible simply because the coordinator can change the order of messages within a round, undetected. It is part of a broader issue in the coordinator model that can delay messages for parties arbitrarily.

We have raised the issue to the Ligero team and they provided us with the following clarification:

The basic model described in the paper and the implementation assumes the coordinator is “semi-honest” (i.e. follows the protocol as prescribed, but can share information with an adversary that has corrupted the participants). The coordinator’s role in the computation is to relay messages and perform public (cost-saving) operations. In particular, the security of the protocol does not require any information at the coordinator’s end to be kept confidential. It is pointed out in the paper that the protocol’s security guarantees can be extended to tolerate an “untrusted” coordinator (that can deviate from the prescribed protocol) if the public operations by the coordinator are “verified”. This blog post describes why it is crucial to tolerate an untrusted coordinator by demonstrating a concrete attack.

There are inefficient solutions that protect from this attack, requiring sending all the messages to a bulletin board such as a blockchain or using zero knowledge proofs to prove correctness. We are jointly working on possible efficient solutions to the paper and code that improve the security of the system.

Concluding Thoughts

While attacks of these types were explicitly outside of the threat model Ligero designed the protocol under, this practical example of an exploit shows the importance of considering all possible threats in high-value MPC ceremonies. For example, If the execution of this version of the protocol had occurred, it would have been enough for one of the >1000 participants to try to access the innocent coordinator (perhaps to operate it). By changing the order of messages from the first round, this attacker could have injected a backdoor that enabled them (and only them) to read all the protocol’s encrypted messages. For the rest, the protocol would terminate successfully, making the output modulus N the modulus used in the Ethereum 2.0 random beacon chain. The attacker, able to decrypt all communication, learns all the local inputs of the other participants and can compute the p,q, the prime factors of N.

Acknowledgements 🙏

We wish to thank the Ethereum Foundation, VDF Alliance, the Ligero team, Dmitry Khovratovich, Bernardo David, Riad Wahby, Mary Maller, Claudio Orlandi, and Peter Scholl.

Meet us at the next BlackHatUSA2020 event where I will be giving a joint talk with JP titled: Multiple Bugs in Multi-Party Computation: Breaking Cryptocurrency’s Strongest Wallets

* The meaning of the “Octopus” in the name is two-fold: (1) The coordinator is an octopus controlling all his arms (parties). (2) According to some theory, Diogenes died from eating an octopus.