How to — And How Not To — Use Data Availability Committees
By Steven Goldfeder & Daniel Goldman
One of the biggest challenges in the design space of scalable blockchain systems is the need to guarantee data availability. If some actor can advance the state of the system and publish the resulting state hash but withhold the underlying transaction data, then this essentially locks out other participants who cannot interpret or further update the state. Indeed, even in systems in which one must prove that state update hashes are valid, this proof will not be very useful without the underlying data — you may be guaranteed it’s a hash corresponding to some valid state, but without the data, the system will be essentially bricked.
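To make the problem concrete, here is a minimal sketch in Python. It assumes a toy ledger of account balances and a SHA-256 hash over its JSON encoding; both are illustrative choices, not how any particular chain encodes its state, and the function names are hypothetical.

```python
import hashlib
import json


def state_hash(balances):
    """Hash a toy ledger; sort_keys makes the encoding deterministic."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_transfer(balances, sender, recipient, amount):
    """Return the next state. This needs the full balances, not just their hash."""
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    updated = dict(balances)
    updated[sender] -= amount
    updated[recipient] = updated.get(recipient, 0) + amount
    return updated


# The operator holds the data and can advance the state.
operator_state = apply_transfer({"alice": 10, "bob": 5}, "alice", "bob", 3)
published_hash = state_hash(operator_state)

# An observer who sees only published_hash can check a candidate state against it...
candidate = {"bob": 8, "alice": 7}
matches = state_hash(candidate) == published_hash
# ...but without the underlying data there is nothing to pass to apply_transfer,
# so the observer cannot produce the next valid state.
```

The hash commits to the state, but it is not a substitute for it: every further update in this sketch takes the balances themselves as input.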
The simplest way to avoid the data availability problem is to post all of the data on-chain, so the blockchain itself guarantees the availability of that data. This is of course the approach taken by Layer 1 systems like Ethereum, but it is also the defining feature of rollups. Rollup protocols (both optimistic variants like Arbitrum Rollup and ZK-Rollups) put all of the underlying data on-chain. While this avoids the data availability problem entirely, it also reduces the scaling potential, as the rollup is bound by the data capacity of the base layer.
In an ideal world, we could keep the data off-chain without having to sacrifice trustlessness; protocols with this property have been explored in the loosely defined design space known as Plasma. But alas, the world is not ideal, and it turns out that Plasma constructions have proved very complex and involve more fundamental limitations and trade-offs than was initially apparent.
An approach that’s been discussed more recently, mostly in the context of StarkWare’s StarkEx, is to keep data off-chain and simply rely on a trusted committee for availability. (If off-chain data + trustlessness gives us a Plasma chain, we might call off-chain data with trust assumptions an “LED chain.”) The idea, implicitly, is to have laxer trust requirements than Plasma constructions while still retaining some benefits over simply trusting a committee entirely.
In this post, we’ll argue that the general notion of using committees for data availability, and StarkEx’s instantiation specifically, occupy an awkward point in the design space. In other words, it makes sense to give the committee either more power or less power, but a system with a committee just for data availability has all of the downsides of relying on a committee without taking advantage of most of the benefits that a properly utilized committee could provide.
In our original Arbitrum paper, we proposed Arbitrum Sidechains, which rely on a committee. Below, we’ll discuss our design and the lessons we learned about committee-based protocols.
[Comparison table not reproduced here; its footnotes follow.]
(* Would need K honest in order to support smart contracts.)
(** Arbitrum Sidechains assume that there will never be a full challenge period in which all honest nodes are entirely unavailable.)
(*** We’re assuming a more traditional committee, as opposed to the particular instance described in StarkWare’s post.)
1. Take Advantage of the Committee’s Power
In a protocol class that’s recently been dubbed “Validium”, a committee is trusted for data availability, and validity proofs of each state transition are verified on chain. These proofs ensure that any state transition is valid; the committee cannot, for example, directly drain all of the users’ funds to its own address. What the committee can do, however, is kill the chain’s liveness entirely; i.e., freeze all funds on the chain indefinitely. This situation, it turns out, is just about as bad as outright theft: with the looming threat of permanent liveness failure, the committee can extort users into paying a ransom; rational users are effectively coerced into giving up the majority of their funds.
In short, with or without on-chain zero-knowledge proofs, a malicious, colluding committee is a complete disaster. So why bother with validity proofs at all?
In the Arbitrum Sidechains model, since we forgo these unnecessary constraints, we can take advantage of the committee’s power. When the committee comes to consensus on a state update, we can consider it confirmed without having to interact with the base layer at all; this gives users the nice UX of fast finality, and lets us avoid the gas costs of interacting with the L1, as well as various costs associated with other L2 protocols (e.g., proof generation in ZKP-based systems and withdrawal delay in optimistic-style systems).
So long as we’re trusting a committee anyway, why not let it do what committees do well?
2. Have a Backup Plan
An important consideration for committee-based protocols is how they handle a liveness failure, a situation in which committee members go unresponsive and stop processing state updates. StarkEx describes the system transitioning, in the case of a liveness failure, into an emergency state in which the chain is halted and only processes withdrawals from that point onward.
With Arbitrum Sidechains, if the committee stops following the “plan A” of coming to consensus and updating the chain state off-chain, a committee member can trigger “plan B.” In “plan B”, the sidechain seamlessly reverts to “rollup mode”, in which all data is posted on-chain. This is more costly and slows the sidechain down to the latency of the base layer, but ensures that a data withholding attack is no longer possible. This fallback gives Arbitrum Sidechains stronger guarantees that the chain can keep on marching forward. And once the committee is ready, the chain can transition seamlessly back to the fast “plan A” mode.
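The switch between the two modes can be pictured as a small state machine. The sketch below is a toy model only: the class and method names are hypothetical, and the rule for returning to plan A (every member reports ready) is our own simplifying assumption, since the text only says the switch happens once the committee is ready.

```python
from enum import Enum


class Mode(Enum):
    COMMITTEE = "plan A"  # updates agreed off-chain by the committee
    ROLLUP = "plan B"     # all data posted to the base layer


class Sidechain:
    """Toy model of the plan A / plan B switch (all names are hypothetical)."""

    def __init__(self, committee):
        self.committee = frozenset(committee)
        self.mode = Mode.COMMITTEE
        self.offchain_updates = []  # confirmed by committee consensus only
        self.onchain_data = []      # data posted on-chain while in rollup mode

    def trigger_fallback(self, member):
        # Per the text, a committee member can trigger plan B.
        if member not in self.committee:
            raise PermissionError(f"{member} is not a committee member")
        self.mode = Mode.ROLLUP

    def resume_committee_mode(self, ready_members):
        # Assumption: return to plan A once every member reports ready.
        if frozenset(ready_members) == self.committee:
            self.mode = Mode.COMMITTEE
        return self.mode

    def process(self, update, committee_agreed=False):
        if self.mode is Mode.ROLLUP:
            # Slower and costlier, but the data cannot be withheld.
            self.onchain_data.append(update)
            return "confirmed on-chain"
        if committee_agreed:
            self.offchain_updates.append(update)
            return "confirmed off-chain"
        return "pending"


chain = Sidechain({"a", "b", "c"})
first = chain.process("tx1", committee_agreed=True)  # fast path
stalled = chain.process("tx2")                       # committee unresponsive
chain.trigger_fallback("b")
second = chain.process("tx2")                        # chain keeps moving
chain.resume_committee_mode({"a", "b", "c"})
```

The point of the model is that an unresponsive committee leaves an update pending only until some member triggers the fallback; after that, progress no longer depends on the committee at all.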
3. AnyTrust Committees
The threshold of committee members required to come to consensus comes down to a tradeoff between liveness and safety. A well-known property of BFT consensus algorithms is that they require greater than 2/3 of committee members to be honest to remain safe.
Arbitrum Sidechains offer the option of operating under what we call an AnyTrust assumption; here, unanimous consent must be reached for a state update to occur, which means we’re only assuming that at least a single committee member is honest. In other words, even if 11 of a committee’s 12 members are dishonest and actively colluding with each other, a single honest member is enough to keep the rest from committing fraud. The downside here is that a single member is also enough to force the chain into the fallback mode discussed above; the chain doesn’t halt, it just goes into rollup mode.
Once we accept that we’re within the design space of trusting a fixed set of participants, the AnyTrust model minimizes trust given to the committee as much as possible.
Different off-chain protocols offer their own pros and cons in terms of their built-in trust requirements. This is a good thing, as it lets applications pick whatever tradeoff suits their particular needs. However, this doesn’t mean that all options are equally worth exploring; trusting a committee with data availability while simultaneously trying to limit its power represents an odd and, we argue, ultimately fruitless spot in the design space. Arbitrum Sidechains are optimized to take full advantage of the benefits that a committee offers, with no real material downsides compared to trusting the committee strictly for data availability. If you’re going to have your committee, you may as well use it too.
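The two trust assumptions can be compared with a few lines of arithmetic. This is a rough sketch with hypothetical function names; it models only the honesty thresholds and the unanimity rule described above, not an actual consensus protocol.

```python
def min_honest_bft(n):
    """Smallest honest count strictly greater than 2/3 of an n-member committee."""
    return (2 * n) // 3 + 1


def min_honest_anytrust(n):
    """Under AnyTrust, one honest member suffices regardless of n."""
    return 1


def anytrust_outcome(committee, signers):
    """Unanimous consent confirms the update; anything less forces the fallback."""
    if set(committee) <= set(signers):
        return "confirmed"
    return "fallback to rollup mode"


committee = {f"member{i}" for i in range(12)}
colluders = committee - {"member0"}  # 11 colluding members, 1 honest holdout

bft_needed = min_honest_bft(len(committee))
anytrust_needed = min_honest_anytrust(len(committee))
without_holdout = anytrust_outcome(committee, colluders)
with_everyone = anytrust_outcome(committee, committee)
```

For the 12-member committee in the example, the BFT-style threshold needs 9 honest members while AnyTrust needs 1, and the 11 colluders alone cannot confirm an update: the missing signature sends the chain to rollup mode instead.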