I don’t see how it’s plausible for parallel forks of the hash chain to be finalized concurrently. Concurrent blocks are not the same as Proof-of-Approval, which selects a winner block per slot from all candidates while slashing equivocation. There must be some subsequent block that warrants that the concurrent blocks didn’t contain any non-equivocation conflicts (i.e. neither w.r.t. any newly approved validators nor conflicts among any of the transactions). And likewise when there are conflicting transactions, a warrant is needed for which of the conflicting transactions is the valid one, or to declare both conflicting transactions null, because how can a validator choose between two valid transactions that conflict? But then these warrants can also be lies, which must be slashed (or never allowed to be admitted as a block?) if another block posts proof of the lie about a conflict. Yet we never know a concurrent block is final until we’ve checked all possible blocks of an epoch for all possible non-equivocation conflicts, and there is no way to ever prove we’ve seen all blocks from an epoch, because this would require trusting a centralized oracle (that declares when finality occurs) or being able to look forward infinitely into the future for all possible blocks that claim to be from that same epoch. Because you’re allowing validators to vote more than once concurrently (which is not how 100% final Byzantine agreement works) and then relying on the ability to penalize them for equivocation, this requires a total order, which is incompatible with the unbounded set of concurrent blocks.
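To make the asymmetry concrete, here is a minimal sketch (hypothetical structures and names, not the actual Casper/Slasher code): an equivocation is self-evidently provable from just two signed headers, whereas proving the *absence* of non-equivocation conflicts would require having seen every block of the epoch, which is the unbounded obligation described above.

```python
# Sketch only: an equivocation proof is two distinct blocks signed by the
# same validator for the same slot -- a compact, self-contained proof.
# Non-equivocation conflicts (e.g. conflicting transactions spread across
# concurrent blocks) have no such self-contained proof of absence.
from dataclasses import dataclass

@dataclass(frozen=True)
class SignedHeader:
    validator: str   # who signed the block
    slot: int        # the slot/epoch the block claims
    block_hash: str  # commitment to the block contents

def equivocation_proof(a: SignedHeader, b: SignedHeader) -> bool:
    """Two distinct headers, same validator, same slot -> slashable."""
    return (a.validator == b.validator
            and a.slot == b.slot
            and a.block_hash != b.block_hash)

def tx_conflict(block_a_txs: set, block_b_txs: set) -> set:
    """Conflicting spends appearing in two concurrent blocks. Unlike an
    equivocation proof, showing that NO such conflict exists anywhere in
    the epoch requires enumerating all blocks of the epoch."""
    return block_a_txs & block_b_txs
```

The point of the sketch is that slashing only punishes the compactly provable case; it cannot bound the search for conflicts across an unbounded set of concurrent blocks.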
Deterministic, bounded (not necessarily synchronous, because Byteball and Hashgraph have 100% finality and are 100% asynchronous) Byzantine agreement relies on consensus being finalized for each epoch before moving to the next one (so that there is some safety about a consensus reference point before moving forward). This is what guarantees the safety. If instead you allow an unbounded number of consensus agreements within the same epoch, then there’s no overall consensus (i.e. no reference point) with which to declare finality, and thus there’s no safety.
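The “reference point” requirement can be illustrated with a trivial sketch (illustrative only, my own invented class name): sequential finality means epoch n+1 can only anchor to a single finalized result for epoch n, whereas unbounded concurrent agreements within one epoch leave nothing to anchor to.

```python
# Illustrative sketch: each epoch's agreement is a single reference point,
# and finalization cannot advance past a gap. Allowing many unordered
# agreements within one epoch would remove this anchoring invariant.
class SequentialFinality:
    def __init__(self):
        self.finalized = []  # exactly one agreed block per epoch, in order

    def finalize(self, epoch: int, block: str):
        # advance only if the immediately preceding epoch is final
        if epoch != len(self.finalized):
            raise ValueError(
                "cannot finalize epoch %d before epoch %d"
                % (epoch, len(self.finalized)))
        self.finalized.append(block)
```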
The specific mistake in your math and formal model appears to be that you form the union of all such agreements and treat them as a holistic unit mathematically for computing the safety. See my link below for the detailed explanation of that error in your modeling. Whereas, for example, AFAICT nondeterministic, fully asynchronous BFT consensus systems require the limiting structure of a DAG or other equivalent reference point in order to achieve 100% probability of finality (within the liveness and safety BFT thresholds). David Mazières’ SCP consensus algorithm designed for Stellar apparently employs the structure of trust reputation as a reference point to achieve 100% probability of finality (within the liveness and safety BFT thresholds).
Also it’s been difficult for a sporadic observer to track the iterations of the Casper/Slasher research process (and also the proliferation of names of versions and subprojects for Ethereum), so perhaps I’ve missed something? I remember in the past there was supposed to be betting involved, but I didn’t see that mentioned in this blog.
EDIT: the concurrent block production could be employed analogously to §D. Optional Trust-but-Verify Validation in OmniLedger, as a temporary lower-latency confirmation that awaits the higher-latency, single-point-of-truth, total-order confirmation reference point for each epoch. This would require adding a master control proof-of-work or proof-of-stake layer analogous to OmniLedger’s, which validates the concurrent blocks and checks for the aforementioned conflicts. But note that would forsake scaling, because the concurrent blocks wouldn’t be trusted shards. Instead it would be superior for scaling to implement the entire OmniLedger protocol, so that each validator set for each shard is trusted and assigned securely randomized validators. OmniLedger is equivalent² to the design I completed conceptualizing by “early 2016”, although it has the following significant drawbacks compared to the improved design I had written down in a draft white paper by November 2016 (a private archive is stored at GitXXX as future proof):
- The minimum latency for OmniLedger will be several seconds for acceptable error rates at the current network presumptions in the decentralized case.
I’m working on an improved design (which doesn’t employ proof-of-work and isn’t purely proof-of-stake) that obtains instantaneous sub-second latency (~0.05 seconds), achieving the sort of interactivity we normally experience when interacting with a centralized database such as Facebook. Whereas the DPoS variant in EOS barely obtains 0.5-second latency, which requires forsaking anti-fragility and decentralization (and note that distributed isn’t the same as decentralized).
- OmniLedger periodically randomizes the validator set for each shard which creates some weakened security arguments (which plausibly may open a yet unknown vulnerability) and increases the costs.
- The liveness of OmniLedger is only valid up to a 33% adversarial threshold (unlike the 100% threshold for Bitcoin’s Nakamoto proof-of-work) and degrades so perversely above 25% that it isn’t even charted. Thus the confirmations of consensus can be stalled with less than 33% of the proof-of-work hashrate or proof-of-stake stake!
Nakamoto’s proof-of-work can be censored by an attacker possessing greater than 50% of the systemic hashrate, who mines blocks with no transactions and orphans all blocks from other miners. For both Nakamoto and the Byzcoin proof-of-work variant of OmniLedger, the attacker would incur proof-of-work costs while sustaining such a liveness attack yet wouldn’t collect any transaction fee income. However those costs would likely decline, as the difficulty would plummet when the minority miners stop mining due to the bankruptcy of all their blocks being orphaned (or, in the ~33–50% adversarial-hashrate scenario of OmniLedger, all transactions and transaction fees stalled but no proof-of-work blocks orphaned). Presumably the attacker could short-sell the ledger’s token while sustaining the attack for a profit. Thus such an attack is significantly more viable for an attacker of OmniLedger than for Nakamoto proof-of-work.
- OmniLedger is not anti-fragile against censorship or highly adaptive attackers, and it lacks thorough formal reasoning about the possible economic game-theory vulnerabilities (i.e. lacks complete vetting).
Attackers of PBFT shard consensus may have more to gain than they lose from the proof-of-work or proof-of-stake required to gain enrollment as a validator, especially due to externalities, which can even render any security deposits impotent.
I have added some more such vetting below in an informal exposition.
- Employing the proof-of-work variant for the master control of OmniLedger (given that OmniLedger is based on the transaction-fee-sharing economic and game-theory model of Byzcoin) is incentive-incompatible with decentralized consensus on the single longest chain. I had linked to some explanation of that allegation in 2016. Those who don’t defect (as described in that linked explanation) from the transaction-fee-sharing protocol of Byzcoin will earn less profit than those who do. Thus the only Nash equilibrium is for all miners to defect (because those who don’t will be bankrupted). So instead of converging on a single longest chain, Byzcoin (and consequently OmniLedger) will diverge into the chaos of failing to converge on a single longest chain, which will force an oligarchy of mining to take control so as to prevent such devolution. But such an oligarchy would be winner-take-all centralization, and would extract the maximum rents (e.g. in the form of maximizing transaction fees) that the market will bear. Yet this winner-take-all centralization, which is what Byzcoin attempts but fails to prevent, is always the case for any proof-of-work system (cf. also) even without the OmniLedger sharding features.
Additionally my understanding is that the Byzcoin proof-of-work variant for the master control of OmniLedger would require either choosing a limited number of validators (and thus a maximum limit on the number of shards, thus limited scaling) and/or the loss of Nash equilibrium (aka incentive compatibility), because validators would rarely be vested by having participated in generating a recent block solution. This limitation is distinct from the limitation on shard size in Elastico. Also the underlying Byzcoin model seems to be incompatible with pool mining, which as you know is required for minimizing individual miners’ variance costs and maximizing miners’ economies-of-scale (such as for validation). Thus the decentralization of Byzcoin mining would be minimal to none even at inception. Or at least the combination of OmniLedger with Byzcoin appears to be incompatible with pools, because the set of shard validators wouldn’t be the miners in the pools but rather the entire pool acting as a centralized entity.
- Employing a proof-of-stake variant (derived from Ouroboros and Algorand) for the master control of OmniLedger has the weaknesses shared by all proof-of-stake systems that could ever be conceived. That is, either there’s a nothing-at-stake problem which economically forces an oligarchy to fill the consensus power vacuum (e.g. DPoS), or else a majority (or is it ⅔?) of all stakeholders must always remain online in non-delegated Ouroboros, and additionally the synchrony of the network must remain within the chosen bounded thresholds of the stakeholders’ nodes.
Conceptualizing the flaw in Vitalik et al.’s proposed slashing model for concurrent block production (excluding what I wrote about OmniLedger) at the generative-essence abstraction: 100% probability of finality (as opposed to asymptotic, probabilistic¹ finality) is an approximation of reality and thus requires that epochs be (an approximation/illusion of) bounded w.r.t. all externalities. Unbounded participation (by concurrent unstructured blocks and/or validators) is not compatible with 100%-final BFT and only works with probabilistic, asymptotic finality such as proof-of-work. I wrote, “Parallelism is merely the requirement that non-instantaneous and non-ordered communication (of results) is acceptable.”
“if my Bitcoin client were to decide on the block 7 confirmations deep, it would have consensus safety with your client, if our clients used similar criteria to make decisions (with very high probability in a synchronous network with less than some amount of Byzantine hashrate).”
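The quoted “7 confirmations deep” criterion for probabilistic finality can be made concrete with the attacker-catch-up calculation from §11 of the Bitcoin whitepaper. The following is my own straightforward reimplementation of that published formula, not anyone’s production code:

```python
# Probability that an attacker with fraction q of the hashrate ever
# reverses a transaction buried under z confirmations (Bitcoin
# whitepaper, section 11, Gambler's Ruin + Poisson approximation).
from math import exp, factorial

def attacker_success(q: float, z: int) -> float:
    p = 1.0 - q          # honest fraction of hashrate
    if q >= p:
        return 1.0       # a majority attacker always catches up eventually
    lam = z * (q / p)    # expected attacker progress while z honest blocks arrive
    s = 1.0
    for k in range(z + 1):
        # attacker mined k blocks (Poisson); must still close a z - k gap,
        # which succeeds with probability (q/p)^(z-k)
        poisson = lam**k * exp(-lam) / factorial(k)
        s -= poisson * (1.0 - (q / p) ** (z - k))
    return s
```

With q = 0.1 the reversal probability at 7 confirmations is already below one in ten thousand, which is why similar confirmation-depth criteria yield consensus safety “with very high probability” rather than 100% finality.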
None of this slasher design makes any sense whatsoever. It’s as if it has been designed by people who have an incomplete or jumbled understanding of the subject matter. I’m shocked. I wasn’t expecting the flaws to be so obvious. Your formal methods must be misapplied. I’m sorry, but this design appears to me to be inept, unless I have missed some important aspect of it. Thus it appears to me that you guys really are not qualified to be doing this. Because after 3+ years of trying to design a proof-of-stake variant for Ethereum, I think it’s time for you to admit you’re not capable.
If I have some mistake in my analysis, please enlighten me. I’m doing peer review that I should have done last year, but I was too ill in 2017.
Being as shocked as I am that after years of working on this slasher design there could be such egregious errors in it, I am scratching my head trying to understand how that could be. I’m cautioned that I may have some error in my conceptualization, because surely these guys could not all have arrived at such errors? I know Vitalik is somewhat talented with math (probably more so than myself, because I’m many decades removed from employing higher math regularly, although I do dig back in occasionally with some effort). And he writes intriguing blogs. He ostensibly possesses abstract thinking and a high IQ. Presumably ditto Vlad and the other guys working on this. Byzantine fault tolerant consensus algorithms are strange beasts. Perhaps the problem here is that attempting to model BFT consensus mathematically, such as with process calculi, is an incomplete perspective, and attempting to achieve a radical breakthrough in the field of BFT may be out of the reach of some young guys who aren’t established experts in that field. Even formal methods such as process calculi are limited to what they model. IOW, process models are nearly always incomplete, as they must be, because how can a process model incorporate that universal entropy trends to maximum in an open thermodynamic system? I even read that one of the commenters on one of Vitalik’s blogs remarked that his math degree wasn’t helping him wrap his mind around these blogs. I think that is partly because Vitalik is lacking a holistic understanding. Thus his blogs on the slashing design have not been up to par with his past blogs on conceptual issues that he had wrapped his mind around more fully. I have read some astute insights in the past from Vitalik. So I’m allowing for the possibility that I have misunderstood or that I have some error. But I doubt that. I await Vitalik’s replies.
Plausible liveness basically means that “it should not be possible for the algorithm to get ‘stuck’ and not be able to finalize anything at all”.
I’m sorry, but my analysis concludes that this is incorrect.
¹ Which AFAICT you erroneously conflated with “synchronous”. Bitcoin has probabilistic (and idealistically but dubiously asymptotic) finality, which enables it to be partially asynchronous because it only requires the bounded synchrony of the block period. Whereas 100% probability of finality requires either a synchronous network or some other structure such as a DAG (both with a bounded set of validators), due to the famous FLP impossibility theorem. Andrew Poelstra explains this in footnote ¹ on pg. 1 of his paper on proof-of-stake.
² I had realized that to prevent shard takeover, I needed to periodically randomize the set of validators for each shard, but I hadn’t followed through on developing the randomization algorithm, because I had realized some of the other flaws, so I abandoned the design idea. Also I had essentially the exact same cross-shard algorithm as the Atomix algorithm in OmniLedger. That’s the logical way of accomplishing cross-shard transactions in a design that prevents shard takeover. In my early-2016 conceptualization, I didn’t incorporate the Ouroboros proof-of-stake variant for the master (aka root) consensus layer.