STARK Math: The Journey Begins
StarkWare’s mission is to bring STARKs to the real world. This is the first in a series of posts explaining the theory behind STARKs and our implementation of it. We start lightly and will ratchet up the math in subsequent posts.
Computational Integrity (CI) is a fundamental property that underlies commerce. In simple terms, it means that the output of a certain computation is correct. CI is what allows us to trust an account balance presented to us, or the bill at a store. This post will dive into how permissionless blockchains achieve CI without requiring trust, the dramatic price they pay for this in terms of scalability and privacy, and how STARKs can save the day.
II. Trust vs. Verification
“Old World”: Trust, or Delegated Accountability
Financial systems (banks, brokers, exchanges, etc.) need to operate with integrity to serve their societal functions. What mechanisms incentivize them to operate with integrity? The “old world” assumes trust as a proxy for integrity. We trust banks, pension funds, etc., to operate honestly. Let’s go down the rabbit hole and examine the basis for this trust, ignoring the “integrity theater” — the tall buildings and fancy suits — set to impress us. From a purely rational and utilitarian perspective, the thing that prevents the financial system from seizing all our funds is the threat of social disgrace, jail, and fines. There’s also a carrot — reputation, which attracts future customers and generates future profits. By signing on financial statements with their names, people in the “old world” stake their personal freedom, existing and future finances as a collateral for integrity, and we, the public, base our trust on this arrangement. The verification of this integrity is delegated to experts like accountants, auditors, and regulators. We will call this Delegated Accountability. It’s not a bad system: it’s been serving modern economies faithfully for quite some time.
A new variant of the “old world” approach is the Trusted Execution Environment (TEE). A trusted hardware manufacturer (like Intel) produces a physical machine (like the SGX chip) that cannot deviate from the specified computation and signs correct states using a secret key known only to that physical machine. Integrity is now based on trust in the hardware and its manufacturer and on the assumption that it is impossible to extract secret keys from such physical devices.
“New World”: Verify, or Inclusive Accountability
Blockchains offer a more direct way to reach integrity, captured by the motto “Don’t Trust, Verify”. This “new world” does not require an integrity theater, it doesn’t rely on accountants, nor do its developers and network maintainers stake their personal freedom to gain public trust. Integrity is guaranteed by Inclusive Accountability: a node with a standard computational setup (a web-connected laptop) should be able to verify the integrity of all transactions in the system.
The prevalent method to verify CI in permissionless blockchains is via naive replay: all nodes are asked to re-execute (replay) the computations that verify each and every transaction. Inclusive Accountability, in this naive form, leads to two immediate challenges:
- Privacy: If everyone gets to inspect all transactions then privacy might be compromised. The absence of privacy deters businesses, as it means sensitive information may not remain proprietary. It also deters individuals, as it erodes human dignity.
- Scalability: Demanding that the system be accountable to a standard laptop means it cannot scale up by simply moving to bigger computers and larger bandwidth. This leads to a severe bound on the throughput of the system.
Proofs systems (discussed next) are an excellent solution to both challenges. Zero Knowledge (ZK) proof systems are by now an established tool to address privacy in blockchains and explained excellently in several posts of Zcash [1, 2, 3]. Although ZK-STARKs offer Zero Knowledge, this post will not discuss it but focus on Scalability and Transparency (the S and T in STARK).
III. Proof Systems
Proof Systems started with the introduction of the Interactive Proof (IP) model by Goldwasser, Micali, and Rackoff in 1985. Interactive proofs are protocols that involve two kinds of entities: a prover and a verifier, who interact over a number of rounds by sending messages. The prover and verifier have conflicting objectives: the prover wants to convince the verifier of the integrity of a certain computation, and the verifier is a suspicious gatekeeper entrusted by the public with the task of distinguishing between truisms and falsities. The prover and verifier communicate interactively, taking turns in sending messages to one another. These messages depend on the statement being proved, on prior messages, and may also use some randomness. On the prover side, randomness is used to achieve zero knowledge and on the verifier side randomness is needed to generate queries to the prover. At the end of the interactive process the verifier outputs a decision, to either accept the new state or reject it.
A good analogy is the examination process practiced in a court of law when one party submits a claim and its counterparty questions its validity. For the claim to be accepted as true, the answers provided by the claimant (prover) to the examiner’s (verifier’s) queries must be consistent and valid. The examination process is expected to expose any mismatch between a statement and reality, and thus expose it as false.
We say that a proof system solves CI if when updating the system from state A to state B, the following properties hold:
- Completeness: If the prover indeed knows how to change the state from A to B in a valid way then the prover will manage to convince the verifier to accept the change.
- Soundness: If the prover doesn’t know how to change the state from A to B, then the verifier will notice an inconsistency in the interaction and reject the suggested state transition. There remains a tiny false-positive probability, i.e., a probability of the verifier accepting an invalid proof. This probability is a system security parameter which can be set to an acceptable level like 1/(2¹²⁸), similar odds to winning the powerball five times in a row.
This pair of properties has a crucial implication to the principle of Inclusive Accountability discussed earlier. The verifier can accept the state transition suggested by the prover without making any assumptions about the integrity of the prover. In fact, the prover can run on faulty hardware, it can be closed source and it can be executed on a computer controlled by a malicious entity. The only thing that matters¹ is that the messages sent by the prover lead the verifier to accept the statement. If that is the case, we know that computational integrity holds.
By now there are quite a few theoretical constructions of proof systems, along with implementations. Some are deployed in cryptocurrencies, like the SNARKs used by Zerocash/Zcash, and Bulletproofs (BP) deployed in Monero. (For general information on proof systems go here.) What distinguishes STARKs is the combination of the following three properties: scalability (the S in STARK), transparency (the T in STARK), and lean cryptography.
Scalability — Exponential Speedup of Verification
Scalability means that two efficiency properties hold simultaneously:
- Scalable Prover: The prover’s running time is “nearly-linear” in the time it would take a trusted computer to check CI by just re-executing the computation themselves and checking that the result matches what someone is claiming. The ratio of “overhead” (time needed to generate a proof / time needed to just run the computation) remains reasonably low.
- Scalable Verifier: The verifier’s running time is polynomial in the logarithm of naive replay time. In other words, the verifier’s runtime is exponentially smaller than simply replaying the computation (recall that ‘replay’ is the current blockchain method to achieve Inclusive Accountability).
Apply this notion of scalability to a blockchain. Instead of the current mode of verification by naive replay, imagine how things will look when a blockchain moves to verification by using proof systems. Instead of simply sending the transactions to be added to the blockchain, a prover node will need to generate a proof but thanks to the Scalable Prover its running time is nearly-linear in the running time of the naive replay solution. And the Scalable Verifier will benefit from an exponential decrease in its verification time. Furthermore, as blockchain throughput scales up, most of the effect will be shouldered by the prover nodes (which could run on dedicated hardware, like miners), whereas the verifiers, which would constitute most of the nodes in the network, would hardly be affected.
Let’s consider a concrete hypothetical example, assuming verifier time (in milliseconds) scales like the square of the logarithm of the number of transactions (tx). Suppose we start with 10,000 tx/block. Then the verifier’s running time is
VTime = (log₂ 10,000)² ~ (13.2)² ~ 177 ms
Now increase the blocksize a hundredfold (to 1,000,000 tx/block). The new running time of the verifier is
VTime = (log₂ 1,000,000)² ~ 20² ~ 400 ms
In words, a 100x increase in transaction throughput led only to a 2.25x increase in the verifier’s running time!
In some cases, the verifier will still need to download and verify availability of data, which is a linear-time process, but downloading data is generally much cheaper and faster than checking its validity.
Transparency: with Trust toward None, with Integrity for All
Transparency means² there is no trusted setup — there is no use of secrets in the setting up of the system. Transparency offers many benefits. It eliminates the parameter setup generation procedure which constitutes a single point of failure. The lack of a trusted setup allows even powerful entities — big corporations, monopolies and governments, which control the “old world” financial system — to prove CI and gain public acceptance of their claims because there’s no known way to forge STARK proofs of falsities, even by the most powerful of entities. On a more tactical level, it makes it much easier to deploy new tools and infrastructure and change existing ones without a need for elaborate parameter-generation ceremonies. Most importantly, transparency aligns well with the “new world” that demands Inclusive Accountability under no trust assumptions. To paraphrase Abraham Lincoln, transparent systems allow to operate with trust toward none, with integrity for all.
Lean Cryptography: Secure & Fast
STARK has minimal³ cryptographic assumptions underlying its security: the existence of secure cryptographic and collision-resistant hash functions. Many of these primitives exist today as hardware instructions, and the lean cryptography leads to two more benefits:
- Post-Quantum Security: STARKs are plausibly secure against efficient quantum computers.
- Concrete Efficiency: For a given computation, the STARK prover is at least 10x faster than both the SNARK and Bulletproofs prover. The STARK verifier is at least 2x faster than the SNARK verifier and more than 10x faster than the Bulletproof verifier. As StarkWare continues to optimize STARKs these ratios will likely improve. However, a STARK proof length is ~100x larger than the corresponding SNARK and ~20x larger than BulletProofs.
We started by explaining the “new world” of permissionless blockchains, in which the motto is “Don’t Trust, Verify”. The principle of Inclusive Accountability demands that the integrity of the financial system be verifiable readily by anyone, as opposed to the Delegated Accountability of the “old world”. To allow blockchains to scale we need methods that allow for verification of computational integrity that are faster than naive replay.
STARKs are a special kind of a proof system that offers scalability and transparency, and are based on lean cryptography. Their scalability and transparency allow inexpensive and trustless verification of the CI. This is the promise of STARKs and the mission of StarkWare.
Our next post will go a bit deeper into the mathematics of constructing STARKs.
Thanks to Vitalik Buterin and Arthur Breitman for reviewing drafts of this blog post.
Michael Riabzev & Eli Ben-Sasson
¹Privacy preservation (ZK) does make demands on the prover code, to ensure prover messages do not leak information about the witness via side channels. But soundness requires no trust assumptions.
²Formally, a transparent proof system in one in which all verifier messages are public random strings. Such systems are also known as Arthur-Merlin protocols.
³This minimality of cryptographic assumptions holds for interactive STARKs (iSTARKs). Noninteractive STARKs (nSTARKs) require the Fiat-Shamir heuristic which is a different beast.