StarkDEX Deep Dive: the STARK Core Engine

Part 3 of 4

StarkWare

Please note: this blog post contains a lot of mathematical notation. As not all mobile phones render it fully, we recommend reading it on a desktop.

Introduction

This is the third part of our StarkDEX Deep Dive, where we describe the generic STARK proof system, of which the StarkDEX implementation is one instance. If you haven’t already done so and want to understand the full picture of StarkDEX, we recommend reading parts 1 and 2, but you can also read this post as a stand-alone. We also recommend you read our STARK Math series for a deeper discussion of the mathematical constructs required to understand this post.

In this post we describe the STARK proof system as an interactive protocol between two parties, the prover and the verifier. The prover sends a series of messages in an attempt to convince the verifier that a certain computation on some input was done correctly (the proven statement). The verifier responds to the messages with random values. Eventually, the verifier accepts the proof if the computation has been done correctly, or rejects it (with high probability) if this is not the case.

While we describe the system as an interactive protocol, note that in the Alpha version of StarkDEX this interactivity is eventually removed by a transformation to a non-interactive system, wherein the prover provides a proof and the verifier decides whether to accept or reject it.

STARK Proof Systems

If you’d like to refresh your understanding of execution traces and polynomial constraints, take a look at our Arithmetization I & II posts from the Stark Math series.

The Trace

An execution trace of a computation, or the trace in short, is a sequence of machine states, one per clock cycle. If the computation requires w registers and lasts for N cycles, the execution trace can be represented by a table of size N x w. Given a statement regarding the correctness of a computation, the prover first computes a trace.

Denote the columns of the trace by f₁, f₂, …, f_w. Each fⱼ is of length N=2ⁿ, for some fixed n. The values in the trace cells are elements of a big field F. In our Alpha version, F is a 252-bit prime field. The trace evaluation domain is defined as ⟨g⟩, where g is a generator of a subgroup of order N of the multiplicative group of F. Effectively, we enumerate the trace rows by the elements of ⟨g⟩. Each trace column is interpreted as N pointwise evaluations of a polynomial of degree smaller than N over the trace evaluation domain. These polynomials are referred to as the trace column polynomials, or column polynomials for short.
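To make this concrete, here is a minimal sketch in Python over a toy prime field (p = 257 rather than the 252-bit field of the Alpha version): the rows are enumerated by the powers of a generator g of a subgroup of order N, and a trace column is simply the list of N evaluations of some polynomial of degree less than N at those points. All names and parameters below are illustrative and not taken from the StarkDEX code.

```python
# Toy sketch: a trace column as evaluations over <g>.  All parameters here are
# illustrative (the Alpha version uses a 252-bit prime field, not P = 257).
P = 257                                   # toy prime field; P - 1 = 2**8
N = 8                                     # number of trace rows (a power of 2)
g = pow(3, (P - 1) // N, P)               # generator of the subgroup of order N
assert pow(g, N, P) == 1 and pow(g, N // 2, P) != 1

trace_domain = [pow(g, i, P) for i in range(N)]       # rows enumerated by g**i

def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) over the toy field."""
    return sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P

# A column polynomial of degree < N (arbitrary coefficients) and the trace
# column itself: its N pointwise evaluations over the trace evaluation domain.
column_poly = [5, 1, 0, 3, 0, 0, 2, 4]
column = [poly_eval(column_poly, x) for x in trace_domain]
print(column)
```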

Trace Low Degree Extension and Commitment

As mentioned above, each trace column is viewed as N evaluations of a polynomial of degree less than N. In order to achieve a secure protocol, each such polynomial is evaluated over a larger domain, which we call the evaluation domain. For example, in our StarkDEX Alpha the size of the evaluation domain is usually 16*N. We refer to this evaluation as the trace Low Degree Extension (LDE) and the ratio between the size of the evaluation domain and the trace domain as the blowup factor (those familiar with coding theory notation will notice that the blowup factor is 1/rate and the LDE is in fact simply a Reed-Solomon code of the trace).
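Continuing in the same toy setting, the following sketch extends one column from the trace domain of size N to an evaluation domain that is 4 times larger (the Alpha version typically uses a blowup factor of 16): it interpolates the column polynomial from its N values and re-evaluates it over a coset of a larger subgroup. The interpolation helper and the coset offset are illustrative choices.

```python
# Toy low degree extension (LDE) of one trace column.  Illustrative parameters:
# the blowup factor here is 4, while StarkDEX Alpha typically uses 16.
P = 257
N, BLOWUP = 8, 4
g = pow(3, (P - 1) // N, P)                  # order-N generator (trace domain)
w = pow(3, (P - 1) // (N * BLOWUP), P)       # order-(N * BLOWUP) generator
h = 3                                        # coset offset, keeping the domains apart

def inv(a):
    return pow(a, P - 2, P)

def interp_eval(xs, ys, x):
    """Evaluate the unique degree < len(xs) interpolant of (xs, ys) at x (Lagrange)."""
    acc = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num = den = 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        acc = (acc + yi * num * inv(den)) % P
    return acc

trace_domain = [pow(g, i, P) for i in range(N)]
column = [17, 5, 42, 0, 9, 111, 200, 3]                   # some trace column values

# The evaluation domain is a coset of a subgroup that is BLOWUP times larger;
# the LDE is the column polynomial re-evaluated over all of it.
eval_domain = [h * pow(w, i, P) % P for i in range(N * BLOWUP)]
lde = [interp_eval(trace_domain, column, x) for x in eval_domain]
print(len(lde), "evaluations of a polynomial of degree <", N)
```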

Following the generation of the trace LDE, the prover commits to it. Throughout the system, commitments are implemented by building a Merkle Tree over the series of field elements and sending the Merkle root to the verifier. The latter involves a few engineering considerations regarding field element serialization and hash function selection, which are ignored here for the sake of clarity.

The leaves of the Merkle Tree are selected such that if a decommitment is likely to involve multiple field elements together, they are grouped into a single Merkle leaf. In the case of the trace LDE, this implies we group all field elements in the trace LDE “row” (the evaluation of all the column polynomials at a point gⁱ) into a single Merkle leaf.
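The row-per-leaf commitment can be sketched with Python's hashlib as follows; the leaf serialization and the choice of SHA-256 are simplifications made for the example, not the production choices.

```python
# Minimal Merkle commitment over LDE rows.  The serialization and the choice of
# hash function are simplified here and are not the ones used in production.
import hashlib

def hash_leaf(row):
    # Group all field elements of one LDE row into a single leaf.
    data = b"".join(v.to_bytes(32, "big") for v in row)
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    # Assumes the number of leaves is a power of two.
    layer = leaves
    while len(layer) > 1:
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]

# rows[i] holds the values of all trace columns at the i-th evaluation-domain point.
rows = [[i, 2 * i % 257, (i * i + 1) % 257] for i in range(16)]    # toy data
commitment = merkle_root([hash_leaf(r) for r in rows])
print(commitment.hex())
```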

The Constraints

An execution trace is valid if (1) certain boundary constraints hold and (2) each pair of consecutive states satisfies the constraints dictated by the computation. For example, if at time t the computation should add the contents of the 1st and 2nd registers and place the result in the 3rd register at time t+1, the relevant constraint would be Trace[t,1]+Trace[t,2]-Trace[t+1,3] = 0.

The constraints are expressed as polynomials in the trace cells that are satisfied if and only if the computation is correct. Hence, they are referred to as the Algebraic Intermediate Representation (AIR) Polynomial Constraints on the trace. For example, in the context of proving the correct execution of a transaction batch in a Decentralized Exchange (DEX), the constraints are such that they all hold if and only if the batch of transactions is valid.

Let’s examine some examples of different constraints over the trace cells.

  1. f₂(x) - a = 0 for x = g⁷ (the value in column 2 at row g⁷ is equal to a).
  2. f₆²(x) - f₆(g³x) = 0 for all x (the squared value in each row in column 6 is equal to the value three rows ahead).
  3. f₃(x) - f₄(gx) = 0 for all x but g⁵ (the value in each row in column 3 is equal to the value in column 4 one row ahead, in all rows but row g⁵).
  4. f₂(x) - 2·f₅(x) = 0 for all x in an odd row (the value of each odd row in column 2 is equal to twice the value of the same odd row in column 5).

Recall that our goal is to prove the correctness of a computation. By writing a set of polynomial constraints which are satisfied if and only if the computation is valid, we reduce the original problem to proving that the polynomial constraints are satisfied.

From Polynomial Constraints to Low Degree Testing Problem

Next, we represent each constraint as a rational function. Constraint (1) above is translated into

(f₂(x) - a) / (x - g⁷)

This is a polynomial of degree at most deg(f₂)-1 if and only if constraint (1) holds (see the Arithmetization I post for more details). To understand the conversion of constraints (2), (3) and (4), recall first that the set {x∈F | xᴺ-1=0} is exactly the trace evaluation domain ⟨g⟩, i.e., the subgroup of order N generated by g. Hence, we get the following reductions:

  • Constraint (2) holds if and only if the rational function (f₆²(x) - f₆(g³x)) / (xᴺ - 1) is a polynomial of degree at most 2·deg(f₆) - N.
  • Constraint (3) holds if and only if the rational function (f₃(x) - f₄(gx))·(x - g⁵) / (xᴺ - 1) is a polynomial of degree at most max{deg(f₃), deg(f₄)} - N + 1.
  • Constraint (4) holds if and only if the rational function (f₂(x) - 2·f₅(x)) / (x^(N/2) + 1) is a polynomial of degree at most max{deg(f₂), deg(f₅)} - N/2.

Represented as rational functions, the constraints are such that each numerator defines the rule to be enforced over the trace cells, and each denominator defines the domain over which the corresponding rule should hold. Since the verifier needs to evaluate these rational functions, it is important for the succinctness of the STARK protocol that the domains be such that their corresponding denominators can be evaluated efficiently, as in the four examples above.
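As a small illustration of this reduction, the sketch below divides f(x) - a by x - c over a toy field: the division leaves no remainder exactly when the boundary constraint f(c) = a holds, and the quotient has degree deg(f) - 1. The division helper and all parameters are hypothetical.

```python
# Boundary-constraint quotient over a toy field: (f(x) - a) / (x - c) divides
# cleanly exactly when f(c) = a, and the quotient has degree deg(f) - 1.
P = 257

def poly_eval(cs, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(cs)) % P

def divide_by_linear(cs, c):
    """Synthetic division of sum(cs[k] * x**k) by (x - c); returns (quotient, remainder)."""
    n = len(cs) - 1
    q = [0] * n
    q[n - 1] = cs[n] % P
    for k in range(n - 1, 0, -1):
        q[k - 1] = (cs[k] + c * q[k]) % P
    rem = (cs[0] + c * q[0]) % P
    return q, rem

f = [5, 1, 0, 3, 0, 0, 2, 4]            # a column polynomial of degree 7
c = 64                                  # the row at which the constraint is enforced
a = poly_eval(f, c)                     # boundary value that really equals f(c)

numerator = [(f[0] - a) % P] + f[1:]    # the numerator f(x) - a
quotient, rem = divide_by_linear(numerator, c)
print(rem, len(quotient) - 1)           # remainder 0, quotient degree deg(f) - 1 = 6

# If the claimed value is wrong, the division leaves a nonzero remainder,
# i.e. the "quotient" is not a polynomial and the low degree test will fail.
_, bad_rem = divide_by_linear([(f[0] - (a + 1)) % P] + f[1:], c)
print(bad_rem != 0)
```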

Composition Polynomial

In order to efficiently prove the validity of the execution trace, we strive to achieve the following three goals:

  1. Compose the constraints on top of the trace polynomials to enforce them on the trace.
  2. Adjust the degrees of the constraints to a single degree so that they can be combined into a single polynomial of that degree.
  3. Combine the constraints into a single (larger) polynomial, called the Composition Polynomial, so a single low degree test can be used to attest to their (low) degree.

In this section we describe how the above is performed.

Degree Adjustment

In order to ensure soundness, we need to show that all individual constraints composed with the trace column polynomials are of low degree. For this reason, we adjust the degree of the constraints to the highest degree D of all the constraints (and if D isn’t a power of 2, up to the nearest power of 2 greater than D). As a result, this also becomes the degree of the Composition Polynomial.

Degree adjustment is performed as follows:

Given a constraint Cⱼ(x) of degree Dⱼ, we add to the composition polynomial a term of the form

Cⱼ(x)·(𝛼ⱼ·x^(D-Dⱼ) + βⱼ)

where 𝛼ⱼ and βⱼ are random field elements sent by the verifier. As a result, if the Composition Polynomial is of low degree, it automatically follows that each individual constraint is of the desired low degree Dⱼ. The result is a Composition Polynomial of the form:

H(x) = Σⱼ Cⱼ(x)·(𝛼ⱼ·x^(D-Dⱼ) + βⱼ)

where the sum runs over j = 1, …, k and k is the number of constraints.
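The sketch below shows this combination pointwise: given the evaluations of each constraint quotient Cⱼ over the evaluation domain, their degrees Dⱼ, the target degree D, and the verifier's random pairs (𝛼ⱼ, βⱼ), the prover builds the Composition Polynomial one domain point at a time. The function signature and toy inputs are illustrative.

```python
# Pointwise combination of constraint quotients into the Composition Polynomial.
# Illustrative sketch: the constraint evaluations, their degrees and the verifier
# randomness are simply passed in as arguments.
P = 257

def compose(eval_domain, constraint_evals, constraint_degrees, D, alphas, betas):
    """constraint_evals[j][i] is C_j evaluated at eval_domain[i]; returns H evaluations."""
    H = []
    for i, x in enumerate(eval_domain):
        acc = 0
        for j, Cj in enumerate(constraint_evals):
            adjust = (alphas[j] * pow(x, D - constraint_degrees[j], P) + betas[j]) % P
            acc = (acc + Cj[i] * adjust) % P
        H.append(acc)
    return H   # H(x) = sum_j C_j(x) * (alpha_j * x**(D - D_j) + beta_j), pointwise

domain = [1, 2, 3, 4]                                 # stand-in evaluation domain
evals = [[5, 6, 7, 8], [1, 0, 2, 3]]                  # two constraint quotients
H = compose(domain, evals, constraint_degrees=[3, 6], D=8,
            alphas=[11, 13], betas=[17, 19])
print(H)
```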

Combining the Constraints

Once the prover has committed to the trace LDE, the verifier provides random coefficients for creating a random linear combination of the constraints resulting in the Composition Polynomial. Instead of checking the consistency of each constraint individually over trace elements, it suffices to apply a low degree test to the Composition Polynomial which represents all the constraints applied at once to all the trace columns.

Since the constraints used in our Alpha version are of degree two or below, the degree of the Composition Polynomial is 2N. Hence, we can represent the Composition Polynomial H(x) as a single column of evaluations of length 2N or, as we prefer to do, as two columns H₁(x) and H₂(x) of length N, where H(x)=H₁(x²)+xH₂(x²).
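The split of H into H₁ and H₂ is just an even/odd split of its coefficients, which the following toy sketch verifies at a few sample points.

```python
# Even/odd split of the Composition Polynomial: H(x) = H1(x**2) + x * H2(x**2).
# Toy coefficients, purely illustrative.
P = 257

def poly_eval(cs, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(cs)) % P

H_coeffs = [7, 3, 0, 5, 9, 1, 4, 2]      # degree < 2N with N = 4
H1 = H_coeffs[0::2]                      # even-index coefficients -> a column of length N
H2 = H_coeffs[1::2]                      # odd-index coefficients  -> a column of length N

for x in (2, 10, 123):                   # spot-check the identity at a few points
    lhs = poly_eval(H_coeffs, x)
    rhs = (poly_eval(H1, pow(x, 2, P)) + x * poly_eval(H2, pow(x, 2, P))) % P
    assert lhs == rhs
print("H(x) = H1(x^2) + x * H2(x^2) holds at the sampled points")
```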

Committing to the Composition Polynomial

Next, the prover performs yet another Low Degree Extension of the two Composition Polynomial columns representing H₁ and H₂. As these columns are of the same length N as the trace columns, we sometimes refer to them as the Composition Polynomial Trace and we address extending and committing to them in the same manner as with the execution trace. This is done by extending them by the same blowup factor, grouping the rows (of field element pairs) into leaves of a Merkle Tree, calculating the hash values and sending the root of the tree as the commitment.

Consistency Check on a Random Point (the DEEP Method)

Note that the value of H(z) for a given point (field element) z can be obtained in two ways: from H₁(z²) and H₂(z²), or by computing the above-mentioned linear combination of constraints. The latter calculation induces a set of trace-column cells whose values are needed in order to complete the computation. The set of points required to calculate H at a single point is called a mask. Hence, given a point x=z, we can check the consistency between H and the trace if we are given the values of H₁(z²) and H₂(z²) and the values of the induced mask on the trace.

At this phase, the verifier sends a randomly sampled point z∊F. The prover sends back the evaluations of H₁(z²) and H₂(z²), along with the evaluations of the relevant mask elements required for calculating H(z). Denote this set of values sent by the prover by {yⱼ}. For an honest prover, the value of each yⱼ equals fₖ(z·gˢ), where k is the column of the corresponding cell and s is its row offset. The verifier may then calculate H(z) in two ways: based on H₁ and H₂ (using H(z)=H₁(z²)+z·H₂(z²)), and based on the mask values yⱼ. It then verifies that the two results are identical. It remains to show that the values sent by the prover in this phase are correct (i.e., indeed equal to the mask values at the point z), which is done in the next section. This method of checking consistency between two polynomials by sampling a random point from a large domain is called Domain Extension for Eliminating Pretenders (DEEP). The interested reader can find more details in the DEEP-FRI paper.

DEEP Composition Polynomial

To verify that the DEEP values sent by the prover are correct, we create a second set of constraints and then translate it to a problem of low degree testing similarly to the composition polynomial defined above.

For each value yⱼ we define the following constraint:

(fₖ(x) - yⱼ) / (x - z·gˢ)

This rational function is a polynomial of degree deg(fₖ)-1 if and only if fₖ(z·gˢ) = yⱼ. Denote the size of the mask by M. The verifier samples M+2 random field elements {𝛼ⱼ}ⱼ. We refer to the polynomial that is the subject of the low degree test as the DEEP Composition Polynomial. It is defined as follows:

Σⱼ 𝛼ⱼ·(fₖ(x) - yⱼ)/(x - z·gˢ) + 𝛼ₘ₊₁·(H₁(x) - H₁(z²))/(x - z²) + 𝛼ₘ₊₂·(H₂(x) - H₂(z²))/(x - z²)

where the first sum runs over the M mask values (note that there, k and z·gˢ are of course not constants but depend on j). This is a (random) linear combination of constraints of the form:

(f(x) - y) / (x - w)

where f is either a trace column polynomial or one of the H₁/H₂ polynomials, w is a known point, and y is the value the prover claims for f(w). Thus, proving that this linear combination is of low degree implies that the trace column polynomials and the H₁ and H₂ polynomials are of low degree, and that the DEEP values are correct.
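Assuming the structure described above, a prover-side assembly of the DEEP Composition Polynomial at a single evaluation point might look like the sketch below; the data layout, argument names and toy values are all illustrative.

```python
# Pointwise assembly of the DEEP Composition Polynomial (illustrative data layout).
P = 257

def inv(a):
    return pow(a, P - 2, P)

def deep_composition(x, f_at_x, mask, ys, h1_at_x, h2_at_x, h1_z2, h2_z2, z, g, alphas):
    """Evaluate the DEEP combination at one point x of the evaluation domain.

    f_at_x[k] -- value of trace column polynomial f_k at x
    mask      -- list of (k, s) pairs; ys[j] is the claimed value f_k(z * g**s)
    alphas    -- the M + 2 random field elements sent by the verifier
    """
    acc, z2 = 0, pow(z, 2, P)
    for j, (k, s) in enumerate(mask):
        acc = (acc + alphas[j] * (f_at_x[k] - ys[j]) * inv((x - z * pow(g, s, P)) % P)) % P
    acc = (acc + alphas[len(mask)] * (h1_at_x - h1_z2) * inv((x - z2) % P)) % P
    acc = (acc + alphas[len(mask) + 1] * (h2_at_x - h2_z2) * inv((x - z2) % P)) % P
    return acc

# Toy usage (values chosen arbitrarily; x must differ from z * g**s and from z**2).
value = deep_composition(x=9, f_at_x=[100, 42], mask=[(0, 0), (1, 3)], ys=[55, 66],
                         h1_at_x=12, h2_at_x=34, h1_z2=77, h2_z2=88,
                         z=5, g=64, alphas=[1, 2, 3, 4])
print(value)
```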

Low Degree Testing

The FRI Protocol for Low Degree Testing

For low degree testing, we use an optimized variant of a protocol known as FRI (which stands for Fast Reed-Solomon Interactive Oracle Proof of Proximity), and the interested reader can read more about it here. The optimizations used are described in a separate section below.

As described in our blog post on the subject, the protocol consists of two phases: a commit phase and a query phase.

Commit Phase

In the basic FRI version, the prover splits the original DEEP Composition Polynomial, denoted here as p₀(x) and of degree less than d, into two polynomials of degree less than d/2, g₀(x) and h₀(x), satisfying p₀(x)=g₀(x²)+x·h₀(x²). The verifier samples a random value 𝛼₀, sends it to the prover, and asks the prover to commit to the polynomial p₁(x)=g₀(x) + 𝛼₀·h₀(x). Note that p₁(x) is of degree less than d/2.

We then continue recursively by splitting p₁(x) into g₁(x) and h₁(x), then choosing a value 𝛼₁, constructing p₂(x) and so on. Each time, the degree of the polynomial is halved. Hence, after log(d) steps we are left with a constant polynomial, and the prover can simply send the constant value to the verifier.
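A single FRI folding step can be sketched on the evaluation side as follows: given the evaluations of a layer over a domain L that is closed under negation, the next layer's evaluations over L² are computed pair by pair from p(z) and p(-z). The field, domain and coefficients below are toy values.

```python
# One FRI folding step on evaluations, with toy parameters.  Using
# p(z) = g(z**2) + z * h(z**2), the next layer p_next = g + alpha * h is obtained
# over L**2 from the pairs p(z), p(-z).
P = 257

def inv(a):
    return pow(a, P - 2, P)

def poly_eval(cs, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(cs)) % P

def fri_fold(domain, evals, alpha):
    """domain must be closed under negation; returns (domain squared, folded evals)."""
    value_at = {x: v for x, v in zip(domain, evals)}
    new_domain, new_evals, seen = [], [], set()
    for z in domain:
        z2 = pow(z, 2, P)
        if z2 in seen:
            continue
        seen.add(z2)
        pz, pmz = value_at[z], value_at[(-z) % P]
        g_z2 = (pz + pmz) * inv(2) % P              # g(z**2)
        h_z2 = (pz - pmz) * inv(2 * z % P) % P      # h(z**2)
        new_domain.append(z2)
        new_evals.append((g_z2 + alpha * h_z2) % P)
    return new_domain, new_evals

N = 8
gen = pow(3, (P - 1) // N, P)                       # subgroup of order 8, closed under -1
L = [pow(gen, i, P) for i in range(N)]
p0 = [poly_eval([3, 1, 4, 1, 5, 9, 2, 6], x) for x in L]   # a degree-7 layer
L2, p1 = fri_fold(L, p0, alpha=42)                  # p1 has degree 3 over L**2
print(len(L2), p1)
```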

As may be expected by the reader, all the above commitments on polynomial evaluations are made by using the Merkle commitment scheme as described above.

For the above protocol to work, we need the property that for every z in the domain L, it holds that -z is also in L. Moreover, the commitment on p₁(x) will not be over L but over L²={x²: x ∊ L}. Since we iteratively apply the FRI step, L² will also have to satisfy the {z, -z} property, and so on. These algebraic requirements are satisfied via our choice to use a multiplicative coset as our evaluation domain.

Query Phase

We now have to check that the prover did not cheat. The verifier samples a random z in L and queries p₀(z) and p₀(-z). These two values suffice to determine the values of g₀(z²) and h₀(z²), as can be seen from the following two linear equations in the two “variables” g₀(z²) and h₀(z²):

p₀(z) = g₀(z²) + z·h₀(z²)
p₀(-z) = g₀(z²) - z·h₀(z²)

The verifier can solve this system of equations and deduce the values of g₀(z²) and h₀(z²). It follows that it can compute the value of p₁(z²), which is a linear combination of the two. Now the verifier queries p₁(z²) and makes sure that it is equal to the value computed above. This serves as an indication that the commitment to p₁(x), which was sent by the prover in the commit phase, is indeed the correct one. The verifier may continue by querying p₁(-z²) (recall that (-z²)∊L² and that the commitment on p₁(x) was given over L²) and deducing from it p₂(z⁴).

The verifier continues in this way until it uses all these queries to finally deduce the value of the last layer polynomial pₕ, for h=log(d). But recall that pₕ is a constant polynomial whose constant value was sent by the prover in the commit phase, prior to choosing z. The verifier checks that the value sent by the prover is indeed equal to the value it computed from the queries to the previous functions.
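The verifier-side check of one FRI step can be sketched in the same spirit: solve the two linear equations for g₀(z²) and h₀(z²), recombine them with 𝛼₀, and compare against the decommitted value of the next layer. The toy usage at the bottom simulates an honest prover.

```python
# Verifier-side consistency check for a single FRI step (illustrative sketch).
P = 257

def inv(a):
    return pow(a, P - 2, P)

def poly_eval(cs, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(cs)) % P

def check_fri_step(z, p_z, p_minus_z, alpha, next_layer_value):
    """Solve the two linear equations for g(z**2) and h(z**2), then compare
    g + alpha * h against the decommitted value of the next layer at z**2."""
    g_z2 = (p_z + p_minus_z) * inv(2) % P
    h_z2 = (p_z - p_minus_z) * inv(2 * z % P) % P
    return (g_z2 + alpha * h_z2) % P == next_layer_value % P

# Toy end-to-end check with an honest prover.
p0 = [3, 1, 4, 1, 5, 9, 2, 6]                         # p_0 of degree 7
alpha, z = 42, 23
g_coeffs, h_coeffs = p0[0::2], p0[1::2]               # p_0(x) = g(x**2) + x * h(x**2)
p1 = [(gc + alpha * hc) % P for gc, hc in zip(g_coeffs, h_coeffs)]
assert check_fri_step(z, poly_eval(p0, z), poly_eval(p0, (-z) % P), alpha,
                      poly_eval(p1, pow(z, 2, P)))
print("query check passed")
```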

All query responses received by the verifier also need to be checked for consistency with the Merkle commitments sent by the prover during the commit phase. Hence, the prover sends decommitment information (Merkle paths) together with these responses to allow the verifier to enforce this.

In addition, the verifier must also ensure consistency between the p₀ polynomial and the original traces it was derived from (the fⱼ's and Hⱼ's). For this, for one of the two values p₀(z) and p₀(-z) queried in the query phase, the prover also sends the values of the fⱼ's and Hⱼ's induced by the DEEP Composition Polynomial, together with their decommitments. The verifier then checks the consistency of these values with the commitments on the traces, calculates the value of p₀ and checks its consistency with the value p₀(z) sent by the prover.

In order to achieve the required soundness of the protocol, the query phase is repeated multiple times. For a blowup factor of 2ᵇ, each query contributes roughly b bits of security.

Transformation to Non-Interactive Protocol (Fiat-Shamir)

So far, we have described the protocol as an interactive one. In our Alpha version we transform the interactive protocol into a non-interactive version, such that the prover generates a proof in the form of a file (or an equivalent binary representation) and the verifier receives it and verifies its correctness (on-chain).

The removal of interactivity is obtained via the BCS construction, a method that combines an Interactive Oracle Proof (IOP), such as the protocol described above, with a cryptographic hash function in order to obtain a cryptographic proof. For some background about this mechanism see our blog post.

The fundamental idea behind this construction is that the prover simulates receiving the randomness from the verifier. This is done by extracting randomness from a hash function that is applied to prior data sent by the prover (and appended to the proof). Such a construction is frequently referred to as a hash chain. In our Alpha version, the hash chain is initialized by hashing the public input known to both prover and verifier. Further technical details are omitted for the sake of simplicity.
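A minimal hash-chain sketch is shown below, assuming SHA-256 and a naive serialization; the real construction pins down serialization, domain separation and the hash function, which are all simplified here.

```python
# Minimal hash-chain sketch for removing interactivity.  Serialization, domain
# separation and the hash function are all simplified relative to production code.
import hashlib

class HashChain:
    def __init__(self, public_input: bytes):
        self.state = hashlib.sha256(public_input).digest()   # seeded by the public input
        self.counter = 0

    def absorb(self, prover_message: bytes):
        # Mix each prover message (e.g. a Merkle root) into the chain state.
        self.state = hashlib.sha256(self.state + prover_message).digest()
        self.counter = 0

    def random_field_element(self, modulus: int) -> int:
        # Derive "verifier randomness" deterministically from the current state.
        out = hashlib.sha256(self.state + self.counter.to_bytes(8, "big")).digest()
        self.counter += 1
        return int.from_bytes(out, "big") % modulus

TOY_PRIME = 257                                       # placeholder, not the real field
chain = HashChain(b"public input describing the batch")
chain.absorb(b"merkle root of the trace LDE")
print(chain.random_field_element(TOY_PRIME))
```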

Proof Length Optimizations

In addition to the description of the protocol above, we employ several optimization techniques in order to reduce the proof size. These techniques are described in the following subsections.

FRI Layer Skipping

Instead of committing on each of the FRI layers in the commitment phase of the FRI protocol, the prover can skip layers and commit only on a subset of them. In doing so, the prover saves the creation of some Merkle trees, which means that in the query phase it has fewer decommitment paths to send to the verifier. There is a trade-off, though. If, for example, the prover commits only on every third layer, then in order to answer a query it needs to decommit on 8 elements of the first layer (instead of the 2 it sends in the standard case). The prover takes this fact into consideration in the commitment phase: it packs neighboring elements together in each leaf of the Merkle tree. Thus, the cost of skipping layers is sending more field elements, but not more authentication paths.

FRI Last Layer

Another FRI optimization used to reduce the proof size is to terminate the FRI protocol before the last layer reaches a constant value. In that case, instead of sending only the constant value of the last layer, the prover sends the coefficients of the polynomial representing the last layer. This allows the verifier to complete the protocol as before, without the need for further commitments (and for sending decommitments of field elements in the smaller last layers). In our Alpha version, the FRI commit phase typically terminates when it reaches a polynomial pⱼ of degree less than 64. The exact number may vary depending on the trace size.

Grinding

As mentioned above, every query adds a certain number of bits to the security (soundness) of the proof. However, it also implies sending more decommitments by the prover, which increases the proof size. One mechanism to reduce the need for many queries is to increase the cost of generating a false proof by a malicious prover. We achieve this by adding to the above protocol a requirement that, following all the commitments made by the prover, it also find a 256-bit nonce that, hashed together with the state of the hash chain, results in a value matching a predefined template of a required length. The length of the template defines a certain amount of work that the prover must perform before generating the randomness representing the queries. As a result, a malicious prover that attempts to generate favorable queries will need to repeat the grinding process every time a commitment is changed. On the other hand, an honest prover only needs to perform the grinding once.

In our Alpha version we use a template that encodes the number of bits for which grinding has been performed, followed by a fixed pattern. This is similar to the grinding performed on many blockchains. The nonce found by the prover is sent to the verifier as part of the proof, and in turn the verifier checks its consistency with the state of the hash chain by running the hash function once.
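The grinding step can be sketched as a simple search for a nonce whose hash with the current chain state meets the work requirement; the "leading zero bits" template used below is a simplification of the actual template format, and the work factor is kept small so the demo runs quickly.

```python
# Grinding sketch: the prover searches for a nonce whose hash with the chain state
# meets the work requirement; the verifier re-checks it with a single hash.
# The "leading zero bits" template below is a simplification of the real format.
import hashlib
import itertools

def grind(chain_state: bytes, work_bits: int) -> int:
    bound = 1 << (256 - work_bits)                    # digest must fall below this bound
    for nonce in itertools.count():
        digest = hashlib.sha256(chain_state + nonce.to_bytes(32, "big")).digest()
        if int.from_bytes(digest, "big") < bound:
            return nonce

def verify_grinding(chain_state: bytes, nonce: int, work_bits: int) -> bool:
    digest = hashlib.sha256(chain_state + nonce.to_bytes(32, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - work_bits))

state = hashlib.sha256(b"all prover commitments so far").digest()
nonce = grind(state, work_bits=16)                    # small work factor for the demo
assert verify_grinding(state, nonce, work_bits=16)    # verifier side: one hash
print("nonce:", nonce)
```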

In our Alpha version, the prover is required to perform 32 bits of grinding. As a result, the number of queries required to achieve a certain level of security can be reduced. With a blowup factor of 2⁴ this allows reducing the number of queries by 8.

The number of bits used for grinding is communicated to the verifier as a parameter. In case of inconsistency between the proof and this parameter, the verifier rejects the proof.

Periodic Columns

Many cryptographic primitives involve using some list of constants. When we apply the same cryptographic primitive many times, we get a periodic list of constants. One option for using this list in an AIR is to add its values to the trace and provide constraints that guarantee the values are correct. However, this is wasteful, as the verifier knows the list of values, so computing an LDE of them and committing to them does not make sense. Instead, we use a technique we refer to as Periodic Columns. The periodic structure of each such column leads to a column polynomial which can be represented succinctly: in the classic representation of a polynomial a₀+a₁·x+a₂·x²+…+aₙ·xⁿ as the vector of its coefficients (a₀,…,aₙ), a succinct representation means that most of the aⱼ's are zero. This enables the verifier to efficiently compute the pointwise evaluations of these polynomials.
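The following toy sketch illustrates why a periodic column is succinct: a column that repeats p constants N/p times over ⟨g⟩ coincides with q(x^(N/p)) for a polynomial q of degree less than p, so its coefficient vector has only p nonzero entries and the verifier can evaluate it anywhere from those p values. Parameters are illustrative.

```python
# Periodic column sketch: a column that repeats p constants N/p times over <g>
# coincides with q(x**(N/p)) for a polynomial q of degree < p, i.e. its coefficient
# vector has only p nonzero entries.  Toy field and parameters.
P = 257
N, p = 16, 4
g = pow(3, (P - 1) // N, P)             # generator of the order-N trace domain
gp = pow(g, N // p, P)                  # generator of the order-p subgroup

def inv(a):
    return pow(a, P - 2, P)

def interp_eval(xs, ys, x):
    """Evaluate the degree < len(xs) interpolant of (xs, ys) at x (Lagrange)."""
    acc = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num = den = 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        acc = (acc + yi * num * inv(den)) % P
    return acc

constants = [7, 11, 13, 17]                           # the repeating list of constants
column = constants * (N // p)                         # the full periodic column over <g>

# q interpolates the p constants over the small subgroup <g**(N/p)>; the verifier can
# evaluate the whole column anywhere as q(x**(N/p)), using only p coefficients.
small_domain = [pow(gp, i, P) for i in range(p)]
for i in range(N):
    x = pow(g, i, P)
    assert interp_eval(small_domain, constants, pow(x, N // p, P)) == column[i]
print("periodic column of length", N, "described by only", p, "values")
```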

Summary

This concludes the description of the generic STARK proof system as used by the StarkDEX system. The STARK proof system as described preserves the known benefits such as transparency (the elimination of a trusted setup), highly scalable verification and post-quantum security. On top of that, it includes several novel contributions and optimizations that reduce the proof size significantly and enhance the expressive power of efficiently provable statements in ways not previously described in the literature.

In the next, and last, post of this series we will describe how we built the AIR (Algebraic Intermediate Representation) for StarkDEX. As always, if you have any questions or comments, talk to us here or on twitter @StarkWareLTD.

Kineret Segal & Gideon Kaempfer

StarkWare
