Cheetah: A STARK-friendly elliptic curve
Originally published at https://toposware.com/blog/cheetah-curve.
For Topos’s trustless and privacy-preserving cross-subnet communication, Toposware requires subnets to prove the validity of their internal state transitions to the rest of the ecosystem. For this purpose, and to maintain privacy of the subnets, we rely on a FRI-based zk-STARK proving system to enforce correct state transitions.
General-purpose state transition systems (the so-called zk-VMs / zk-EVMs projects emerging in the blockchain ecosystem) usually require, among other checks, to verify a digital signature (Schnorr, ECDSA, …) ensuring the valid origin of funds. This requirement originally oriented us to base our construction on the curve over the prime field Fp with p=2²⁵¹+17×2¹⁹²+1 designed by STARKWARE¹. However, the Topos state-transition STARK statement design was missing a crucial, fundamental difference between regular zk-SNARKs and the FRI-based zk-STARK we were built on. The latter construction does not rely on any algebraic group consideration. This property specifically allows one to build a proving system over a particularly small² and performant field, unlike existing SNARKs. Those usually rely on the hardness of the Elliptic Curve Discrete Logarithm Problem (ECDLP) on transparent or pairing-friendly curves, and hence inherently work over much larger fields.
Only the Schnorr signature portion of our state-transition program was relying on this notion of an algebraic group, which led to using this large prime field Fp. However, it had the heavy cost of imposing every single value handled (even bits!) to be internally represented as a large 252-bit integer, and to rely on the associated complex modular arithmetic, which was an unacceptable burden for future development. To put this into perspective, internal representation of Fp elements was consisting of four
u64limbs in little-endian order, representing a given element in Montgomery form. A single Schoolbook multiplication between two elements was then requiring 16
u128multiplications and 32
u128additions before Montgomery reduction, which was also requiring 16
u128additions³ and 4
u64multiplications. Multiplying two arbitrary field elements inside our AIR program (even binary values!) was involving that many arithmetic operations, a prohibitive overhead!
We then came up with what was going to bring us nice speed-ups both natively, outside our STARK AIR program, as well as inside our prover computations, by changing the algebraic group used in Schnorr to work over an extension of a small prime field.
The result of this work is a new curve, named Cheetah, whose construction approach and design are discussed below. A more thorough analysis of the construction is publicly available at https://eprint.iacr.org/2022/277.
There were two major considerations to evaluate before moving on to the next steps:
- a prime field size, which had to be small enough for efficient field arithmetic and STARK computations;
- an extension degree, large enough to prevent attacks targeting field-extension based curves, and small enough to have efficient group operations.
For the first point, a natural size choice was around 64 bits as we can easily find primes pp of this size with large 2-adicity for efficient FFTs⁴, and still inherit fast native arithmetic, unlike operations over much larger prime fields. While we were originally going for a 63-bit prime with extremely large 2-adicity (55 to be precise!), we decided in the end to go for the so-called Goldilocks prime p=2⁶⁴−2³²+1 used in various projects, for its size and efficient modular reduction. In addition, the prime being greater than 2⁶³, this allows compatibility with the Cairo framework. The reduction of 2-adicity on the other hand was not considered critical, as execution traces in practice would have less than 2³² steps for most applications.
As for determining the optimal extension degree k over which we would build our new elliptic curve, the decision was not so straightforward. We already knew that 4 would be a lower bound on our extension degree, as that meant having a group size of ≈256 bits, similar to regular curve constructions, yielding a Pollard-Rho security level of around 128 bits (give or take some bits). However, in the context of field-extension based curves, other attacks leveraging the specific structure of the field construction to require fewer operations than Pollard-Rho have to be considered and analyzed. Quintic extensions (Fp⁵) had received limited attention, which was one of our concerns for security confidence and adoption. We then decided to go for a degree 6 extension, yielding a base field Fp⁶ of a size equivalent to widely-used pairing-friendly constructions (like BLS12–381), and having received more consequent scrutiny from the academic world.
Having these two criteria in mind, we could start to extensively look for curves resistant to sextic-extension based attacks, in addition to the regular elliptic curve attacks on ECDLP (Pollard-Rho, MOV attack, …) until we found a curve over Fp⁶ with p=2⁶⁴−2³²+1, Fp⁶ being defined as
Fp[X]/(X⁶−7). The curve equation is E: y² = x³+x+(u+395), with u⁶ = 7. It contains a 255-bit subgroup of prime order q = 55610362957290864006699123731285679659474893560816383126640993521607086746831. We named this curve Cheetah 😼 and implemented it here. The search algorithm, to reproduce the search or generate new curves on sextic extensions is available here.
The transition to the new Cheetah curve yielded an appreciable improvement on the Schnorr signing and verification algorithms, mostly due to the performance difference⁵ of the new instantiation of the Rescue-Prime hash over the small new field Fp.
In addition, AIR programs are now running much faster, thanks to the fast new STARK field arithmetic! In particular, the verifier is now running under 10ms even for complex programs, a crucial milestone for the scalability of the Topos protocol.
The Cheetah upgrade granted us a very welcome speed-up both for native and in-circuit computations. But more importantly, removing the performance bottleneck of the underlying STARK field allows us to leverage the full power of our FRI-based zk-STARK, for current and future applications on the Topos ecosystem. And even more optimizations are going to be integrated into the Cheetah implementation pretty soon, stay tuned!
¹ For STARKs, we require the underlying field to have a large 2-adicity (i.e. p=(2^n)t) such that we have a primitive n-th root of unity for efficient multi-point evaluation/interpolation over the domain generated by this root of unity through FFT algorithms. We usually aim to have n≥32.
² With a cryptographer’s eye, a small prime here means p≪2²⁵⁶.
³ We make no distinction between addition and subtraction. A more accurate description would be 52
u128additions and 4
⁴ Fast-Fourier Transforms can actually be evaluated over any field, even without a root of unity of order 2^n for some n, following the approach detailed in this paper, at the cost of an extra logarithmic overhead, which has been judged impractical in our use-case, and hence is not approached here.
⁵ Recall that, while being extremely efficient inside a SNARK circuit / STARK AIR program, algebraic hash functions are less efficient natively than their usual binary counterparts like SHA3 or Blake3, hence Rescue-Prime is taking a non-negligible portion of the signature algorithms running times.