Plonk with GPU acceleration

0xHusen
2 min read · Sep 2, 2021

TL;DR: a GPU version of Plonk that achieves at least 6x the performance of the CPU version.

The zero-knowledge proof algorithm Plonk is a strong candidate for Layer 2 rollups on Ethereum and for privacy protection in blockchains. It has a unique combination of features: a single trusted setup that serves all later applications; a good trade-off between decentralization/transparency and efficiency compared with Groth16 and zkSTARK; and batch verification, which is especially useful where Layer 1 verification is expensive, as on Ethereum.

Therefore, Plonk has been adopted by one of the two largest Layer 2 infrastructure projects, Matter Labs' zkSync, and by smaller projects such as iden3's Circom. With the release of more user-friendly Layer 2 languages and a robust zkEVM from zkSync and other teams such as Hermez, we see an explosion of dapps. Compared with optimistic rollups such as Arbitrum, which requires modifying the EVM into the AVM to support both efficient execution and proving, a ZKP-based Layer 2 solution is much safer (no need for real-time monitoring of execution) and offers faster withdrawals.

However, one problem remains for Plonk adoption: the complicated and costly proving process.

Plonk's polynomial operations are much more complicated than Groth16's, while the commitments are basically the same. The algorithm is divided into 5 rounds. Take the most expensive one, round 3, as an example:

Plonk round3 t(X) computation
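
To give a feel for what a quotient computation of this kind involves, here is a minimal Python sketch. It is not the real round 3: it assumes a toy field (p = 257), a toy circuit of 4 gates, and a single simplified constraint a(X)·b(X) − c(X) in place of the full Plonk gate and permutation terms. It also uses a naive O(n²) transform where a real prover would use an FFT. All names (`quotient_on_coset`, `inverse_dft`, and so on) are hypothetical. What it does show is the pattern: evaluate on a shifted domain 4x the circuit size, divide pointwise by the vanishing polynomial Z_H(X) = Xⁿ − 1, then transform back to coefficients.

```python
# Toy sketch only: one a*b - c constraint over a tiny field, not real Plonk.
P = 257          # toy prime; P - 1 = 2^8, so power-of-two roots of unity exist
G = 3            # generator of the multiplicative group mod 257, reused as the coset shift
N = 4            # toy circuit size (number of gates)
EXT = 4 * N      # extended domain, 4x the circuit size


def inv(x):
    """Modular inverse via Fermat's little theorem."""
    return pow(x, P - 2, P)


def root_of_unity(order):
    """Primitive root of unity of the given power-of-two order."""
    return pow(G, (P - 1) // order, P)


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given by its coefficients (low degree first)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def inverse_dft(values, omega):
    """Naive O(m^2) inverse transform: evaluations at omega^j -> coefficients."""
    m = len(values)
    w_inv = inv(omega)
    return [inv(m) * sum(v * pow(w_inv, i * j, P) for j, v in enumerate(values)) % P
            for i in range(m)]


def quotient_on_coset(a, b, c):
    """Compute t(X) = (a(X) * b(X) - c(X)) / Z_H(X) via evaluations on the coset G * H'."""
    w = root_of_unity(EXT)
    t_evals = []
    for i in range(EXT):
        x = G * pow(w, i, P) % P
        numerator = (evaluate(a, x) * evaluate(b, x) - evaluate(c, x)) % P
        z_h = (pow(x, N, P) - 1) % P          # never zero on the shifted domain
        t_evals.append(numerator * inv(z_h) % P)
    shifted = inverse_dft(t_evals, w)         # coefficients of t(G * X)
    return [coef * inv(pow(G, k, P)) % P for k, coef in enumerate(shifted)]


def poly_mul(f, g):
    """Schoolbook polynomial multiplication mod P."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] = (out[i + j] + fi * gj) % P
    return out


# Witness values per gate, chosen so that a_i * b_i = c_i on the size-N domain H.
a_vals, b_vals = [1, 2, 3, 4], [5, 6, 7, 8]
c_vals = [x * y % P for x, y in zip(a_vals, b_vals)]
a, b, c = (inverse_dft(v, root_of_unity(N)) for v in (a_vals, b_vals, c_vals))

t = quotient_on_coset(a, b, c)

# Sanity check: t(X) * Z_H(X) must equal a(X) * b(X) - c(X) as polynomials.
ab = poly_mul(a, b)
target = [(ab[i] - (c[i] if i < len(c) else 0)) % P for i in range(len(ab))]
assert poly_mul(t[:N - 1], [P - 1, 0, 0, 0, 1]) == target
```

The division is only possible because the evaluations are taken on a shifted domain: on the original domain H the vanishing polynomial is zero everywhere, so there would be nothing to divide by.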

It is complex for several reasons:

  • operations within the shifted set (coset), which cost 4x the memory and computation; for details, see the simpler explanation in the CoinList Groth16 competition.
  • many precomputed polynomials, such as q_M, q_R, q_O, S_sigma1, S_sigma2, … For a large circuit of size 2²⁶, one polynomial takes 2GB, and in the shifted set it takes 4x2GB=8GB.
  • a large FFT for the result, which is 4x the size of the other polynomial operations.
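
The memory figures above can be checked with a few lines of arithmetic. This is a rough sketch that assumes each field element is stored in 32 bytes (a 256-bit representation); the helper name `poly_gib` is hypothetical.

```python
FIELD_ELEMENT_BYTES = 32   # assumption: one field element stored as 256 bits
GIB = 1 << 30


def poly_gib(log2_size, blowup=1):
    """Memory in GiB for one polynomial of 2^log2_size elements, times a domain blowup."""
    return blowup * (1 << log2_size) * FIELD_ELEMENT_BYTES / GIB


base = poly_gib(26)             # one polynomial on the original domain
extended = poly_gib(26, 4)      # the same polynomial on the 4x shifted domain
print(base, extended)           # 2.0 8.0
```

With several such polynomials alive at once, the working set quickly exceeds the memory of a single GPU, which is why dropping and reusing large values matters so much.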

zkSync's bellman does excellent work by parallelizing almost all possible operations on CPUs and by dropping or reusing large values. However, we found that it requires too many CPUs, which makes proving too expensive for individuals and not decentralized enough, because it can only be deployed on Amazon AWS.

So we developed a GPU version of Plonk, which achieves at least 6x the performance of the CPU version.

The other reasons for developing it are:

  • GPUs have a very different architecture from CPUs, with many more cores running at relatively lower clock speeds.
  • Nvidia will keep delivering surprising products for deep learning, gaming, etc. that follow Moore's law better than Intel's, and blockchain can benefit from that.
  • There is one remaining technology, or secret weapon, that we value. We will wait until it becomes available.

If you are interested, please feel free to contact me by email: wanghs.thu@gmail.com.
