Proof System for VideoCoin Storage Mining

Published in

VideoCoin

8 min readMar 20, 2020

Introduction

Proof of Storage for VideoCoin is defined in “Section 7.1.1 Proof of Retrievability” of VideoCoin whitepaper[1]. We use merkle tree-based proof as described in the VideoCoin white paper, but also contains an enhancement to use zero-knowledge proofs based on zkSnarks. The prover (Storage/Transcode worker) generates and commits a proof based on the zkSnarks Merkel construction from pHashes of frames from the stored video segment. The validator verifies submitted proof, using it as input to the zkSnarks verification process along with public witness(challenge previously submitted by the client/publisher). Usage of zkSnarks helps in fast non-interactive and on-chain verification for the proof.

A typical HLS/DASH video segment of 10-second duration which contains 300 frames at 30fps can be represented using a merkle tree of depth 9. Each leaf of the merkle-tree contains the Pedersen hash of the pHash of a frame from the video segment. The method is illustrated using a simplified merkle tree with 4 leaf nodes in this blog. The actual implementation and performance evaluation is carried out using 300 leaf nodes that correspond to 10-second video segments.

We will present an in-depth overview of the proof-system along with a gentle introduction to zkSanrks concepts as and when required.

The above diagram shows a merkle tree with four-leaf nodes. Each leaf node is constructed using the pedersen hash of the pHash corresponding to the frame. A storage worker holding the video segment will be able to generate the full merkle tree. However, we will be using a zkSnarks proof system where the prover needs to supply the nodes shown with green color in the diagram along with the nodes shown in red color that constitutes the challenge from the verifier. The blue nodes shown in the diagram are constructed by the proof system and opaque to the prover.

The proof system, after a one-time initial setup, gives a CRS (Common reference string). The prover generates three elliptic curve points on BLS12–381 using the public and private inputs (shown in green and red colors in the above diagram). The verifier will update one of the curve points with public inputs (shown in red) and check for the bilinear relation between the points to accept the proof. The verification process is deterministic and takes 5 milliseconds on an i7–4core system and suitable for on-chain verification on a blockchain.

zkSnarks Circuit Construction

zKSnarks is a heavily researched topic for blockchain and a good amount of literature is available on the Web. A recent innovation by Zcash[2] enables the usage of an efficient hashing algorithm called Pedersen hashing in place of Sha256 for zkSnarks circuit construction.

Pedersen Hash Circuit

The above diagram illustrates how pHash of a frame is converted to a pedersen hash using a circuit. Each computational element forms a gate in the circuit. The above circuit is used for converting pHashes to pedersen hash on leaf nodes as well as generating pedersen hashes non-leaf nodes. The input bitstream is segmented into 3-bit windows and mapped to an elliptical curve point. The circuit also contains primitives for adding these points and finally maps a single point to the input bitstream which gives the hash. The elliptic curve being used is called Jubjub (twisted Edwards curve built over the BLS12–381 scalar field) and Zcash libraries implemented several primitives for this curve.

The zkSnarks circuit that we construct for pHash based storage proofs contains several thousand instances of the primitive circuit shown in the above diagram.

R1CS (Rank-1 Constraint System) and Solution Vector

The pedersen hashing circuits along with other primitives uses several variables that will be operated on by several computational elements. The circuit being formed can be represented as a set of multiplication gates and set of variables that are entering and leaving the computational gates. variables along with the inputs are treated as forming a solution vector. The solution vector comprises the complete state of the circuit. A gate performing a multiplication operation will have left and right operands and an output. The left and right can be thought of as formed by multiplying the solution vector with a vector of coefficients.

The following diagram shows the solution vector for the sample merkle tree that we are using

The vector consists of a set of public inputs (or challenge). The verifier uses these inputs while checking the correctness of the proof. The proof generated by the circuit which consists of three EC curve points out of which one EC point needs to be updated using the public inputs.

The inputs on the solution vector are the result of a computation performed by the prover. In the case of our merkle-tree example, these inputs consist of pedersen hash at leaf node(Ha) and an intermediate node(Hcd) along with public inputs Hb, Habcd and Path1, Path2 to specify the path from Hb to Habcd.

The zkSnarks circuit will generate other variables and also gates to compare if any of the generated variables should be equal to the inputs (which is the case of Habcd and Habcd’ that represent the root node of the merkle-tree).

Constraint System

The circuit contains several thousand multiplication gates and each gate is associated with a left and right coefficient vectors and output coefficient vector each with the size of the solution vector.

The following diagram illustrates these vectors at a gate, for example, gate ‘i’. The solution vector has size m and the coefficients of the left vector are designated as a1i, a2i … ami. Similarly, the coefficient of right input and output is shown as b1i, b2i,…bmi and c1i, c2i..cmi.

The operation at the gate “i” is represented as Si.Aj(i)* Si.Bj(i)= Si.Cj(i)

Where Si Aj(i) is aji, Bj(i) is bji and Cj(i) is cji in the above diagram.

Please note Aj, Bj, and Cj form vectors consisting of coefficients of variable j at all the gates.

QAP Representation

A constraint system described in the previous section forms the basis for QAP representation which is mapping boolean circuit to a set of polynomial equations over a field and pave way for PCPs(probabilistically checkable proofs).

The coefficient vectors formulated in the previous section can be organized in such a way that the coefficients of each variable of a solution vector form a polynomial of n-1th degree where n is the number of gates.

Similar polynomials are formulated for B and C coefficient vectors. Now the system of polynomials represents the circuit.

Setup and Key Generation

The set of polynomials formulated in the previous section can be used to obtain a coefficient vector at a random offset. In addition to this random offset(Tau) key generation also includes other randomness parameters (alpha, beta, gamma, and delta). This random offset forms part of toxic waste that needs to be destroyed.

A solution vector that satisfies a gate at the random offset is the basis for prover key and verification keys. However, this satisfiability is proved indirectly using polynomial algebra such that A * B — C is divisible by minimal (target) polynomial Z.

Further bilinear cryptography is used to map these coefficients to an EC curve point and proving key is a set of 3 EC curve points(G_a, G_b abd G_c) corresponding to coefficient vectors A, B, and C.

The verification key includes coefficients for multiplying public inputs and updating the EC curve point corresponding to A. It also includes coefficients used in the verification of bilinear mapping of EC curve points A, B, C.

Integration of Proof System with VideoCoin Network

The Storage proof system uses Elliptic curves that are not supported on current VideoCoin blockchain(Ethereum). Either we can customize current implementation or enable it in future versions (based on Cosmos)

The validator node supports on-chain verification of the zkSnarks storage proofs. The client that wants to out-source the storage, registers a request with the VideoCoin blockchain along with the reward in an escrow. A storage worker bids for the request and holds the file. The previous step happens outside of the proof-system. At the time of registration, the client also supplies a set of public input which consists of (a) a leaf node, (b) root node. At the end of the agreed period of storage renting, the storage worker submits proof of holding the file and submits to the validator node. The validator uses zkSnark verification and transfers escrow funds on successful verification of the proof.

Storage renting can be divided into multiple epochs. Each epch may need a new set of public-witness(challenge) to prevent a replay of previous proofs and any cached pHashes. The pHash caching can be prevented by specifying pHash generation from a region of the frame and signaling region as part of public-witness.

VideoCoin Validator

Maintains a database of storage requests from the VideoCoin publishers along with public-witness(challenge) to verify the proof.

Provides on-chain verification of storage proofs and transfers associated rewards from escrow.

VideoCoin Storage Worker

It provides storage for video segments. Generates proofs using the CRS and proving keys provided by the network.

VideoCoin Client (Publisher)

Outsource storage of a file and deposit funds into escrow. It also submits public-witness(challenge) for verification of the proof submitted by the storage worker.

Performance

tinyRAM vs Bellman implementation

Our previous implementation of proof-of-transcode using zkSnarks/tinyRAM approach has huge performance bottlenecks. The following table shows the comparison of implementations of SSIM circuit with tinyRAM and Zcash Bellman. The number of constraints and variables of the circuit influences the proof-generation time. We see a 90 times reduction in the constraints and 70 times reduction in the variables.

Performance of Storage Proofs

The storage proof system leverages the recent optimizations in the Zcash/Bellman library implementations. In addition, we tightly integrate pHash generation with ffmpeg scaling algorithms to avoid data transfer overheads and create a high-performance pipeline between different modules of the proof system.

References:

1. VideoCoin — A Decentralized Video Encoding, Storage, and Content Distribution Network

https://storage.googleapis.com/videocoin-preico/VideoCoin-Whitepaper.pdf

2. Bellman: zk-SNARKs in Rust

https://electriccoin.co/blog/bellman-zksnarks-in-rust/