The goal of the Livepeer network is to drastically drive down video transcoding costs by allowing users to outsource video transcoding to service providers in a competitive and open marketplace.
A challenge with outsourcing video transcoding to an untrusted service provider is verifying that the video transcoding was done correctly. The latest research on tackling this challenge in the Livepeer network uses a machine learning classifier to distinguish between correctly and incorrectly transcoded video. However, there are other approaches that are worthwhile investigating as well.
Before analyzing video transcoding verification schemes, we can consider a simpler, related domain: image authentication (IA), the process of determining whether an image faithfully represents an original photo captured by a physical device . A part of video transcoding verification is video authentication (VA) — a transcoded video must faithfully represent the original video. Since a video can be modeled as a successive set of images, image authentication schemes may be useful primitives in a video authentication scheme.
In , Naveh and Tomer provide an overview of existing IA schemes and present PhotoProof, an IA scheme based on succinct cryptographic proofs that guarantee the image authenticity. In the rest of this post I will provide a review of the PhotoProof paper that will cover:
- The different types of IA schemes.
- How the PhotoProof IA scheme works.
This post only provides a high level overview of the PhotoProof IA paper. For more technical details, I encourage readers to check out the actual paper itself which includes security proofs as well as notes on implementation choices made for the prototype.
Types of IA Schemes
An IA scheme allows a prover to convince a verifier that an image is the result of successively applying a set of permissible transformations on an input image captured by a physical device. An application using an IA scheme may allow an image to be rotated, resized or cropped, but disallow other modifications to the image’s content.
The above rotation transformation is permissible so the output image is authentic.
The above cut & paste transformation is not permissible so the output image is not authentic.
The above rotation transformation is permissible, but the cut & paste transformation is not permissible so the output image is not authentic.
In their paper, Naveh and Tomer describe the following types of IA schemes:
These schemes involve embedding an invisible watermark (i.e. a pattern or signal) in an image that will be preserved if a permissible transformation is applied to the image and destroyed if a non-permissible transformation is applied to the image. While the watermark would not be observable by the human eye, the verifier would be able to check an image for the watermark to determine if any non-permissible transformations were applied.
These schemes involve special image hash functions that produce collisions for different images that have very similar content. These hash functions are based on features extracted from images. Suppose a verifier wants to determine if image B is the result of applying a permissible transformation on image A. The verifier would compute the hash digest of both images and if the digests are equal or close enough (i.e. the difference between the two digests is less than some threshold) the verifier would accept image B as authentic.
Naveh and Tomer highlight the following downsides of semi-fragile watermarking and robust hashing IA schemes:
- Fixed set of permissible transformations. These schemes typically only support a fixed set of permissible transformations. If a scheme does not support a particular transformation then a completely different scheme needs to be considered since these schemes are not extensible.
- Non-negligible error probabilities. These schemes rely on statistical verification involving the comparison of a measured value against a pre-defined threshold. As a result, there is always non-negligible error probability which can lead to false positives (classifying a tampered image as authentic) and false negatives (classifying an authentic image as tampered).
- Vulnerable to adaptive attackers. These schemes rely on a verification method that is known to an attacker (i.e. the attacker knows how a watermark is constructed or how a robust image hash is computed). Since verification is statistical, it is possible for an attacker to construct a tampered image that is accepted by the verifier as authentic by constructing a very large set of tampered images and testing them using the known verification method.
- Lack of succinct verification. The verification time/cost in these schemes increases with the size of the image and the number of transformations applied to the image. In many applications, the goal is for verification to be less computationally costly than applying the transformations (if this is not the case than the verifier might as well apply the transformations by itself).
Naveh and Tomer introduce a new type of IA scheme based on cryptographic proofs and present PhotoProof as an instantiation of this type of scheme. PhotoProof has the following properties that address the downsides of previously defined IA schemes:
- Supports new transformations as long as the transformation can be implemented in an arithmetic circuit (see the next section for more details on arithmetic circuits).
- Has negligible error probability because proofs are unforgeable such that no adversary can convince a verifier that a tampered image is authentic. As a result, false positives and false negatives are impossible.
- Not vulnerable to adaptive attackers because while an attacker knows the verification method, it will never be able to produce a proof for a tampered image since proofs are unforgeable.
- Verification is succinct because proofs are constant size meaning that verification can be substantially faster and less costly than applying the transformations.
The PhotoProof IA Scheme
PhotoProof is based on the proof-carrying data (PCD) paradigm introduced by Chiesa and Tomer in . A PCD system models distributed computation run by a set of untrusted parties as a directed acyclic graph (DAG) which represents the transcript of the computation. Each node in the transcript represents an untrusted party, incoming edges represent input messages and input proofs and outgoing edges represent output messages and output proofs.
A PCD system uses a compliance predicate to enforce specific properties of the distributed computation. The entire PCD computation is compliant (i.e. it exhibits a specific property) if all output messages of nodes in the transcript that perform a step of the computation are compliant. The output proof a node attaches to its output message not only certifies that the output message is compliant, but also certifies all previous messages in the transcript are compliant. As a result, a verifier only needs to verify the output proof of the final message in a transcript in order to determine if the entire computation is compliant.
In the PhotoProof PCD construction, a message contains an image and a public key. The proof attached to a message is either a digital signature or a PCD proof that certifies that the image is authentic. The compliance predicate handles the following two cases:
- The case where there is no input message and the output message is sent with a digital signature.
- The case where there is an input message and the output message is sent with a PCD proof.
In the first case, the compliance predicate enforces the following property:
- The digital signature must be valid for the output image and public key.
The first step of the PCD computation is for a camera to capture an image and sign its hash with its secret signing key. The enforced property in the first case ensures that only images with valid signatures will be accepted by the compliance predicate.
In the second case, the compliance predicate enforces the following properties:
- The output public key is the same as the input public key.
- The input transformation T is in the set of permissible transformations for the system (this set is defined when the system is initialized).
- The output image is the result of applying transformation T on the input image.
By enforcing that the output public key is the same as the input public key, the original public key used for digital signature verification in the first step of computation can be passed to the verifier. Then, the verifier can confirm that the public key is valid (i.e. by checking that it is an authorized public key).
The PhotoProof PCD construction is implemented using zkSNARKs and compliance predicates are represented using arithmetic circuits, the mathematical representation used in a zkSNARK scheme. Given a compliance predicate represented as an arithmetic circuit, a zkSNARK proof can be generated that certifies that the compliance predicate is satisfied by the input message, input proof and output message of a node in a PCD transcript. Additionally, recursive composition of zkSNARKs allows a verifier to verify that an entire computation is compliant with only the output proof of the final message in the transcript. The utility of recursive proofs for IA can be illustrated with the following example:
- Suppose that applying transformation T with image A as input yields image B and proof P.
- Suppose that applying transformation T’ with image B and proof P as input yields image C and proof P’.
- If the proof P’ is recursive, then a verifier can verify that image C is authentic using image B and proof P’ without access to image A or proof P.
Consider the following scenario:
- Alice captures a photo with her camera and wants to outsource image transformation to Bob and Eve who will then send the edited image to Charlie for viewing.
- The IA scheme only permits rotation and resizing transformations.
- Alice’s camera captures an image
Aand the camera's secret signing key produces a signature
A_sigover the image
- Alice sends
- Bob executes the proving algorithm with the inputs image
A, the signature
A_sig, the rotation transformation
T_rotateand the public proving key.
- Bob sends the output of the proving algorithm, the rotated image
Band the proof
B_proof, to Eve.
- Eve executes the proving algorithm with the inputs image
B, the proof
B_proof, the resizing transformation
T_resizeand the public proving key.
- Eve sends the output of the proving algorithm, the resized image
Cand the proof
C_proof, to Charlie.
- Charlie executes the verification algorithm with the inputs image
C, the proof
C_proofand the public verification key.
- Charlie accepts image
Cas authentic if the verification algorithm returns true.
A few of the challenges that make PhotoProof difficult to use in practice are:
- Performance. For transformations using a 128x128 image, the average verification time for the PhotoProof prototype is .5 seconds, but the average proving time is 5.1 minutes.
- Circuit design. The size of real world images and the nature of operations required for image computations (see below for a description of one such type of operation) make the implementation of image computations in an arithmetic circuit challenging.
- Floating point math. Many image computations (such as rotation) involve floating point math which is not possible in an arithmetic circuit where all operations are over a finite field. The workaround used in PhotoProof is to use fixed point math both inside the circuit and outside the circuit and allow for small rounding errors in the checks performed inside the circuit. The need to use fixed point math outside the circuit means that the transformation implementation needs to be modified and proofs cannot be generated for images produced from the original transformation implementation.
Interesting Areas of Future Research
Some interesting areas of future research include:
- More efficient zkSNARK schemes. The PhotoProof PCD construction uses the GGPR13  zkSNARK scheme. Newer zkSNARK schemes such as Groth16 may improve PhotoProof performance .
- Hardware accelerated zkSNARK proof generation. As noted in the paper, running the prover algorithm on a GPU, FPGA or ASIC may result in more reasonable proving times for real world applications.
- Multiple compliance predicates. The PhotoProof compliance predicate considers the entire set of permissible transformations. As a result, the size of the circuit and the proving time is dependent on the size of the set of permissible transformations. An alternative is to create multiple compliance predicates, one for each of permissible transformation. The benefit of this approach is that the size of the circuit and the proving time would be dependent on the number of permissible transformations applied instead of the number of possible permissible transformations.
- Video transformation circuits. As mentioned at the beginning of this post, IA schemes may be useful primitives for VA schemes.  describes some initial work in considering how to adapt PhotoProof to be used for video authentication. Video transformations that may be good candidates for initial investigation include resizing (an IA scheme that supports image resize transformations may be able to serve as a basis for a VA scheme that supports video resize transformations) and framerate adjustment. More complex transformations to investigate include decoding and encoding with different codec implementations.
Naveh and Tomer’s PhotoProof paper introduced a new cryptographic proof based IA scheme that offers many advantages over previous IA schemes based on semi-fragile watermarking and robust hashing. While more work is required in order to make the construction described in the paper practical for real world use cases, the foundation established by the construction opens up a variety of interesting research paths for IA and VA schemes.
If image and/or video authentication schemes interest you:
- We are looking to hire a video security research engineer to research and implement these types of schemes for the Livepeer network.
- Join the Livepeer Discord and forum to chat about these topics.