Demystifying ZKML

Vid Kersic
16 min read · Nov 27, 2023


Technical introduction to ZKML — all you need to know to get started

Verifiable robot using ZKML

Introduction

As you may already be aware, zero-knowledge proofs (ZKPs) stand out as one of the most captivating and widely discussed technologies within the Web3 space. This emphasis is well-founded, given that one of Ethereum's foremost challenges is scalability: the capacity to handle numerous transactions without a steep rise in fees. At present, the most effective strategy involves leveraging Layer 2 (L2) ZK rollups. These rollups, powered by the remarkable capabilities of ZKPs, execute transactions off-chain and post only the validity proofs on Layer 1 (L1). A key attribute enabling this process is that a ZK proof can be verified significantly faster than re-executing the computation it attests to.

Additionally, another important feature of ZKPs is “privacy” — the ability to prove something, such as knowledge of some data, without revealing the actual underlying data. This privacy feature is successfully utilized in several decentralized identity protocols, such as Polygon ID, and solutions for private transactions, like the Aztec Network.

It appears that both features are not only important in the realm of blockchain and Web3 but can also offer solutions to various challenges in another field currently undergoing an even more intense hype cycle than crypto ever did — artificial intelligence (AI) and its subfield, machine learning (ML). This blog post will delve into the intersection of these technologies, exploring the concept of zero-knowledge machine learning (ZKML).

While there is a considerable amount of material available on the combination of zero-knowledge proofs and machine learning (ZKP + ML = ZKML), such as the intro to ZKML, the zk-MNIST demo project, the awesome-zkml repo, and the ZKML presentation from Devcon 2022, the topic can still be somewhat confusing for beginners. The purpose of this blog post is to provide a step-by-step introduction to ZKML, using perhaps the simplest possible example: a neural network with just one neuron.

Current state, ZKML, EZKL …

ZKP not only bolsters machine learning (ML) with verifiability but also introduces a layer of privacy. As previously mentioned, this combination falls under the umbrella of zero-knowledge machine learning (ZKML), sometimes referred to as verifiable ML or validity ML. One common claim ZKML can prove is that a particular output was produced by running a public neural network (NN) on some private input data.

Even if the input were shared or public, ZKML would still add value, since verifying a ZK proof is faster than re-running the entire inference process. It's worth noting that, currently, ZKML focuses predominantly on inference, as proving the training phase remains prohibitively computationally expensive.

This blog post won’t delve extensively into a general introduction to ZKML, including the mathematical aspects (for those interested, you can find ample material here), since there are numerous excellent resources available on this topic. The links provided above and throughout the article offer a wealth of information for those looking to explore further.

Zero-knowledge proofs (ZKPs) and the tools/frameworks employed to construct ZK circuits play a pivotal role in the broader evolution of cryptography, ushering in the era of programmable cryptography. For those keen on exploring recent advancements in the entire cryptography field, there’s a wealth of video material available from the recent PROGCRYPTO conference (Devconnect 23). It provides valuable insights into the latest developments shaping the landscape of cryptography.

In the realm of ZKML, one of the most active projects is EZKL (pronounced “Ezekiel,” a tidbit I discovered while crafting this blog post). In development for over a year, EZKL aims to extend the capabilities of the Ethereum Virtual Machine (EVM) and enable the execution of more resource-intensive computations on-chain at an economical cost. Currently, its primary focus lies in the domain of machine learning (ML) and deep learning (DL) models.

EZKL serves as a powerful tool, allowing users to articulate their computation programs, such as ML models, using popular libraries like PyTorch and TensorFlow. The tool then generates ZK circuits for these programs. ZK circuits essentially represent the computational flow that one aims to prove. Notably, EZKL is implemented in the Rust programming language. However, the tool offers bindings for various programming languages, with robust support for Python and JavaScript. This Python support is particularly meaningful, given that the majority of ML/DL code, especially model definitions, is commonly written in Python.
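
As a rough illustration of the Python flow, here is a minimal sketch. The function names mirror the CLI subcommands covered later in this post, but the exact signatures are an assumption on my part and have changed across EZKL versions, so consult the current docs before copying:

import ezkl

# names mirror the CLI subcommands; argument order is illustrative only
ezkl.gen_settings("network.onnx", "settings.json")
ezkl.compile_circuit("network.onnx", "network.ezkl", "settings.json")
ezkl.get_srs("settings.json")
ezkl.setup("network.ezkl", "vk.key", "pk.key")
ezkl.gen_witness("input.json", "network.ezkl", "witness.json")
ezkl.prove("witness.json", "network.ezkl", "pk.key", "model.proof")
assert ezkl.verify("model.proof", "settings.json", "vk.key")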

EZKL leverages Halo2 as its underlying proof system, a formidable member of the zk-SNARK (zero-knowledge succinct non-interactive argument of knowledge) family of proof systems. Within its codebase, EZKL seamlessly interfaces with the Halo2 library. While originally developed by the Zcash team, Halo2 is now maintained by various entities through numerous forks. Notably, EZKL is predominantly maintained by two dedicated authors, namely dante and Jason Morton, both of whom exhibit remarkable commitment and activity in advancing the project. Kudos to their efforts!

Throughout the following sections, we’ll delve into several nuances of the EZKL tool as we explore an example. For those eager to dive deeper, a wealth of information can be found on their GitHub repository, along with well-crafted documentation.

This blog post exclusively centers around EZKL, which, in my opinion, stands out as the most developed and stable tool for ZKML at present. However, it’s important to acknowledge that several other projects and teams are actively working on ZKML tools. Some notable mentions include Giza, Modulus Labs (check their blog posts here), and Worldcoin.

Starting simple

In the upcoming sections, we’ll examine one of the simplest examples of a neural network (NN) — a single layer with a single neuron. The primary objective of this blog is to explore how EZKL transforms this NN into a ZKP representation, often referred to as a circuit. We’ll also delve into the process of performing proving and verifying — specifically, confirming that a specific input was utilized to produce a particular output using this model or NN.

It’s important to note that this article assumes some prior knowledge of ZKPs and Halo2. Feel free to refresh your understanding or start from scratch by referring to relevant material: Halo2 tutorial, Halo2 — Fibonacci Circuit demo video, blog post about technical details of Halo2 proving system, and ZK Whiteboard Sessions.

While our main focus will be on the conversion of the model, we’ll touch upon other steps of the ZKML workflow as they come into play. However, in-depth explanations will be provided only for steps that are specific to the machine learning example. It’s worth noting that certain steps are common across all ZKP programs/circuits.

Our journey begins with the transformation of the model, defined in popular high-level language frameworks like PyTorch and TensorFlow, into the circuit representation. This transformation follows specific standards and utilizes relevant libraries.

PyTorch, ONNX, tract

A notable feature you encounter when starting your journey with EZKL is that it accepts machine learning (ML) models in the ONNX (Open Neural Network Exchange) format. ONNX is a well-known and widely adopted format in the AI/ML community. It standardizes the representation of the structure and architecture of NNs and, interestingly, extends its support to various other ML models.

ONNX currently supports 100+ operators, and more are added over time through versioned releases (opsets). Still, it's worth noting that some NN architectures can be difficult to export to the ONNX format. Given its popularity and broad usage, the majority of ML frameworks facilitate exporting models to ONNX.

Here is a PyTorch code example showing how to export our model to ONNX:

import torch
from torch import nn

model = nn.Linear(1, 1)  # this is our simple model: one layer, one neuron
x = torch.randn(1, 1)    # dummy input used to trace the model during export

model.eval()
model.to('cpu')

torch.onnx.export(model,                 # model
                  x,                     # model input
                  "network.onnx",        # location to store model in ONNX
                  export_params=True,    # include the neural network parameters
                  opset_version=10,      # the ONNX opset version
                  do_constant_folding=True,  # execute constant folding for optimization
                  input_names=['input'],     # model's input name
                  output_names=['output'],   # model's output name
                  dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})  # variable-length axes
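
As a quick sanity check (optional, but a good habit), you can load the exported file back with the onnx package and validate it; this assumes the onnx package is installed (pip install onnx):

import onnx

onnx_model = onnx.load("network.onnx")
onnx.checker.check_model(onnx_model)  # raises an exception if the model is malformed
print(onnx.helper.printable_graph(onnx_model.graph))  # human-readable dump of the graph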

Models in the ONNX format can be visualized to some degree using various programs. The simplest approach is to utilize netron.app, which is available online for easy access. This provides a visual representation of the model:

Visualization of our NN

On the right side of the app interface, you have the option to inspect the values of the trainable parameters of the NN, including bias values. Understanding these values will prove beneficial as we later delve into examining the circuit representation.

Values of our NN

As indicated in the image above, the parameter/weight multiplied with the input is 0.30608, and the bias value added to the product is 0.05692. Given that we have a single layer with one neuron, this encapsulates the entirety of the neural network’s functionality.
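
In other words, the entire model computes y = w * x + b. Here is the same computation as a minimal plain-Python sketch, using the values from the visualization above (a randomly initialized model will have different values):

w = 0.30608  # weight shown in the visualization
b = 0.05692  # bias shown in the visualization

def forward(x):
    # the whole "neural network": one multiplication and one addition
    return w * x + b

print(forward(1.0))  # 0.30608 * 1.0 + 0.05692 = 0.363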

Once the model is in the ONNX format, the next step is to move over to EZKL and start the ZKP magic. A good first step with EZKL is to display a table of the model, showing how EZKL interprets it. You can achieve this using the following command (this blog post will predominantly use EZKL's CLI commands; for additional information, refer to this link):

ezkl table -M network.onnx

The output of the command is the following table:

EZKL table presentation

The table provides a wealth of information, so let's dissect it systematically. There are five rows, each representing a specific step in the computation of the NN:

  • Row 0: Represents the input to the neural network.
  • Row 1: Corresponds to a constant value, representing the trainable parameters.
  • Row 2: Illustrates the multiplication of the input by the parameter/weight. The operation is executed using einsum (Einstein summation; a short sketch follows this list). Additional information on einsum can be found here and here.
  • Row 3: Another constant value, representing a bias.
  • Row 4: Signifies the addition of the product (from Row 2) and the bias value (from Row 3).
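
To make the einsum step concrete, here is a minimal PyTorch sketch of the same multiplication expressed with torch.einsum (an illustration of the operation, not EZKL's internal code):

import torch

x = torch.tensor([[1.0]])      # input (row 0)
w = torch.tensor([[0.30608]])  # weight constant (row 1)

# 'ij,jk->ik' is the einsum spec for a matrix product; for our 1x1 case
# it reduces to a single multiplication
product = torch.einsum('ij,jk->ik', x, w)
print(product)  # tensor([[0.3061]])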

As we’ll explore later, we can also conceptualize the circuit as matrices, representing computation traces of what we aim to prove. For a deeper dive into this topic, you might find this article insightful.

It’s important to note that the opkind (operators) mentioned in the table above are not strictly ONNX operators but rather tract operators. EZKL opted to work with tract instead of ONNX because the tract library condenses the ONNX model/graph into compositions of a smaller set of operations. This approach streamlines the implementation of circuit constraints (more info on constraints here), requiring constraint implementations for only around 20 operators compared to the 100+ operators present in ONNX. For a more detailed explanation, you can refer to this link.

Additional columns present in the table include:

  • out_scale: Accounts for the need to scale between the float values used in models and the field elements used in ZKP systems like Halo2. Further information can be found here (a small fixed-point sketch follows this list).
  • inputs: Lists which earlier rows feed into this operation as inputs.
  • out_dims: Specifies the output dimensions.
  • required_lookups: Certain operations can be pre-computed, storing input/output values in a lookup table. This trade-off uses more memory to reduce CPU work. Further insights on this aspect can be explored here.
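
The scaling idea behind out_scale is plain fixed-point quantization: a float f is mapped to the integer round(f * 2^scale), which can then be embedded as a field element. A minimal illustrative sketch (not EZKL's actual quantization code; a scale of 7 is assumed here):

SCALE = 7  # out_scale of 7 means multiplying by 2**7 = 128

def quantize(f, scale=SCALE):
    # map a float to the fixed-point integer used inside the circuit
    return round(f * 2**scale)

def dequantize(q, scale=SCALE):
    return q / 2**scale

print(quantize(0.30608))  # 39, since 0.30608 * 128 ≈ 39.18
print(dequantize(39))     # 0.3046875, close to the original weight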

Circuit

With a model now prepared in ONNX format and a preliminary understanding of how EZKL interprets it, we are ready to dive into the ZKP-related tasks: generating the required components and files that set the stage for the subsequent proving and verifying processes.

Our initial step involves generating settings for the circuit. Before we can seamlessly convert the ONNX model to the ZKP representation (circuit), it’s imperative to create the circuit configuration. This can be accomplished using the following command:

ezkl gen-settings -M network.onnx

The command yields a JSON file containing settings, accessible here. This file encompasses crucial configuration details about the circuit, including the number of constraints (5 in this example, mirroring the number of rows in the table). It also includes information about the input data shape, required lookup tables, input and output scales, and visibility.

Visibility choices are particularly noteworthy, as they directly influence the circuit structure. You can designate visibility for input data, output data, and model weights/parameters with various available levels: private, public, encrypted, hashed, fixed, and KZG commit (check source code). Additional insights on visibility can be found here. We’ll adhere to the default setting for this example — private input, private model parameters, and public output. This settings file will be instrumental in subsequent steps of the workflow.
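
For orientation, here is an illustrative excerpt of the visibility-related part of settings.json for the defaults used in this example (the exact field names are an assumption based on the EZKL version at the time of writing and may differ in yours):

{
  "run_args": {
    "input_visibility": "private",
    "output_visibility": "public",
    "param_visibility": "private",
    ...
  },
  ...
}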

Subsequently, we need a structured reference string (SRS), a crucial component required by most zk-SNARK schemes. The SRS is created through a trusted setup procedure, and more information on this concept is available here. The required size of the string depends on the circuit (hence, the settings are passed to the command), but the SRS itself is universal, meaning the same SRS can be employed for different models.

There are two options for the SRS: generating your own locally or downloading one. The downloaded string is the outcome of a secure trusted setup, specifically the powers-of-tau ceremony hosted by the Ethereum Foundation Privacy & Scaling Explorations (PSE) group. This process instills confidence in its reliability and safety for use in production. While generating our own SRS is conceivable, it would require proving that the setup was executed correctly and securely, making it a more intricate process.

ezkl get-srs -S settings.json

Following this, we are ready to compile the ONNX model into a format suitable for later use in creating the circuit. Notably, we use the trace Rust log level, allowing us to extract more information and gain insight into what happens behind the scenes:

RUST_LOG=trace ezkl compile-circuit -M network.onnx -S settings.json --compiled-circuit network.ezkl

Having compiled the model into network.ezkl, you might be curious about the details of the process. Let’s delve into the logs to unravel what transpired!

EZKL circuit table presentation

In the image provided, we gain a rough understanding of how values from specific rows are mapped to rows in the new format. For instance, in row 2, we observe the sequence 0/0 > 1/0 > -> > 4/0: this signifies that input data from row 0 and neuron weight from row 1 are utilized in row 2. The output of einsum in this row will subsequently be employed in row 4. Notably, the actual values in the rows (found in the last column) are familiar, mirroring the visualization of the ONNX model. Additionally, factors previously mentioned in the table and settings (such as scales) are also applied during the compilation process.

However, the real treasure trove of information awaits us in the next step of the workflow — the setup phase. It’s crucial to note that this phase should not be confused with the trusted setup; the trusted setup, which involved generating the structured reference string (SRS), has already been completed, albeit not by us. The setup phase, on the other hand, promises to reveal even more insights as we proceed to create the circuit:

RUST_LOG=trace ezkl setup -M network.ezkl --srs-path=kzg.srs --vk-path=vk.key --pk-path=pk.key

Following the setup phase, we now possess the circuit, which will play a pivotal role in both proof creation and verification. The logs provide valuable insights, and one of the initial observations can be gleaned from the image below:

Column types in Halo2 circuits

As highlighted earlier, circuits can be conceptualized as matrices, each comprising different types of columns. In the context of Halo2 (further details available here), circuits are structured with four distinct column types:

  • advice: Encompasses private input and intermediate data.
  • instance: Comprises public data.
  • fixed: Houses constants and lookup tables used in the circuit.
  • selector: Involves booleans that selectively activate constraints.

The composition of our circuit reveals three advice columns and one instance column. The rationale behind having three advice columns is rooted in the fact that, at most, three cells within a single row (such as row 2 with einsum) are filled with private values — two inputs (input data and neuron weight) and one output (product). Conversely, we only require one instance column, as our aim is to make the model output public (a single output in row 4).
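
To make this concrete, here is a hypothetical sketch (purely illustrative, not EZKL's actual cell assignment) of how these rows might fill the matrix:

row 2 (einsum): advice_0 = x (input) | advice_1 = w (weight) | advice_2 = x*w (product)
row 4 (add):    advice_0 = x*w       | advice_1 = b (bias)   | advice_2 = y, with y also exposed in instance_0 (public output)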

The quantity of fixed and selector columns is contingent on the types of operators employed within the model. For a deeper understanding, I recommend exploring the source code.

Further insights can be gleaned from the logs of this command, and one noteworthy aspect includes the setup of a lookup table:

Setting up lookup table

In the process of laying out the model into a circuit (remember, ONNX is a computational graph model), the logs show the graph nodes and how exactly the input and output data of the different rows or steps are mapped together:

Logs from laying out the ML model into a circuit
More logs from laying out the ML model into a circuit

As depicted in the images above, a clearer representation emerges, illustrating how input and output sizes/dimensions, scales, row offsets, and other parameters seamlessly align and interconnect.

For a more visual understanding of the circuit, the Halo2 framework provides two visualization options. The first approach involves rendering the computation matrix, achievable through the following commands:

cargo install --locked --path . --features render # don't forget to install with the render feature
ezkl render-circuit -M network.onnx -O render.png
Rendered circuit

In the image above, columns of distinct colors offer a visual representation, where each color signifies a specific type: pink for advice columns, white for instance columns, and purple/blue for fixed columns. The green areas, on the other hand, denote regions (further information on regions is available here, and EZKL’s perspective can be explored here) within the circuit.

Additionally, the dark horizontal line indicates that not all rows are utilized in this circuit (only five in this case), yet the number of rows in the circuit always adheres to a power of 2.
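
The padding itself is simple arithmetic: pick the smallest k such that 2^k covers the rows you need. A minimal sketch (in practice the exponent is usually larger, since lookup tables must also fit into the same rows):

import math

used_rows = 5
k = math.ceil(math.log2(used_rows))  # smallest k with 2**k >= used_rows
print(k, 2**k)                       # 3 8, i.e. the circuit is padded to 8 rows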

The second method for presenting the circuit involves a dot graph. However, due to the specific layout utilized by EZKL — a single region layout for everything (more details here), the dot graph rendering cannot be effectively executed. The concept of a single region is also evident from the last image.

The generated circuit is stored within the two output files created by the setup command: vk.key and pk.key. These are the verification key and the proving key, respectively. The verification key is employed for verifying proofs, while the proving key is utilized in the creation of proofs. Within these keys, the circuit and its structure, including columns, lookup arguments, and more, are encoded. It's essential to note that visibility levels (input, output, weights) play a crucial role in key generation, since they determine which data is public and which is private. For instance, if public weights are used, these weights are fixed at setup time and can be extracted from the proving key (further details here). The keys also encompass encoded constraint information, covering fixed and selector columns.

The generated keys are specific to our circuit, which means if we change something in the circuit, we need to repeat the setup process and create a new pair of keys.

Proving and verifying

We have now reached the point where we can create ZKML proofs. All the steps described thus far need to be executed only once, and once we possess the proving and verification keys, they can be used as many times as needed (bearing in mind that the keys are specific to a particular ML architecture/model).

On the proving side, the initial step involves generating a witness (further information available here), encompassing the inputs, intermediate steps, and outputs that will be proven. This involves running the actual input data through a quantized model and performing any necessary transformations:

RUST_LOG=trace ezkl gen-witness -M network.ezkl -D input.json
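
The input file itself is plain JSON. A minimal hypothetical example for our one-input model (assuming the input_data key used in EZKL's examples) could look like this:

{
  "input_data": [[0.5]]
}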

The witness for this straightforward example can be found here. The accompanying image illustrates the process of passing the input through the model:

Passing the input through the model

With the witness in hand, we can now proceed to generate the ZK proof:

ezkl prove -M network.ezkl --witness witness.json --pk-path=pk.key --proof-path=model.proof --srs-path=kzg.srs

The resulting proof is stored in the model.proof file and can be shared with the verifying party. Verification of the proof can be accomplished using the following command:

ezkl verify --proof-path=model.proof --settings-path=settings.json --vk-path=vk.key --srs-path=kzg.srs
Proof successfully verified :)

Voilà! The proof has been successfully verified, establishing that a specific output (public to the verifier) was indeed derived from certain private input data running through the designated model. While we conducted the proof verification directly in the CLI, it’s worth noting that EZKL also supports on-chain verification on EVM, adding an intriguing layer to the process. For more details on generating smart contracts capable of performing proof verification, you can explore here. But that wraps up this article; we’ll save some excitement for the next time!

Here’s also a visual overview of all the steps:

Visual overview of all steps of the ZKML workflow

If you find that some steps were not explained in sufficient detail or if you’re eager to explore more about EZKL, be sure to visit their thoroughly detailed documentation page.

Final thoughts

This article turned out a little longer than expected, but I believe having a comprehensive first tutorial will prove helpful for getting started with ZKML and EZKL.

ZKML is emerging as one of the hottest and fastest-evolving fields in the ZK space. Through iterative advancements in frameworks and libraries, we are progressing towards generating proofs for a broader and more diverse set of models in a more efficient manner (both in time and space). In a world increasingly influenced by automated and AI-driven decision-making, the verifiability of models (and putting them on-chain) is becoming increasingly crucial. It’s inspiring to witness numerous dedicated researchers and coders working on providing the necessary tools and solutions.

I hope this article has illuminated some of the hidden details behind the EZKL library as you embark on your ZKML journey. If you come across any mistakes, have questions, or would like to contribute to this article, please reach out to me on Twitter/X or Farcaster.

Exciting times lie ahead!

Acknowledgments

Thanks to the EZKL team for developing everything and providing feedback on the blog post, and my coworkers Muhamed Turkanovic and Saso Karakatic for reading and reviewing all of this.


Vid Kersic

R&D Engineer at Blockchain Lab:UM. Ethereum Protocol Fellow. Twitter: @vidkersic