Living on the edge: Zero Knowledge Human-assisted Machine Learning with Git and Bitcoin Part 2

Lloyd E.
Published in Ring-0
4 min read · Dec 29, 2022

In Part 1, I described the problem that arose while building a “brain” computer for my autonomous vehicle conversion, and gave some of the reasons behind my choice of solution. By offloading some of the brain's computations in a decentralized manner, we can meet, or even exceed, the real-time compute needs of our autonomous system.

Parallel Distributed Computing

The ability to compute multiple things in parallel is often seen as a key characteristic of intelligence, and it is something that many artificial intelligence systems strive to replicate. Tasks such as recognizing lane lines, reading traffic signs, and monitoring driver awareness all need to run in parallel, with some tasks being more important than others. Here, we start with a simple task: recognizing traffic signs and lights. We train a model on sample data of signs, with a label for each sign (e.g. stop sign, speed limit). In our initial setup, the “brain” computer is fed the training data (the model is often pre-trained, but not always), and the model runs against this local data. In our new distributed system, the training and inference computations are done elsewhere, and the result is relayed to the main computer or another edge device.
My approach to this distributed computing challenge is to use Zero-Knowledge Proofs (ZKPs for short) that run on the Bitcoin protocol.
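To make the idea of prioritized parallel tasks concrete, here is a minimal Python sketch. It is not the system from this series: the three task functions, their return values, and the priority numbers are all hypothetical stand-ins for calls that would go out to remote workers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical perception tasks; each stands in for a call to a remote worker.
def detect_lane_lines(frame):
    return {"task": "lane_lines", "offset_m": 0.12}

def detect_traffic_signs(frame):
    return {"task": "traffic_signs", "label": "stop sign"}

def check_driver_awareness(frame):
    return {"task": "driver_awareness", "attentive": True}

# Lower number = more important. The ordering is illustrative only.
TASKS = [
    (1, detect_traffic_signs),
    (0, detect_lane_lines),
    (2, check_driver_awareness),
]

def run_perception(frame, max_workers=3):
    """Submit tasks in priority order and collect results keyed by task name."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        ordered = sorted(TASKS, key=lambda item: item[0])
        futures = [pool.submit(fn, frame) for _, fn in ordered]
        for future in futures:
            out = future.result()
            results[out["task"]] = out
    return results

results = run_perception(frame=b"raw-camera-bytes")
print(results["traffic_signs"]["label"])
```

With enough workers, all three tasks run concurrently; the priority only decides which task is submitted first when workers are scarce.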

Bitcoin as a Computer

The primary issue that arises with distributed computing is the need to maintain data privacy. If privacy is not addressed at the core, a malicious actor could intercept or spoof the output; in our case the resulting harm would be kinetic, because the output feeds a self-driving vehicle. Bitcoin has a simple yet robust privacy model. Bitcoin, or peer-to-peer electronic cash, achieves its privacy by creating a true peer environment.

A traditional computer, in most cases, uses a “trusted space” model such as the protection ring architecture. Protection rings are used in computer systems to separate different types of code and data, and to prevent code at lower privilege levels from accessing or modifying code or data at higher privilege levels. “Ring 0” is typically reserved for the operating system kernel, the core of the operating system that manages all of the computer's hardware and software resources. Code running at ring 0 has the highest level of privilege and can access all of the system’s hardware and memory. This trusted space is well guarded, but if it is compromised the damage can be catastrophic.

In contrast, a peer-to-peer model is “trustless” and relies on peer identity. Peer identity is verified using cryptographic keys via a public blockchain, which is a decentralized and distributed database that records transactions and information. This makes it difficult for users to cheat or manipulate the system, and helps to ensure that transactions are fair and transparent. For building a machine learning training system, the “trustless” computation model is exactly what we want. Because the mining process involves the parallel computation of multiple transactions and blocks, Bitcoin can be thought of as a large parallel computer system. This decentralized and distributed model allows the Bitcoin network to operate without a central authority or intermediary, and helps to ensure the security and integrity of the system.

ZKPs on Bitcoin

Zero-Knowledge Proofs (ZKPs) are a type of cryptographic technique that allow one party (the prover) to prove to another party (the verifier) that a statement is true, without revealing any additional information. ZKPs are often used to protect privacy, as they allow the prover to reveal only the minimum amount of information necessary to prove the truth of a statement.
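As a small illustration of the prover and verifier roles, here is a sketch of the classic Schnorr identification protocol in Python. This is an interactive, honest-verifier zero-knowledge proof of knowledge of a secret exponent, not the zk-SNARK construction discussed later, and the group parameters are toy values chosen by me that are far too small for real use.

```python
import secrets

# Toy parameters (assumption, insecure): G generates a subgroup of prime
# order Q in the integers modulo P. Here 2**11 % 23 == 1.
P, Q, G = 23, 11, 2

def keygen():
    """Prover's secret x and public key y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def commit():
    """Prover picks a random nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def challenge():
    """Verifier replies with a random challenge c."""
    return secrets.randbelow(Q)

def respond(x, r, c):
    """Prover answers with s = r + c*x mod Q; r masks the secret x."""
    return (r + c * x) % Q

def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c mod P without ever seeing x."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

x, y = keygen()
r, t = commit()
c = challenge()
s = respond(x, r, c)
print(verify(y, t, c, s))  # True for an honest prover
```

The verifier learns that the prover knows x such that y = G^x, but the transcript (t, c, s) could have been simulated without x, which is the sense in which nothing else is revealed.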

In the context of on-chain machine learning (ML), ZKPs can be a useful tool for retaining privacy. ML algorithms often require sensitive data, such as personal information or confidential business data, in order to learn and make predictions. By using ZKPs, the data used for ML can be kept hidden off-chain, while still allowing the ML inference to be proven correct.

This makes ZKPs a natural fit for retaining privacy when using on-chain ML, as they allow the ML inference to be proven without revealing the sensitive data used to train the model. This can be especially useful in scenarios where the privacy of the data is a concern, or where the data is subject to strict regulations or compliance requirements.
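It can help to write down the statement such a proof would attest to. The Python sketch below is only that statement, evaluated in the clear: a real ZKP system would prove that `relation` holds without revealing the witness. The tiny linear classifier, the salted SHA-256 commitment, and all names are my own assumptions for illustration, and here the private part is the model weights rather than the input.

```python
import hashlib
import json

def commit_to_weights(weights, salt):
    """Salted hash commitment to the private weights (published on-chain)."""
    payload = json.dumps(weights).encode() + salt
    return hashlib.sha256(payload).hexdigest()

def classify(weights, features):
    """Toy linear classifier standing in for the real model:
    returns the index of the row with the highest score."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return scores.index(max(scores))

def relation(public, witness):
    """Statement to prove: 'I know weights matching the public commitment
    that map the public features to the public label.'"""
    weights, salt = witness
    return (
        commit_to_weights(weights, salt) == public["commitment"]
        and classify(weights, public["features"]) == public["label"]
    )

# Private witness (stays off-chain).
weights = [[1, 0], [0, 1]]
salt = b"hypothetical-salt"

# Public values (visible to every verifier).
public = {
    "commitment": commit_to_weights(weights, salt),
    "features": [3, 5],
    "label": 1,
}

print(relation(public, (weights, salt)))
```

A prover holding different weights cannot satisfy the relation for the same commitment, which is what ties the proven inference to one specific hidden model.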

ZKP + ML + Bitcoin

The UTXO (Unspent Transaction Output) model used by Bitcoin is a good fit for the data-structure needs of a real-time system, and gives us the flexibility to implement relational, graph, or any other data scheme we choose. We want the lowest possible latency on compute cycles, and Bitcoin Script is a powerful, lightweight, low-level language well suited to our system. Combining sCrypt, a typed language that compiles to Bitcoin Script, with zk-SNARKs gives us a strong, well-typed stack for application development. In the example code described here, we see that we can make private either the input data or the models themselves.
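For readers new to the UTXO model, here is a minimal in-memory sketch in Python. It only captures the core rule that each output can be spent once and that a transaction consumes old outputs to create new ones. The `payload` field, the class names, and the sample transaction IDs are hypothetical; real Bitcoin outputs carry a locking script, and real nodes validate signatures and scripts as well.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OutPoint:
    txid: str    # transaction that created the output
    index: int   # position of the output within that transaction

@dataclass(frozen=True)
class Output:
    value: int    # amount in satoshis
    payload: str  # hypothetical attached data, e.g. a model hash

class UtxoSet:
    def __init__(self):
        self._unspent = {}

    def add(self, outpoint, output):
        self._unspent[outpoint] = output

    def spend(self, inputs, txid, new_outputs):
        """Consume `inputs` and create `new_outputs` under transaction `txid`."""
        if len(set(inputs)) != len(inputs):
            raise ValueError("duplicate input")
        missing = [i for i in inputs if i not in self._unspent]
        if missing:
            raise ValueError(f"unknown or already spent: {missing}")
        total_in = sum(self._unspent[i].value for i in inputs)
        total_out = sum(o.value for o in new_outputs)
        if total_out > total_in:
            raise ValueError("outputs exceed inputs")
        for i in inputs:
            del self._unspent[i]
        for n, out in enumerate(new_outputs):
            self._unspent[OutPoint(txid, n)] = out

    def balance(self):
        return sum(o.value for o in self._unspent.values())

utxos = UtxoSet()
genesis = OutPoint("tx0", 0)
utxos.add(genesis, Output(1000, "model-v1"))
utxos.spend([genesis], "tx1", [Output(900, "model-v2")])  # 100 left as fee
print(utxos.balance())
```

Because state lives in independent outputs rather than in shared account balances, transactions that touch different outputs do not conflict with each other, which is one reason the model suits parallel workloads.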

In Part 3, we will describe and build our new ZKP decentralized computer system, use Git to deploy a sample ML model to edge machines, and show how we can incentivize others with electronic cash to help label and train our models.

References

Bitcoin: A Peer-to-Peer Electronic Cash System

Running Deep Neural Networks on Bitcoin — https://xiaohuiliu.medium.com/running-deep-neural-networks-on-bitcoin-b8f48eddce8e

Create Your First Zero-Knowledge Proof Program on Bitcoin — https://xiaohuiliu.medium.com/create-your-first-zero-knowledge-proof-program-on-bitcoin-ec159cc501f4

Inline Script inside sCrypt — https://xiaohuiliu.medium.com/inline-script-inside-scrypt-27d5aa279fd3
