Chapter 1: How to Put Your AI On-Chain

Modulus Labs · Published in Coinmonks · Jul 29, 2022

Despite the enthusiasm of countless people everywhere, the task of running a sizable AI model on-chain continues to elicit sighs of disappointment. After all, the sheer need for compute has historically rendered the two technologies so fundamentally incompatible — like Avatar in 3D and not-vomiting — that even contemplating the idea is enough to make any Solidity developer sweat buckets of Gwei.

State of on-chain AI circa 2022

Yet things are changing quickly.

For one, pioneering rollup technologies are poised to dramatically increase transaction speed and compute capacity on Ethereum, all while lowering gas prices and preserving privacy. And despite every major crypto market folding in half, the greater crypto dev community has rallied behind building through the bear market. As important technical foundations continue to be laid down at a regular clip, and web3 humbly expands its offerings, one can be forgiven for asking: perhaps now?

We know, we know — sure, “wHaT If wE pUt thE AI On tHe BlOcKchaIN?” sounds like something a seven-year-old would dream up. But then again, in our experience, seven-year-olds can be surprisingly wise.

That’s why we’ll be laying out our thoughts on the future of on-chain AI. We’ll start with some inflections we see in the Ethereum L2 rollup space (the evolving architecture of the major players here translates to hugely important opportunities for “true” on-chain AI). Then, with a clearer picture of how compute may change in the crypto landscape, we’ll advance a path to a future of powerful, verifiable, and transparent AI operating on Ethereum. As we explore this space, we’ll also be flagging our assumptions along the way, before revisiting them (as a numbered list) at the end of our discussion.

Alrighty, let out your inner seven-year-old and let’s begin!

Step 1: Roll-up Rollup

Ethereum has already scaled. Just ask the astonishingly talented folks at Starkware —

“StarkWare brings scalability and privacy to blockchains with zero-knowledge STARK proofs. It is a permissionless, decentralized ZK-Rollup operating as an L2 network over Ethereum. And it achieves scale while preserving the security of L1 Ethereum by producing STARK proofs off-chain and verifying those proofs on-chain.” (source)

And yes, the entire graduating class of Technion has already pulled off this ambitious task. Starkware’s L2 engines (StarkEx today, with StarkNet as the general-purpose network) power dApps like dYdX, “the fastest and most powerful decentralized exchange ever” (source). Most importantly, these Stark L2s achieve their impressive scale (settling over 100M txs and over $380B worth of trades on Ethereum Mainnet) without compromising Ethereum’s composability and security.

The validity rollup used by Starkware is an elegant tool for getting more chain throughput without reducing security. The key is that, rather than having the main “layer 1” chain (Ethereum) re-execute every transaction, a proof of the computation (i.e., the STARK or SNARK) is what gets posted to the L1. The scaling comes from the fact that such proofs are incredibly quick to verify, nearly regardless of the circuit size (up to constants and possible log(n) factors).
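
To make “quick to verify” concrete, here is the rough asymptotic shape for a FRI-based STARK over an execution trace of length n. Exact exponents and constants vary by proof system and implementation, so treat these as ballpark shapes, not exact costs:

```latex
% Approximate costs for a FRI-based STARK, trace length n
% (exponents and constants vary across proof systems):
\begin{aligned}
\text{Prover time}   &\;\approx\; O(n \log n)   \\
\text{Proof size}    &\;\approx\; O(\log^2 n)   \\
\text{Verifier time} &\;\approx\; O(\log^2 n)
\end{aligned}
```

That asymmetry is the whole game: the prover does (slightly more than) the original work once, while every verifier pays only polylogarithmic cost.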

The “Layer 2,” as both StarkNet and zkSync use the term, is the network of compute nodes that processes transactions and generates these proofs. Over time, the proofs are batched and sent to the L1 (Ethereum mainnet), where each proof is verified and the corresponding state change is accepted.
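
As a mental model of that batching flow, here is a minimal, runnable Python sketch. Everything in it is a toy we made up for illustration: prove_batch, l1_verify, and the hash-based “proof” are placeholders, not any real rollup’s API, and the hash check is trivially forgeable where a real validity proof is not.

```python
import hashlib
from dataclasses import dataclass

def h(*parts: str) -> str:
    """Toy hash helper standing in for real cryptography."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

@dataclass
class Batch:
    old_root: str   # state root before the batch
    new_root: str   # claimed state root after the batch
    proof: str      # stand-in for a succinct STARK/SNARK validity proof

# L2 side: the heavy lifting (execution + proving) happens off-chain.
def prove_batch(txs: list[str], old_root: str) -> Batch:
    new_root = old_root
    for tx in txs:                              # "execute" each transaction
        new_root = h(new_root, tx)
    proof = h("proof", old_root, new_root)      # toy, forgeable "proof"
    return Batch(old_root, new_root, proof)

# L1 side: verification is cheap and never re-executes the transactions.
def l1_verify(chain_root: str, batch: Batch) -> str:
    assert batch.old_root == chain_root, "batch built on stale state"
    assert batch.proof == h("proof", batch.old_root, batch.new_root)
    return batch.new_root                       # state transition accepted

root = "genesis"
root = l1_verify(root, prove_batch(["alice->bob:5", "bob->carol:2"], root))
print(root)
```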

Starkware’s rollup solutions, however, are not without their drawbacks. Most relevant to us: Cairo (their proof-generating programming environment) only runs on CPU (read: STARKs are only generated by CPU code), which, for anyone from an AI/ML background, kills off any complex modeling dreams — see our back-of-the-envelope infeasibility argument at the very end for more!

The web3 community awaiting a post zk-rollup Eth

Step 2: CPU v. GPU

In theory, any computation can be efficiently verified (via SNARKs/STARKs), but so far nobody has done the work of porting existing AI operations into these auto-proof-generating languages. That may change soon — Giza, for example, is working on porting pretrained ONNX models into Cairo for verifiable inference. Such models, however, will be limited to CPU-only computation (since Cairo does not support CUDA-based STARK generation), even though most modern AI models must run on GPUs to be practical.
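
For a sense of where a pipeline like Giza’s begins, here is a minimal sketch that exports a toy PyTorch model to ONNX. Only the export step is shown; the ONNX-to-Cairo transpilation is Giza’s tooling and not something we reproduce here. The model, shapes, and file name are our own placeholders.

```python
import torch
import torch.nn as nn

# A tiny feedforward net standing in for a "pretrained" model.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)
model.eval()

# Export to ONNX, the interchange format a Cairo transpiler would
# consume. The Cairo (proof-generating) step itself is not shown.
dummy_input = torch.randn(1, 16)
torch.onnx.export(
    model,
    dummy_input,
    "tiny_model.onnx",
    input_names=["features"],
    output_names=["logits"],
)
print("wrote tiny_model.onnx")
```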

A quick crash course: CPUs are simply not designed to run large AI models. Specifically, CPUs come with a small number (usually 4 or 8) of powerful physical cores (often split further into 2 or 4 virtual cores each), and each core handles a large number of processes by swapping them on and off extremely quickly, giving the illusion of many processes running simultaneously. Most deep learning (and SNARK-generating) workloads, however, barely benefit from the speed and generality of such cores: they are dominated by enormous amounts of raw arithmetic (FLOPS) over large, regular arrays, not by fast sequential execution.

GPUs, on the other hand, are designed specifically to speed up parallel processing. They come packed with thousands of cores — such cores, although not nearly as powerful as their CPU counterparts, allow for an entirely different programming paradigm.

Of course, not every algorithm can be easily parallelized. But it’s well known that deep learning models are composed of vastly parallelizable operations (feedforward layers, convolutions, transformer blocks, etc.), while the cryptographic primitives that dominate prover costs in SNARK generation (chiefly large batches of group exponentiations) also show promising GPU speedups.
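
To feel the throughput gap in one runnable snippet, compare a sequential, pure-Python matrix multiply with NumPy’s vectorized version, which dispatches to a multithreaded BLAS. The same trick scales onto thousands of GPU cores (via CuPy or PyTorch, not shown); exact timings will vary by machine.

```python
import time
import numpy as np

n = 128
A = np.random.rand(n, n)
B = np.random.rand(n, n)

def naive_matmul(A, B):
    """One multiply-accumulate at a time: the sequential view of matmul."""
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += A[i, k] * B[k, j]
            C[i, j] = s
    return C

t0 = time.perf_counter(); C_slow = naive_matmul(A, B); t1 = time.perf_counter()
t2 = time.perf_counter(); C_fast = A @ B;              t3 = time.perf_counter()

assert np.allclose(C_slow, C_fast)   # identical math, wildly different cost
print(f"sequential loop: {t1 - t0:.3f}s")
print(f"vectorized BLAS: {t3 - t2:.6f}s")
```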

Step 3: Putting it Together

Which brings us to our central thesis: if a rollup service were to take advantage of a hardware-accelerated prover (built on GPUs or FPGAs, for example), that breakthrough in mathematics plus accelerator-enabled proof generation would be the key to achieving true AI on chain. If we can generate a proof of our (hardware-accelerated) AI inference call, then we can demonstrate that we ran a specific model over a specific set of data. Others can then verify our model without having to trust us, and without re-running the entire computation; they only need to check the proof. In other words, we can have powerful AI on chain without giving up the decentralized, trustless nature of crypto.
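
Here is that thesis compressed into a runnable toy. prove_inference and verify_inference are hypothetical stand-ins for a hardware-accelerated prover and an on-chain verifier; the hash-based “proof” is forgeable plumbing to make the sketch run, whereas a real ZK proof would be succinct and unforgeable.

```python
import hashlib

def commit(obj) -> str:
    """Toy commitment: hash of repr. A real system would use
    cryptographic commitments to model weights and inputs."""
    return hashlib.sha256(repr(obj).encode()).hexdigest()

# Prover side (off-chain, ideally GPU-accelerated).
def prove_inference(weights, x):
    y = sum(w * xi for w, xi in zip(weights, x))   # the "model": a dot product
    # Hypothetical proof: binds (model, input, output) together. A real
    # ZK prover would emit a succinct proof of the whole computation.
    proof = commit((commit(weights), commit(x), y))
    return y, proof

# Verifier side (on-chain): checks the claim without the weights and
# without re-running the model. (Here, only a toy hash equality.)
def verify_inference(model_commitment, x, y, proof) -> bool:
    return proof == commit((model_commitment, commit(x), y))

weights = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 3.0]
y, proof = prove_inference(weights, x)
print(verify_inference(commit(weights), x, y, proof))   # -> True
```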

Step 4: Implications of “Real” On-Chain AI

The future — shaka to that!

The ability to put AI models on chain, without compromising the decentralized, trustless nature of Ethereum, would be a huge leap forward for Web3. It brings dApps closer to feature parity with their centralized counterparts, opening the door to recommendation and matching algorithms in Web3. NFT marketplaces could cater to individual wallet owners based on the NFTs they already hold. It could serve as an automated, trusted oracle for verifying off-chain data. And most exciting to us, it can enable entirely new use cases that would be impossible without on-chain AI support.

There are also tokenomics possibilities, such as improving the quality of a model on-chain and being paid in tokens for the improvements. And there are crazier ideas still, like running a DAO on an AI, or distributing airdropped tokens based on an AI’s evaluation of participation. It’s hard to say how well any of this will work before trying it, but we love trying crazy ideas, and we’re excited to see what happens when people have the tools to add more eccentricity to the on-chain world.

Of course, all these benefits rely on verifiable, GPU-accelerated AI, which is why we’ve tried to make a distinction in this post between CPU-based and GPU-based AI. The more compute you can dump into your AI, the better everything listed above will work, and GPUs simply have a lot more compute to offer.

Re-Visiting Our Assumptions

  1. Rollups dramatically increase tx speed and chain throughput while decreasing gas prices.
  2. Rollups provide privacy while preserving Ethereum’s decentralization and security.
  3. L2 compute is a network of physical nodes generating proofs to put onto L1.
  4. Cairo (STARKs) only runs on CPU.
  5. There will be a breakthrough in accelerator tech + mathematics enabling AI operations to be encoded in a proof-generating language.
  6. Any computation can be efficiently verified via SNARKs/STARKs.
  7. GPUs significantly speed up the generation of SNARKs/STARKs.
  8. There are real use cases for on-chain AI (and we have only sketched a few above).

