The Convergence of AI and Web 3: Opportunities and Challenges

Mohamed Fouda · Published in Alliance · 13 min read · Apr 20, 2023

By Mohamed Fouda and Qiao Wang

Since the launch of ChatGPT, and of GPT-4 shortly after, there has been no shortage of content on how AI can revolutionize everything, including Web 3. Developers across industries have reported productivity boosts ranging from 50% to 500% by using ChatGPT as a co-pilot to automate tasks such as generating boilerplate code, writing unit tests, creating documentation, debugging, and detecting vulnerabilities. While this article explores the new and interesting Web 3 use cases AI can enable, its main focus is the mutually beneficial relationship between Web 3 and AI. Few technologies have the potential to significantly impact the trajectory of AI, and Web 3 is one of them.

There are several interesting startup ideas at the intersection of AI and Web 3. At Alliance, we look forward to supporting founders building products in this sector. Builders in this space are encouraged to reach out to me or Qiao Wang for feedback and discussion.

How can Web 3 benefit AI?

Despite their great potential, current AI models face several challenges, including data privacy, verifying the fair execution of proprietary models, and the creation and propagation of believable fake content. Some existing Web 3 technologies are uniquely positioned to address these challenges.

Proprietary dataset creation for ML training

One area where Web 3 can assist AI is in the collaborative creation of proprietary datasets for machine learning (ML) training, e.g., through proof-of-physical-work (PoPW)-style incentive networks for dataset creation. Massive datasets are essential for accurate ML models, but their creation can be a bottleneck, particularly in use cases that require private data, such as medical diagnosis. Access to medical records is necessary to train these models, but patients may be hesitant to share their records due to privacy concerns. To address this issue, patients can verifiably anonymize their medical records, protecting their privacy while still enabling the records' use in ML training.

However, the authenticity of the anonymized medical records is a concern, as fake data can significantly impact model performance. To address this dilemma, zero-knowledge proofs (ZKPs) can be used to verify the authenticity of anonymized medical records. Patients can generate ZKPs to demonstrate that anonymized records are indeed copies of the original records, even after removing personally identifiable information (PII). This way, patients can contribute the anonymized records along with the ZKPs to interested parties, and even get rewarded for their contribution, without sacrificing their privacy.
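To make the idea concrete, the sketch below shows the relation such a proof would establish, using a hash commitment as a transparent stand-in for a real ZKP (the stand-in reveals the original record to the checker; an actual ZK circuit would prove the same relation without revealing it). The PII field names and record shape are illustrative assumptions.

```python
import hashlib
import json

PII_FIELDS = {"name", "address", "ssn", "date_of_birth"}  # assumed PII schema

def commit(record: dict) -> str:
    """Hash commitment to a record (canonical JSON)."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def anonymize(record: dict) -> dict:
    """Strip PII fields, keeping the clinically relevant ones."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}

def check_statement(original: dict, anonymized: dict, commitment: str) -> bool:
    """The relation a real ZK circuit would prove WITHOUT revealing `original`:
    (1) `commitment` opens to `original`, and
    (2) `anonymized` is exactly `original` minus the PII fields."""
    return commit(original) == commitment and anonymize(original) == anonymized

record = {"name": "Alice", "ssn": "000-00-0000", "diagnosis": "M54.5", "age_band": "40-49"}
c = commit(record)
anon = anonymize(record)
assert check_statement(record, anon, c)            # honest contribution passes
tampered = dict(anon, diagnosis="fake")
assert not check_statement(record, tampered, c)    # fabricated data fails
```

The commitment would be registered by the data buyer (or a chain), so contributors can be rewarded only for records whose proofs verify.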

Running inference on private data

A major weakness of current LLMs is their handling of private data. For instance, OpenAI collects users' private data when they interact with ChatGPT and uses it to further train the model, which can leak sensitive information, as happened with Samsung. Zero-knowledge (ZK) technology can help address some of the issues that arise when ML models perform inference on private data. Here, we consider two scenarios: open-source models and proprietary models.

For open-source models, the user can download the model and run it locally on their private data. An example is Worldcoin's plan to upgrade World ID. In this use case, Worldcoin needs to process a user's private biometric data, i.e., an iris scan, to create a unique identifier for each user called the IrisCode. Users can keep their biometrics private on their devices, download the ML model for IrisCode generation, run the inference locally, and create a ZKP that their IrisCode was generated correctly. The proof guarantees the authenticity of the inference while maintaining data privacy. Efficient ZK proving mechanisms for ML models, such as those developed by Modulus Labs, are essential for this use case.

The other scenario occurs when the ML model used for inference is proprietary. The task gets harder because local inference is not an option. However, there are two ways ZKPs can help. The first approach is anonymizing the user data with ZKPs, as discussed in the dataset-creation case, before sending the anonymized data to the ML model. The other approach is running a local preprocessing step on the private data and sending only the preprocessing output to the ML model. The preprocessing step hides the user's private data such that it cannot be reconstructed. The user generates a ZKP showing correct execution of the preprocessing step, and the rest of the proprietary model's execution happens remotely on the model owner's servers. Example use cases include AI physicians that analyze a patient's medical records for potential diagnoses, and financial risk-assessment algorithms that evaluate a client's private financial information.
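The split between a local, information-hiding preprocessing step and remote proprietary inference can be sketched as follows. The random-projection preprocessing, the record shape, and the stand-in "proof" are all illustrative assumptions; in a real deployment the client would produce an actual ZKP that the projection was computed correctly.

```python
import hashlib
import random

random.seed(0)

# -- Client side: irreversible preprocessing of private data ---------------
# Projecting 16 private features down to 4 numbers discards information, so
# the raw record cannot be reconstructed from what leaves the device.
PROJECTION = [[random.gauss(0, 1) for _ in range(16)] for _ in range(4)]

def preprocess(x):
    return [sum(w * v for w, v in zip(row, x)) for row in PROJECTION]

def prove_preprocessing(x, y):
    # Stand-in for a ZKP: in practice the client proves "y = PROJECTION @ x"
    # without revealing x; here we only commit to the pair for illustration.
    blob = ",".join(f"{v:.6f}" for v in x + y).encode()
    return hashlib.sha256(blob).hexdigest()

# -- Server side: the proprietary model sees only the 4-number embedding ---
SECRET_WEIGHTS = [random.gauss(0, 1) for _ in range(4)]  # never shared

def remote_inference(y):
    return sum(w * v for w, v in zip(SECRET_WEIGHTS, y))

record = [random.gauss(0, 1) for _ in range(16)]    # stays on the device
embedding = preprocess(record)
proof = prove_preprocessing(record, embedding)      # sent alongside embedding
score = remote_inference(embedding)
```

The model owner keeps its weights private, the user keeps their raw data private, and the proof ties the two halves of the computation together.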

Content authenticity and battling deepfakes

ChatGPT may have stolen the spotlight from generative AI models that focus on pictures, audio, and video. However, these models are already capable of generating realistic deepfakes. The recent AI-generated Drake song is a good example of what they can achieve. Because humans are wired to believe what they see and hear, these deepfakes represent a significant threat. A number of startups are trying to solve this problem with Web 2 technologies, but Web 3 technologies such as digital signatures are better positioned to address it.

In Web 3, user interactions, i.e., transactions, are signed with the user's private key to prove their validity. Similarly, content, whether text, pictures, audio, or video, can be signed with the creator's private key to prove its authenticity. Anyone can verify the signature against the creator's public address, which is published on the creator's website or social media accounts. Web 3 networks have already built all the infrastructure needed for this use case. Fred Wilson has discussed how associating content with public cryptographic keys can be effective in fighting misinformation. Many reputable VCs already link their social media profiles, e.g., Twitter, or decentralized social platforms, e.g., Lens Protocol and Mirror, to a cryptographic public address, which lends credibility to digital signatures as a method of content authentication.
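The sign-and-verify flow described above can be sketched with a toy Schnorr signature. The group below is deliberately tiny and offers no security; real wallets use Ed25519 or secp256k1, but the flow, a creator signing content bytes and anyone verifying against the published public key, is the same.

```python
import hashlib
import secrets

# Toy Schnorr signatures over a tiny prime-order group. Far too small to be
# secure; for illustration of the sign/verify flow only.
P = 2039   # prime modulus
Q = 1019   # prime group order, P = 2*Q + 1
G = 4      # generator of the order-Q subgroup

def hash_to_int(*parts) -> int:
    joined = "|".join(str(p) for p in parts).encode()
    return int(hashlib.sha256(joined).hexdigest(), 16)

def keygen():
    sk = secrets.randbelow(Q - 1) + 1   # creator's private key
    return sk, pow(G, sk, P)            # public key, published on profile

def sign(sk: int, content: bytes):
    k = secrets.randbelow(Q - 1) + 1
    r = pow(G, k, P)
    e = hash_to_int(r, content)
    s = (k + sk * e) % Q
    return e, s

def verify(pk: int, content: bytes, sig) -> bool:
    e, s = sig
    # Recompute r = g^s * pk^(-e); valid iff the content hash matches e.
    r = (pow(G, s, P) * pow(pk, (-e) % Q, P)) % P
    return hash_to_int(r, content) == e

sk, pk = keygen()
sig = sign(sk, b"original audio clip bytes")
assert verify(pk, b"original audio clip bytes", sig)      # authentic content
assert not verify(pk, b"deepfake audio clip bytes", sig)  # forgery rejected
```

A deepfake either carries no signature or fails verification against the creator's published key, which is exactly the property the article relies on.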

Despite the simplicity of the concept, there is a lot of work to be done on the user experience of this authentication process. For instance, creating digital signatures for content needs to be automated to provide a seamless flow for creators. Another challenge is how to produce a signed subset of the data, e.g., an audio or video clip, without having to re-sign it. Many existing Web 3 technologies are uniquely positioned to address these issues.

Trust minimization for proprietary models

Another area where Web 3 can benefit AI is trust minimization: reducing the trust users must place in service providers when a proprietary ML model is offered as a service. Users may need to verify that they are getting the service they pay for, or obtain guarantees of fair execution, i.e., that the same model is used for all users. ZKPs can provide such guarantees. In this architecture, the ML model creator generates a ZK circuit representing the model. The circuit is then used to generate ZKPs for user inferences when needed. The ZKPs can either be sent to the user for verification or posted to a public chain that handles verification on the user's behalf. If the ML model is private, independent third parties can verify that the ZK circuit actually represents the model. Trust minimization is particularly useful when the model's output has high stakes. Examples include:

ML medical diagnosis

In this use case, the patient submits their medical data to an ML model for potential diagnosis. The patient needs guarantees that the target ML model was applied correctly to their data; the inference process generates a ZKP that proves the correct execution of the model.

Creditworthiness for loans

ZKPs can ensure that all the financial information submitted by an applicant is taken into account by banks and financial institutions when assessing creditworthiness. Additionally, ZKPs can demonstrate fairness by proving that the same model is used for all users.

Insurance claim processing

Current insurance claim processing is manual and subjective. ML models can do better at assessing claims fairly, taking into account both the insurance policy and the claim details. Combined with ZKPs, these claim-processing models can be proven to have considered all policy and claim details, and to apply the same model to every claim under the same insurance policy.

Addressing centralization of model creation

Creating and training LLMs is a lengthy and costly process that requires specific domain expertise, dedicated computing infrastructure, and millions of dollars in compute. These traits can produce powerful centralized entities, e.g., OpenAI, that can exercise significant power over their users by gating access to their models.

Given these centralization risks, there are important discussions about how Web 3 can facilitate decentralizing the different aspects of LLM creation. Some Web 3 advocates propose decentralized computing as a way to compete with centralized players, the thesis being that decentralized computing can be cheaper. However, our view is that this may not be the best angle of attack: decentralized computing can be 10–100x slower for ML training because of the communication overhead between heterogeneous computing devices.

Alternatively, Web 3 projects could focus on creating unique and competitive ML models in a PoPW style. These PoPW networks can also collect data to build unique datasets to train these models. Some of the projects moving in this direction are Together and Bittensor.

Payment and execution rails for AI agents

The last few weeks have witnessed the rise of AI agents that use LLMs to reason about the tasks needed to achieve an objective and even execute those tasks. The AI-agent wave started with the BabyAGI idea and quickly proliferated into advanced versions, including AutoGPT. An important prediction here is that AI agents will become more specialized, excelling at particular tasks. If a marketplace for specialized AI agents exists, agents can search for, hire, and pay other agents to execute the subtasks of a larger project. In this process, Web 3 networks present an ideal environment for AI agents. For payments, agents can be equipped with cryptocurrency wallets used to receive payments and to pay other agents. Further, agents can plug into crypto networks to commission resources permissionlessly. For instance, an agent that needs to store data can create a Filecoin wallet and pay for decentralized storage on IPFS. Agents can also commission compute from decentralized networks such as Akash to execute certain tasks or even to scale their own execution.
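The search-hire-pay loop can be sketched as a toy agent marketplace. Everything here is an illustrative assumption: balances stand in for an on-chain token, and in practice each agent would sign real transactions from its own wallet.

```python
# A toy marketplace in which AI agents hold wallets, hire specialists, and
# pay for completed subtasks.

class Agent:
    def __init__(self, name, skill, fee, balance=0.0):
        self.name, self.skill = name, skill
        self.fee, self.balance = fee, balance

    def execute(self, task):
        return f"{self.name} completed '{task}'"

class Marketplace:
    def __init__(self):
        self.agents = []

    def register(self, agent):
        self.agents.append(agent)

    def hire(self, employer, skill, task):
        # The employer searches for the cheapest specialist it can afford.
        candidates = [a for a in self.agents
                      if a.skill == skill and a.fee <= employer.balance]
        if not candidates:
            raise RuntimeError(f"no affordable agent with skill '{skill}'")
        worker = min(candidates, key=lambda a: a.fee)
        result = worker.execute(task)
        employer.balance -= worker.fee   # payment: a token transfer on-chain
        worker.balance += worker.fee
        return result

market = Marketplace()
planner = Agent("planner", "planning", fee=0.0, balance=10.0)
coder = Agent("coder", "solidity", fee=3.0)
market.register(coder)
print(market.hire(planner, "solidity", "write an ERC-20 contract"))
```

The crypto-native version replaces the balance bookkeeping with wallet-signed transfers, which is what makes the scheme permissionless: no platform has to approve either agent.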

Protection from AI invasion of privacy

Given the massive amount of data needed to train performant ML models, it's safe to assume that any public data will find its way into models that can use it to predict the behavior of individuals. Further, banks and financial institutions can build their own ML models trained on users' financial information, able to predict users' future financial behavior. This is a significant invasion of privacy. The main mitigation of this threat is making financial transactions private by default. This privacy can be achieved using private payment blockchains such as Zcash and Aztec, and private DeFi protocols such as Penumbra and Aleo.

AI-enabled Web 3 use cases

On-chain gaming

Bot generation for non-coder gamers

On-chain games such as Dark Forest create a unique paradigm where players can gain an advantage by developing and deploying bots that execute the required game tasks. This paradigm shift can exclude gamers who cannot code. LLMs can change that: they can be fine-tuned to understand the on-chain game logic, allowing gamers to create bots that reflect their strategy without writing any code. Projects like Primodium and AI Arena are working on onboarding both human and AI players to their games.

Bot battles, wagers and betting

Another possibility for on-chain games is fully autonomous AI players. Here, the player is an AI agent, e.g., AutoGPT, that uses an LLM as a backend and has access to external resources such as the internet and, potentially, initial cryptocurrency funds. These AI players can engage in wagers in a Robot Wars style, opening a market for speculation and betting on the outcomes of these wagers.

Creating a realistic NPC environment for on-chain gaming

Current games place little focus on non-player characters (NPCs). NPCs have limited actions and little effect on the course of the game. Given the synergies of AI and Web 3, it's possible to create more engaging AI-controlled NPCs that break predictability and make games more fun. A challenge here is introducing meaningful NPC dynamics while minimizing the required throughput, in transactions per second (TPS), of these activities. Excessive TPS demands from NPC activity can congest the network and degrade UX for the actual players.
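A back-of-the-envelope calculation shows why the throughput concern matters: even modest NPC activity can consume a large share of a chain's transaction budget. The NPC counts and chain capacity below are illustrative assumptions, not measurements.

```python
# How much of a game chain's throughput do on-chain NPCs consume?

def npc_tps(num_npcs: int, actions_per_npc_per_min: float) -> float:
    """Transactions per second generated by NPC actions alone."""
    return num_npcs * actions_per_npc_per_min / 60.0

CHAIN_TPS = 100.0   # assumed capacity of an app-specific game chain

# 500 NPCs each acting 6 times a minute already needs 50 TPS.
load = npc_tps(num_npcs=500, actions_per_npc_per_min=6)
print(f"NPCs alone use {load / CHAIN_TPS:.0%} of chain capacity")
```

Under these assumptions, half the chain's capacity goes to NPCs before any human player transacts, which is why designs that batch or lazily evaluate NPC actions are attractive.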

Decentralized social media

One of the challenges of current decentralized social (DeSo) platforms is that they don't offer a unique user experience compared to existing centralized platforms. Seamless integration with AI can offer an experience that Web 2 alternatives lack. For instance, AI-managed accounts can help engage new users by sharing relevant content, commenting on posts, and joining discussions. AI accounts can also be useful for news aggregation, summarizing recent trends that match a user's interests.

Testing of security and economic design of decentralized protocols

The trend of LLM-based AI agents that can define goals, write code, and execute it creates an opportunity for realistic testing of the security and economic soundness of decentralized networks. In this setup, AI agents are directed to exploit either the security or the economic balance of a protocol. The agents start by reviewing the protocol's documentation and smart contracts to identify weaknesses, then independently compete to execute attacks that maximize their own gain. This approach simulates the adversarial environment the protocol will face after launch. Based on the test results, the protocol's designers can review the design and patch weaknesses. So far, only specialized companies, e.g., Gauntlet, have had the skill set to offer such services for decentralized protocols. With LLMs trained on Solidity, DeFi mechanics, and previous exploits, we expect AI agents to deliver similar functionality.

LLMs for data indexing and metric extraction

Despite the public nature of blockchain data, indexing it and extracting useful insights has been a continuous challenge. Some players in this space, such as CoinMetrics, specialize in indexing data and structuring complex metrics to sell, while others, like Dune, focus on indexing the main components of raw transactions and crowd-source metric extraction through community contributions. With recent LLM progress, it's clear that data indexing and metric extraction can be disrupted. Dune has recognized this threat and announced an LLM roadmap with components such as SQL-query explanation and the potential for NLP-based querying. However, we predict the impact of LLMs will go deeper. One possibility is LLM-based indexing, where an LLM interacts directly with blockchain nodes to index the data for a specific metric. Startups such as Dune Ninja are already exploring innovative LLM applications for data indexing.
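One way LLM-based indexing could work is sketched below: an English metric request is compiled into an extraction rule that runs directly over blocks fetched from a node. `ask_llm`, the block shape, and the canned rule are hypothetical stand-ins; a real pipeline would call a model API and a JSON-RPC endpoint such as `eth_getBlockByNumber`.

```python
# Sketch of LLM-driven indexing: English request -> extraction rule -> scan.

def ask_llm(metric_description: str):
    # Stand-in: pretend the model compiled the request into a tx filter.
    if "USDC transfers" in metric_description:
        return lambda tx: tx["token"] == "USDC" and tx["method"] == "transfer"
    raise NotImplementedError

def fetch_block(node, number):
    return node[number]   # stand-in for a JSON-RPC call to a real node

def index_metric(node, block_range, metric_description):
    matches = ask_llm(metric_description)
    return [tx for n in block_range
            for tx in fetch_block(node, n)["txs"]
            if matches(tx)]

fake_node = {
    1: {"txs": [{"token": "USDC", "method": "transfer", "amount": 50}]},
    2: {"txs": [{"token": "DAI", "method": "transfer", "amount": 10}]},
}
rows = index_metric(fake_node, [1, 2], "all USDC transfers in range")
print(len(rows))
```

The interesting shift is that the metric definition lives in natural language rather than in hand-written SQL, so new metrics need no engineering work.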

Developer onboarding to new ecosystems

Different chains compete to attract developers to build applications in their ecosystems. Web 3 developer activity is an important indicator of an ecosystem's success. A major friction point for developers is getting support when they start learning and building in emerging ecosystems. Ecosystems are already investing millions of dollars in dedicated developer relations teams. Here, emerging LLMs have shown mind-blowing results in explaining complex code, catching bugs, and even creating documentation. Fine-tuned LLMs can complement human expertise to significantly scale the productivity of dev-rel teams: they can create docs and tutorials, answer FAQs, and even support hackathon developers with boilerplate code and unit tests.

Improving DeFi protocols

The performance of many DeFi protocols can be significantly improved by integrating AI into their logic. Until now, the main bottleneck has been the prohibitive cost of running AI on-chain. AI models can run off-chain, but previously there was no way to verify the model's execution. That verification is becoming possible with projects like Modulus and ChainML, which execute the ML model off-chain while limiting on-chain costs. In Modulus's case, the on-chain cost is limited to verifying the model's ZKP; in ChainML's case, it is the oracle fee paid to the decentralized AI-execution network.

Some of the DeFi use cases that can benefit from AI integration include:

  1. AMM liquidity provisioning, i.e., updating ranges for Uniswap V3 liquidity.
  2. Liquidation protection for debt positions using on-chain and off-chain data.
  3. Complex DeFi structured products where the vault’s mechanism is defined by a financial AI model instead of a fixed strategy. These strategies can include AI-managed trading, lending, or options.
  4. Advanced on-chain credit scoring mechanism that considers different wallets on different chains.
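As a concrete sketch of use case 1, an off-chain model could predict short-term volatility and convert it into a Uniswap V3-style price range around the current price. The volatility model and the width multiplier are illustrative assumptions; on-chain, a ZKP (Modulus-style) or an oracle network (ChainML-style) would attest to the model's output before the vault rebalances.

```python
# Off-chain strategy: predicted volatility -> concentrated-liquidity range.

def predict_volatility(recent_prices):
    # Stand-in model: mean absolute return over the window.
    rets = [abs(b / a - 1) for a, b in zip(recent_prices, recent_prices[1:])]
    return sum(rets) / len(rets)

def lp_range(spot_price, predicted_vol, width_multiplier=3.0):
    # Wider predicted swings -> wider range, trading fee income for less
    # frequent rebalancing and lower risk of the position going out of range.
    half_width = width_multiplier * predicted_vol * spot_price
    return spot_price - half_width, spot_price + half_width

prices = [1800, 1812, 1795, 1820, 1808]   # illustrative recent spot prices
vol = predict_volatility(prices)
lo, hi = lp_range(spot_price=prices[-1], predicted_vol=vol)
print(f"provide liquidity in [{lo:.0f}, {hi:.0f}]")
```

A production version would replace the toy volatility estimate with a trained model and map the price bounds to Uniswap V3 ticks.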

Conclusion

We believe that Web 3 and AI are culturally and technologically compatible. In contrast to Web 2, which tends to be averse to bots, Web 3 allows AI to flourish thanks to its permissionlessly programmable nature. More broadly, if you view a blockchain as a network, we envision AI dominating the edges of that network. This applies to all sorts of consumer applications, from social media to gaming. So far, the edges of Web 3 networks have mostly been humans: humans initiate and sign transactions, or implement bots with fixed strategies to act on their behalf. Over time, we will see more and more AI agents at the edges of the network. These agents will interact with humans and with each other via smart contracts, enabling novel consumer experiences.


Mohamed Fouda
Crypto researcher and investor. Contributor @AllianceDao, venture partner @Volt Capital, PhD @Northwestern