(Part 2) Blockchain + AI: Combining Technologies for Advanced Capabilities

JL Marechaux
Published in Technoesis
6 min read · Feb 9, 2019

This is part 2 of the series; part 1 covered AI-powered blockchain. Make sure you read part 1 first to understand how AI can be used as an underlying technology in blockchain platforms.

Blockchain-enabled AI

Artificial intelligence (AI) is meant to simulate human intelligence in order to conduct tasks. Machine learning (ML), a subset of AI, is a discipline that focuses on teaching systems to improve over time by learning how to be more accurate and efficient. If AI is becoming so popular in multiple industries, it is because the science behind it recently evolved, but also because compute power and cloud computing have made AI much more accessible to a larger audience. Can we envision some scenarios where AI could also benefit from blockchain technologies?

Decentralized platform for data and models

AI needs data, huge amounts of data, to come up with reliable recommendations and predictions. If companies like Google, Facebook, or Amazon are leaders in the AI space, it is mostly because they have easy access to petabytes of data. But for most companies, and specifically the small ones, it can be difficult to get meaningful data to train models. For the thousands of companies that want to benefit from AI, there is a need for more democratic access to data sources.

Blockchain is a platform to manage digital transactions. As data and AI models are digital assets, a new business model that leverages blockchain may emerge in the AI space. Data is the new oil, but just like in the petroleum industry, data may need to be refined and transferred in order to be accessible to consumers. In the AI ecosystem, some stakeholders want access to raw data (crude oil) while others may prefer trained models (refined petroleum) they can immediately leverage. A blockchain network could meet both needs. It could serve as a decentralized platform for democratized access to data. It could also be leveraged to manage trained AI models as digital assets.

Blockchain could be used to support an AI marketplace, where some stakeholders could provide access to data and models for consumption by other organizations.

Blockchain transactional data sources

AI must leverage large quantities of data to train, validate and test models. A blockchain ledger, which logs all peer-to-peer transactions, provides such datasets so that AI/ML can be applied to learn how the blockchain operates.

If a machine learning model has access to historical blockchain transactions, it is then possible to apply supervised and unsupervised learning techniques to predict some behaviors or to classify ledger information into data clusters.
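As an illustration of the unsupervised case, here is a minimal sketch of k-means clustering in plain Python. The transaction features (amount, fee) and the sample values are made up for the example; a real ledger would need its own feature extraction, and a library implementation would normally be preferred over this hand-rolled one.

```python
# Toy k-means over hypothetical ledger features (amount, fee).
# The sample values are invented for illustration only.

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    # Naive initialization: use the first k points as starting centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

transactions = [(1.0, 0.1), (50.0, 2.0), (1.2, 0.1), (52.0, 2.1), (0.9, 0.2), (49.0, 1.9)]
labels, centroids = kmeans(transactions, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Here the small transfers and the large transfers end up in separate clusters, which is the kind of grouping that could then be inspected for unusual behavior.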

We can also imagine reinforcement learning applied to blockchain, where an agent can learn how to act as a blockchain participant. A trained blockchain agent could then submit transactions or react to network events.

A blockchain ledger is another source of meaningful data that could be leveraged to train, test and enrich AI models.

Data confidentiality for AI models

As more and more regulations around the world enforce data privacy (HIPAA, GDPR, PIPEDA, and others), sensitive information management is a major concern for most AI initiatives. How can we deal safely with large amounts of private data in order to train our models? Data anonymization is frequently applied for privacy protection, but the approach may sometimes remove useful training information from datasets.

Another approach is to use encrypted data to train models. This relies on a relatively new technique called homomorphic encryption (HE), which allows computations to be performed directly on encrypted data, so that models can be trained without exposing the underlying data. IBM and Microsoft have released homomorphic encryption libraries (HElib and SEAL, respectively), and in December 2018, at NeurIPS 2018, Intel announced a tool to run neural network models on encrypted data (HE-Transformer, based on SEAL).
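To make the idea of computing on encrypted data concrete, here is a toy sketch of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. This is not how HElib, SEAL, or HE-Transformer work internally, nor does it mirror their APIs. The tiny primes and the function names are assumptions chosen for readability, and the code is not secure.

```python
import math
import random

# Toy Paillier keypair. The primes are far too small for real use.
p, q = 1009, 1013
n = p * q
n_sq = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # modular inverse (requires Python 3.8+)

def encrypt(m):
    # Pick a random r that is coprime with n.
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    x = pow(c, lam, n_sq)
    return ((x - 1) // n * mu) % n

def add_encrypted(c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % n_sq

# Hypothetical private values, summed without ever being decrypted.
readings = [120, 345, 78]
total = encrypt(0)
for value in readings:
    total = add_encrypted(total, encrypt(value))
print(decrypt(total))  # 543
```

The party doing the addition never sees the individual values. Schemes used for machine learning support richer operations than addition, but the principle is the same.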

From a blockchain perspective, security is also based on cryptographic mechanisms. Some blockchain platforms are already exploring advanced techniques to leverage homomorphic encryption.

We may see blockchain evolve to offer HE capabilities and provide privacy-preserving data for machine learning.

Preventing data corruption

Another problem in AI is ensuring that models are trained on relevant data. The quality of a model depends on the quality of its input data. This leads to a major security concern in the AI world: we must ensure that training datasets are not corrupted. If training data is modified over time by a malicious actor, an AI model can become flawed, biased, or invalid. This is why consistency and traceability of training datasets are crucial.

A blockchain is a digital proof system that provides a traceable, immutable ledger. If we consider a dataset as a digital asset, a blockchain can be used to manage transactions related to training data. The immutability feature of a blockchain ensures that any transaction is logged and cannot be removed. The traceability capability of a blockchain provides information on any kind of update on the digital assets. In other words, if training data is modified, changes will be captured along with reliable information (who, what, when).
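The traceability argument can be sketched with a small hash chain. Each entry records who changed a dataset, when, and a fingerprint of the new content, and links to the previous entry by its hash, so rewriting history is detectable. This is a simplified, single-process illustration with hypothetical names (`DatasetLedger`, `record`, `verify`); a real blockchain adds consensus, signatures, and replication across nodes.

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class DatasetLedger:
    """Append-only log of dataset changes, chained by hash."""

    def __init__(self):
        self.entries = []

    def record(self, who, when, dataset_bytes):
        entry = {
            "who": who,
            "when": when,
            "dataset_hash": sha256_hex(dataset_bytes),
            "prev_hash": self.entries[-1]["entry_hash"] if self.entries else "0" * 64,
        }
        entry["entry_hash"] = self._hash_entry(entry)
        self.entries.append(entry)
        return entry

    @staticmethod
    def _hash_entry(entry):
        # Hash every field except the stored hash itself.
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))

    def verify(self):
        prev = "0" * 64
        for entry in self.entries:
            if entry["prev_hash"] != prev or entry["entry_hash"] != self._hash_entry(entry):
                return False
            prev = entry["entry_hash"]
        return True

# Hypothetical users, timestamps, and dataset contents.
ledger = DatasetLedger()
ledger.record("alice", "2019-02-01T10:00:00Z", b"id,label\n1,cat\n2,dog\n")
ledger.record("bob", "2019-02-03T15:30:00Z", b"id,label\n1,cat\n2,dog\n3,bird\n")
print(ledger.verify())  # True

# Tampering with a past entry breaks the chain.
ledger.entries[0]["who"] = "mallory"
print(ledger.verify())  # False
```

Before training, a team could recompute the fingerprint of the dataset on disk and compare it with the latest `dataset_hash` in the ledger.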

A blockchain could be leveraged to ensure the consistency of ML models.

Federated Learning

In a typical machine learning initiative, data needs to be collected and gathered in order to train a model. It is quite common to combine multiple datasets from multiple sources, and to enrich private business data with public data (weather, traffic, social media, events…).

This ML process becomes unrealistic when data owners (the data sources) want to keep their data and prevent others from downloading it. The reason could be to ensure data privacy, to comply with specific regulations, or to keep a competitive advantage.

Federated Learning is a collaborative form of machine learning that could address this issue. Instead of downloading and combining datasets to train a single model, multiple models are trained at the source, then those models are combined to create the final model.
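A minimal sketch of this idea, in the style of federated averaging, is shown below. Each client runs a few gradient steps on its own data for a one-parameter model y = w * x, and only the resulting weight leaves the client. The coordinator then averages the weights, weighted by sample count. The client datasets, learning rate, and function names are assumptions for illustration.

```python
# Toy federated averaging for a one-parameter linear model y = w * x.
# Raw data never leaves a client; only the trained weight is shared.

def local_train(w, data, lr=0.05, steps=5):
    """Run a few gradient-descent steps on one client's private data."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(global_w, clients):
    """Average the locally trained weights, weighted by dataset size."""
    total = sum(len(data) for data in clients)
    return sum(local_train(global_w, data) * len(data) for data in clients) / total

# Hypothetical private datasets, all following y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5), (3.5, 10.5)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(round(w, 4))  # 3.0
```

The coordinator recovers the shared relationship without ever seeing a single data point, which is what makes the approach attractive when data cannot be moved.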

A blockchain framework could be leveraged to access distributed data. Multiple nodes could be involved in partial training activities before a complete model is assembled. Such a distributed training process can help support data privacy requirements.

A blockchain could provide a platform to support federated learning on distributed data sources.

Explainable AI

Machine learning systems are usually quite opaque, based on a “black box” that consumes data to provide a result without really explaining the rationale behind the process. Explainable AI (XAI) is a recent field of interest in the AI world, where the idea is to provide more transparency so that users trust AI systems. A lot of people believe that XAI is needed for widespread adoption of AI because we humans tend to distrust what we don’t understand. Moreover, with the advent of AI-powered systems in several industries, it is becoming critical to trace and explain AI decisions from a legal and ethical perspective.

XAI is not an easy concept because explainability is difficult to define and is quite subjective. As a user of an AI system, what do I really need to understand? Should it be the basic building blocks of the cognitive process, or the specific underlying mathematical models? There is no universal answer to this question, and each person, depending on the situation, may be looking for different information.

A blockchain, as mentioned earlier, provides a system of proof where transactions are logged, timestamped, and signed. Even if it is not an answer to all XAI needs, blockchain can at least be used to provide some traceability for AI-powered systems. In a blockchain-enabled environment, it would be possible to link a specific AI output to all the different steps involved in the decision process, or to trace back to the training datasets in order to understand which specific pieces of information influenced the end result.
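One way to picture this kind of traceability is an audit record that ties a prediction to fingerprints of the model, the training dataset, and the input, then timestamps and signs it. The sketch below uses an HMAC from the Python standard library as a stand-in for the digital signatures a blockchain would use. The field names, the secret key, and the sample values are all hypothetical.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical signing key

def fingerprint(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def sign(record: dict) -> str:
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

def make_audit_record(model_bytes, dataset_bytes, input_bytes, output, timestamp):
    # Link one AI output to the artifacts that produced it.
    record = {
        "model_hash": fingerprint(model_bytes),
        "dataset_hash": fingerprint(dataset_bytes),
        "input_hash": fingerprint(input_bytes),
        "output": output,
        "timestamp": timestamp,
    }
    return {"record": record, "signature": sign(record)}

def is_authentic(entry: dict) -> bool:
    return hmac.compare_digest(entry["signature"], sign(entry["record"]))

entry = make_audit_record(
    model_bytes=b"serialized-model-v1",
    dataset_bytes=b"id,label\n1,cat\n2,dog\n",
    input_bytes=b"photo-123",
    output="cat",
    timestamp="2019-02-09T12:00:00Z",
)
print(is_authentic(entry))  # True

# Changing the logged output after the fact invalidates the signature.
entry["record"]["output"] = "dog"
print(is_authentic(entry))  # False
```

Such a record does not explain why the model answered the way it did, but it does let an auditor establish which model, data, and input were involved.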

A blockchain could provide traceability for better AI explainability, governance, and transparency.

https://marketoonist.com/2018/01/blockchain.html

There is no doubt a lot of hype around blockchain and AI. Blockchain technologies still need to mature and will not replace transactional systems anytime soon (see “Blockchain is Not a Silver Bullet”). And AI will not reach human-level intelligence in the short term.

But think about all we could achieve by combining these two disruptive technologies. AI-powered blockchain platforms could improve consensus mechanisms, fraud detection and smart contracts. In turn, blockchain-enabled AI systems could be beneficial for data access, privacy, security, and explainability.

JL Marechaux
Technoesis

Data Science & AI/ML at Google. My team is building advanced analytics and applied AI/ML models for large Google customers.