Democratizing Machine Learning with Blockchain Technology

Jonathan Ward
Fetch.ai
Published Jul 16, 2020

The goal of the Fetch.ai collective learning system is to deliver the benefits of artificial intelligence more widely, to individuals and to small and medium-sized businesses. This will reverse the trend of a handful of giant technology companies reaping most of the profit from the machine learning boom. By enabling participants to work together across borders, these collective learning techniques will allow industries to achieve enormous improvements in efficiency without needing to rely on “Big Tech”. In this article, we describe the key ideas behind the Fetch.ai collective learning network.

Blockchains and Collective Learning

Cryptography is a key part of blockchain technology. Cryptographic signatures can be used to ensure that only the holder of the private key matching a given public key is allowed to transfer the associated coins on open networks such as Bitcoin. The same logic can be extended to enable voting and many other types of applications to be implemented on programmable blockchains such as Ethereum and the Fetch.ai network. These voting procedures, as in real life, enable the different participants in the network to build a consensus on what actions they should take. Through repeated rounds of voting, the participants can agree on updates that lead to progressive improvements in the accuracy of a shared machine learning model. Importantly, this can be achieved without any of the participants sharing their data, which is kept private throughout.

The key feature that distinguishes Fetch.ai’s collective learning system from standard blockchains is that the users, which we refer to as “learners”, also have access to local training data and the ability to train a machine learning model. Training could use any of the popular machine learning frameworks, provided that it implements the agreed learning algorithm and model architecture.

The collective learning model, shared by all learners, is initialized with a genesis model that contains random weights. This model initially performs poorly on all of the learners’ individual training sets. After the initialization step, the consensus protocol selects one of the learners to produce the first update to the model. This learner carries out a few steps of the learning algorithm to improve the performance of the model on their own local data set. The learner then broadcasts a message containing the new and improved model to the other learners in the network.
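The initialization and first update can be sketched as follows. This is a minimal illustration, assuming a toy linear model y = w*x + b trained by plain gradient descent on mean squared error. The function names (make_genesis, mse, local_update), the learning rate and the step count are hypothetical choices made for this example and are not taken from the Fetch.ai implementation.

```python
import random


def make_genesis(n_weights, seed=0):
    """Genesis model: random weights, so it fits nobody's data well yet."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]


def mse(weights, data):
    """Mean squared error of the linear model y = w*x + b on (x, y) pairs."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def local_update(weights, data, steps=5, lr=0.05):
    """A few gradient-descent steps on the selected learner's private data."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


# The selected learner's local data never leaves this process;
# only the updated weights would be broadcast to the other learners.
local_data = [(x, 2 * x + 1) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
genesis = make_genesis(2)
proposal = local_update(genesis, local_data)
print(mse(genesis, local_data), mse(proposal, local_data))
```

Only the weights in proposal are shared in this sketch, which mirrors the point that the training data stays with the learner.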

Upon receiving an updated model, the other learners evaluate its performance on their own local data sets and compare it with that of its predecessor. Each learner broadcasts a positive vote for a model that has improved performance and rejects an update that has degraded performance from their point of view. This process is then repeated many times, with a different learner training the model in each epoch, until a fixed number of rounds has been completed or a particular target has been met. Attackers that attempt to “poison” the model, or learners whose data is incompatible with that of the majority, will not contribute to the learning process.
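The voting step can be sketched as follows, again assuming a toy linear model y = w*x + b scored by mean squared error. The article does not state the exact acceptance rule, so the simple majority threshold used here is an assumption, and the names vote and run_round are hypothetical.

```python
def mse(weights, data):
    """Mean squared error of the linear model y = w*x + b on (x, y) pairs."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def vote(current, proposed, local_data):
    """Positive vote only if the proposal beats the current model locally."""
    return mse(proposed, local_data) < mse(current, local_data)


def run_round(current, proposed, voters_data, threshold=0.5):
    """Collect one vote per learner; accept on a majority (assumed rule)."""
    votes = [vote(current, proposed, data) for data in voters_data]
    accepted = sum(votes) > threshold * len(votes)
    return (proposed if accepted else current), votes


# Three voters, each holding a private slice of data drawn from y = 2x + 1.
voters_data = [
    [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0)],
    [(x, 2 * x + 1) for x in (-1.0, -0.5)],
    [(x, 2 * x + 1) for x in (0.25, 0.75)],
]

model = [0.0, 0.0]
model, honest_votes = run_round(model, [1.9, 1.0], voters_data)
model, poison_votes = run_round(model, [-50.0, 50.0], voters_data)
print(honest_votes, poison_votes, model)
```

In this example the honest proposal improves every voter's local error and is adopted, while the poisoned proposal is voted down and the previous model is kept.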

Fetch.ai collective learning system. 1. A blockchain is a decentralized network of computers that can be used to coordinate actions from multiple parties. 2. The machine learning model is initialized with random weights, W0, which are entered into a block. Learners then take turns to propose updates in the form of blocks, W1, W2, … 3. A model “poisoning” attack by “Eve” is repelled by the other validators rejecting her weight submission.
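The sequence of blocks W0, W1, W2, … can be pictured as a hash-linked chain of model updates. The sketch below is purely illustrative: the ModelBlock fields, the JSON serialization and the use of SHA-256 are assumptions made for this example and do not describe the actual Fetch.ai block format.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelBlock:
    """Hypothetical block holding one accepted set of model weights."""
    index: int
    proposer: str
    weights: tuple
    prev_hash: str

    def digest(self):
        payload = json.dumps(
            {
                "index": self.index,
                "proposer": self.proposer,
                "weights": list(self.weights),
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, proposer, weights):
    """Return a new chain with a block linked to the hash of the last one."""
    prev_hash = chain[-1].digest() if chain else "0" * 64
    block = ModelBlock(len(chain), proposer, tuple(weights), prev_hash)
    return chain + [block]


def chain_is_valid(chain):
    """Every block must reference the digest of its predecessor."""
    return all(
        chain[i].prev_hash == chain[i - 1].digest()
        for i in range(1, len(chain))
    )


chain = append_block([], "genesis", [0.3, -0.7])   # W0: random weights
chain = append_block(chain, "alice", [1.2, 0.4])   # W1
chain = append_block(chain, "bob", [1.9, 1.0])     # W2
print(chain_is_valid(chain))
```

Because each block commits to the hash of its predecessor, altering an earlier set of weights would break the link to every later block.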

Preserving Privacy and Improving Performance

This blockchain-mediated collective learning system enables multiple stakeholders to build a shared machine learning model without needing to rely on a central authority. There are, however, many potential avenues for future improvement. We’re currently working on some important questions, such as: “How are participants incentivized to behave well?”, “Who pays for the on-chain data storage?” and “What about the validators with data that is inconsistent with the others?”. Alongside these issues, we’ve been improving the stability and efficiency of collective learning, and we’ll describe this work in future articles and source code releases.

In this video, my colleague Emma Smith describes how the collective learning protocol can be used in the healthcare industry. In the future, we’ll explain how privacy-preserving techniques from the DeepMind-sponsored OpenMined project can be used to protect the privacy of patients whose data is used in our collective learning system.
