A Decentralized Protocol for Teaching AI
When my elementary school kid asks me for help in math, he really makes my day. However, last week he asked for help writing an essay, and I found myself looking for excuses and reasons why he should ask his mom, or just grow up and be more independent!
I’m sure that many of the parents among you have faced similar situations. We are not expected to teach our children everything by ourselves: in some cultures, people have been sending their children to school for thousands of years. But an AI developer does not have such a luxury: we teach our AI modules (learnwares) by ourselves. We cannot send them to school, nor can we send them to tutors. The reasons we can’t simply create an AI “school” are varied and practical: every learnware is different, so how would a teacher know what and how to teach the learnware student? How would you reward your learnware’s teacher? How would you trust him? Why would he trust you? And many more…
Here we present a new approach to AI schooling, with some preliminary code samples based on the Dopamine network, allowing several entities to teach one entity’s learnware while being rewarded fairly.
The main reasons we cannot send our learnwares to school fall into three classes:
A. The lack of a common language and common protocols for dealing with AI.
B. The difficulty of establishing trust.
C. The difficulty of rewarding and paying teachers.
The three classes above are undergoing significant change:
(A) With the growth of the AI industry, more and more individuals “talk” and “communicate” about AI, and are starting to use the same terms.
(B & C) Blockchain technology enables building trustless solutions.
Sample Toy Case
In the sample below (full code available here; check out the bootcamp for previous samples) we have a single student that is interested in learning how to recognize handwritten digits from the MNIST dataset:
The “student” holds 10,000 labeled images that are used only for testing and evaluating the teachers. The “teachers” collectively hold the other 60,000 samples.
We show that initially the student is not very successful:
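The data split and the baseline evaluation can be sketched in plain Python. Random stand-in data replaces the actual MNIST download, and all names (`make_example`, `accuracy`, and so on) are illustrative, not part of the Dopamine code:

```python
import random

random.seed(0)

# Stand-in data for MNIST: in the real sample these are 28x28 images;
# here each "image" is just a short feature vector with a 0-9 label.
N_STUDENT_TEST = 10_000   # held by the student, used only to evaluate teachers
N_TEACHER_TRAIN = 60_000  # spread across the teachers

def make_example():
    label = random.randrange(10)
    pixels = [random.random() for _ in range(4)]  # tiny stand-in for 784 pixels
    return pixels, label

student_test = [make_example() for _ in range(N_STUDENT_TEST)]
teacher_train = [make_example() for _ in range(N_TEACHER_TRAIN)]

def accuracy(predict, dataset):
    """Fraction of the student's private test set the model gets right."""
    hits = sum(1 for x, y in dataset if predict(x) == y)
    return hits / len(dataset)

# An untrained student guesses a fixed class for everything, so its
# accuracy sits near chance level (~0.1 with ten balanced classes).
untrained = lambda pixels: 0
baseline = accuracy(untrained, student_test)
```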
Using Dopamine’s modeling language, the dynamics of teaching and rewarding a tutor can be described from the student’s perspective as follows:
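The actual modeling-language snippet is not reproduced here; a rough Python sketch of the loop it describes might look like the following. All names (`TeachingSession`, `accept_lesson`) are hypothetical, not Dopamine’s API:

```python
from dataclasses import dataclass, field

@dataclass
class TeachingSession:
    """One student, several tutors: each lesson is paid for by the
    measured improvement it produces on the student's private test set."""
    evaluate: callable          # scores the student (0.0 - 1.0)
    reward_fn: callable         # maps a score to a cumulative payout
    ledger: dict = field(default_factory=dict)
    paid: float = 0.0

    def accept_lesson(self, teacher_id, apply_lesson):
        before = self.evaluate()
        apply_lesson()          # the teacher's samples are trained on here
        after = self.evaluate()
        # Pay only for the marginal reward this lesson unlocked.
        due = max(0.0, self.reward_fn(after) - self.reward_fn(before))
        self.ledger[teacher_id] = self.ledger.get(teacher_id, 0.0) + due
        self.paid += due
        return after
```

Paying for the *marginal* improvement, measured on data the teachers never see, is what keeps the scheme fair: a teacher who contributes nothing new earns nothing.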
Logarithmic Reward Scale
Since the student is being taught by several tutors, it needs to “declare” a fair and transparent rewarding system. Using the Dopamine network, a student can, for example, declare that the total reward to be given in the current session is 1000 DOPAs if a score of 1.0 is achieved, and that the rewarding scale is logarithmic, with half of the reward given for raising the student’s score from 0.99 to 1.0:
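For concreteness, here is one scale with exactly these properties: a perfect score pays out the full 1000 DOPAs, and each extra “nine” of accuracy unlocks half of the remaining budget, so reaching 0.99 is worth 500. This is an assumed formula for illustration, not necessarily the exact one Dopamine uses:

```python
import math

TOTAL_REWARD = 1000.0  # DOPAs offered for a perfect score in this session

def cumulative_reward(score, total=TOTAL_REWARD):
    """Cumulative payout for reaching `score`, on a logarithmic scale.

    Each extra "nine" of accuracy unlocks half of the remaining budget:
    0.9 -> 0, 0.99 -> total/2, 0.999 -> 3*total/4, 1.0 -> total.
    (An assumed concrete formula matching the declared properties.)
    """
    if score >= 1.0:
        return total
    if score <= 0.9:
        return 0.0
    remaining = 2.0 ** (1.0 + math.log10(1.0 - score))
    return total * (1.0 - remaining)
```

Because the scale is logarithmic in the error rate, the last, hardest percentage points are worth far more per point of accuracy than the easy early gains.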
Running The Sample
Code for running different Dopamine nodes (1 student + several teachers) is quite simple, and available here.
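Structurally, that code boils down to starting one student node and several teacher nodes that talk to it. A toy sketch, with Python’s multiprocessing standing in for real networked Dopamine nodes (all names hypothetical):

```python
import multiprocessing as mp

def run_teacher(queue, shard_id):
    """Placeholder teacher node: pushes its data shard to the student."""
    queue.put(("lesson", shard_id))

def run_student(queue, n_teachers):
    """Placeholder student node: collects one lesson from each teacher."""
    return [queue.get() for _ in range(n_teachers)]

if __name__ == "__main__":
    q = mp.Queue()
    teachers = [mp.Process(target=run_teacher, args=(q, i)) for i in range(4)]
    for p in teachers:
        p.start()
    lessons = run_student(q, len(teachers))
    for p in teachers:
        p.join()
    print(f"student collected {len(lessons)} lessons")
```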
Eventually the student reaches an accuracy score just below 99%, and therefore pays out less than 500 of the 1,000 tokens reserved for this training budget. One can see the rewards and improvements attributed to every teacher below. As expected, teachers that brought more meaningful data, and that came first, had an advantage in earning rewards:
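The per-teacher attribution can be reproduced schematically. The intermediate scores below are made up for illustration (the real run’s numbers are not shown here), and the reward scale is an assumed concrete formula with the declared properties (1000 DOPAs at 1.0, half unlocked per extra “nine” of accuracy):

```python
import math

TOTAL = 1000.0  # DOPAs reserved for this training session

def cumulative_reward(score, total=TOTAL):
    # Logarithmic scale: each extra "nine" of accuracy unlocks half of
    # the remaining budget (assumed formula, for illustration only).
    if score >= 1.0:
        return total
    if score <= 0.9:
        return 0.0
    return total * (1.0 - 2.0 ** (1.0 + math.log10(1.0 - score)))

# A hypothetical session: each teacher raises the student's score in turn.
checkpoints = [("start", 0.10), ("teacher-1", 0.92),
               ("teacher-2", 0.97), ("teacher-3", 0.985)]

ledger = {}
for (_, before), (name, after) in zip(checkpoints, checkpoints[1:]):
    ledger[name] = cumulative_reward(after) - cumulative_reward(before)

total_paid = sum(ledger.values())
# The final score is below 0.99, so total_paid stays below 500 DOPAs.
```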
Clearly, as shown below, the student eventually succeeds in recognizing its first characters:
What’s Not Shown Here Yet
The sample above is simplified and does not yet include other mechanisms we are currently working on, including trust mechanisms that make each side (student / teacher) less dependent on trusting the counterparty.