The doc.ai Cryptoeconomy:
Game Theory and Simulations

By Masha Titova
Researcher in Behavioral Economics and Economic Theory at UC San Diego.

As we move towards a more decentralized world, we must make sure that the platforms we create provide all actors with the right incentives. Computer scientists have done a lot of work to make existing platforms decentralized from a technical standpoint. At the same time, the field of cryptoeconomics, which studies economic incentives of the participating agents, remains largely ignored.
 
In this post we show how we at doc.ai used game theory to model stand-alone interactions between our platform participants to make predictions about how they will act in each particular situation. We then go further and model how our platform develops with time by simulating possible scenarios (sequences of stand-alone interactions), in each of which the agents stick to predicted behavior. We randomly generate thousands of possible scenarios of platform development in the early stages after launch and apply Monte-Carlo logic to see what we, as platform designers, can reasonably expect to happen in the first two years.
 
In the end, the combination of game theory and simulation allows us to make sure that

  • all agents behave well and are rewarded accordingly.
  • our platform scales rapidly.
  • we have fixed a centralized market inefficiency, and our platform provides agents with opportunities that were not available to them before.

What is doc.ai?

The doc.ai ecosystem is a platform that connects research sponsors who would like to run data trials with users who want to get paid for donating their data to said trials, and data scientists who want to earn money analyzing the collected data.

Studying the incentives: Game Theory

Game Theory provides mathematical tools to analyze interactions between agents and make predictions about what the outcome will be. Almost any interaction can be modeled as a “game” between players who have strategies, with all strategic interactions resulting in some payoffs. Many types of games have been studied and solved (static and dynamic, with complete and with incomplete information). A solution of a game is called an equilibrium (there are different equilibrium concepts for different types of games), which is, roughly, a situation in which no agent wants to deviate from the strategy they are playing.
 
Why use Game Theory? As creators of an ecosystem, we would like our agents to exhibit certain types of behavior. Most importantly, we want our users to provide correct data, and research sponsors to pay a fair price; we want all participants to stay active and engaged; we want everyone to enjoy their experience. In game-theoretic terms, we want to study the conditions under which such cooperative behavior constitutes an equilibrium.

Research Project on doc.ai: The Game

The research sponsor wants to minimize the budget, conditional on recruiting the N users needed for the data trial to happen. Users want to participate in data trials if the participation reward is higher than the cost of collecting the necessary data (which includes the cost of taking required tests, transportation costs, opportunity cost of time, etc.). Data scientists wish to participate in the competition if the expected reward exceeds their best outside option (for example, they may join data competitions on other platforms).
 
We model this interaction as a dynamic game of complete information. First, the research sponsor chooses the user reward and the prize for the winner of the data science competition. Second, users simultaneously decide whether to join the data trial. Finally, the data science competition begins, and while it runs, the data scientists decide whether to participate.
The game tree is shown in Figure 1:

This game has a very simple and intuitive (subgame perfect Nash) equilibrium. The research sponsor chooses the user reward to be the lowest amount of money that at least N users will accept; each user joins the study if the reward is higher than the cost of taking additional tests; the data science competition prize is chosen at the market level.
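The sponsor's side of this equilibrium can be illustrated with a minimal sketch. It assumes each user is described by a single number, the cost of collecting whatever data they are still missing, and that a user who is exactly indifferent joins (a tie-breaking assumption of ours, not stated in the post). The function names are hypothetical.

```python
from typing import List, Sequence


def equilibrium_user_reward(costs: Sequence[float], n_required: int) -> float:
    """Lowest reward that at least n_required users accept.

    Assumption: a user accepts when reward >= their remaining collection cost,
    so the cheapest acceptable reward is the n_required-th lowest cost.
    """
    if not 1 <= n_required <= len(costs):
        raise ValueError("n_required must be between 1 and the number of users")
    return sorted(costs)[n_required - 1]


def joiners(costs: Sequence[float], reward: float) -> List[int]:
    """Indices of the users whose cost is covered by the offered reward."""
    return [i for i, cost in enumerate(costs) if cost <= reward]


# Illustrative costs only: two users already hold all the required data.
costs = [0.0, 0.0, 120.0, 300.0, 350.0]
reward = equilibrium_user_reward(costs, n_required=3)
print(reward, joiners(costs, reward))
```

With these made-up costs, the third-cheapest user determines the reward; any lower offer recruits fewer than three users, and any higher offer overpays.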
 
At this point, it is easy to see that if the platform already has enough users who have previously collected data necessary for this study, then the user reward will be very low. This is excellent news for everyone: users will still get some positive reward for their participation, even though their cost of participation is zero. The research sponsor, on the other hand, is paying an order of magnitude less than the market rate to recruit the users. Conversely, if not enough users on the platform have previously taken the required tests, then the research sponsor will have to compensate them in full, which will drive the user reward towards the market price of this study.

Simulating the Platform

How do we model the evolution of the platform over time? We already have game-theoretic predictions of what will happen within each individual research study. Unfortunately, while game theory is very good at providing robust predictions for outcomes of research projects, it’s not as useful for studying the platform at the aggregate level. This is when Monte-Carlo simulation comes to the rescue.
 
The Monte-Carlo method is heavily used in mathematics, statistics, physics, and finance. It relies on the law of large numbers: if you make many (say, 10000) draws from a distribution and then average them, this average will be close to the expected value of the distribution. Similarly, if we randomly generate many possible paths of platform development and then average them, the resulting “average” path should be a reasonable estimate of how the platform will perform in expectation.
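As a small illustration of the law of large numbers, the sketch below averages 10,000 draws from a uniform distribution on [0, 350], whose expected value is 175. The choice of distribution and the helper name `monte_carlo_mean` are ours, picked only for the example.

```python
import random
import statistics
from typing import Callable


def monte_carlo_mean(draw: Callable[[random.Random], float],
                     n_draws: int,
                     rng: random.Random) -> float:
    """Average n_draws independent samples produced by draw(rng)."""
    return statistics.fmean(draw(rng) for _ in range(n_draws))


rng = random.Random(0)  # fixed seed so the run is reproducible
estimate = monte_carlo_mean(lambda r: r.uniform(0.0, 350.0), 10_000, rng)
print(f"estimate = {estimate:.2f}, true expected value = 175.00")
```

The estimate lands close to 175 because the standard error shrinks like one over the square root of the number of draws.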
 
To go further, we need to specify what a path of development is. We assume that a research sponsor arrives at the platform once a month. Each arriving research sponsor is random, in the sense that they need a random number of participants, require a random collection of medical datapoints, and want the data science competition to run for a random amount of time. A path is a series of research sponsors. For example, a 24-month-long path will consist of 24 random research sponsors arriving one after another. Every path starts at launch, i.e. when there are some (pre-registered) users on the platform, but none of them have collected any data yet.
 
What is so special about a path of platform development? Within each project, all agents will act according to the equilibrium we found above. Some users will optimally decide to join the study if the reward exceeds the cost of collecting additional data. This means that every month the number of users with some data collected, as well as the number of data points per user, will grow. The more data is already on the platform, the cheaper the next study may be.
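One path of this kind can be sketched as follows. This is a toy model under our own assumptions, not the simulation used for the figures: the category names and dollar costs in `CATEGORY_COST` are hypothetical, each sponsor's requirements are drawn uniformly at random, the data science competition is left out, and the users with the lowest remaining costs are the ones recruited.

```python
import random
from typing import List, Set, Tuple

# Hypothetical per-category collection costs in dollars (not from the post).
CATEGORY_COST = {"phenome": 0.0, "exposome": 0.0, "genome": 200.0, "other_omics": 100.0}


def simulate_path(months: int, n_users: int,
                  rng: random.Random) -> Tuple[List[float], List[Set[str]]]:
    """Simulate one path: one random sponsor per month, starting with no data.

    Returns the per-month equilibrium user reward and each user's final data.
    """
    owned: List[Set[str]] = [set() for _ in range(n_users)]
    names = sorted(CATEGORY_COST)
    rewards: List[float] = []
    for _ in range(months):
        # A random sponsor: random head count and random set of categories.
        n_required = rng.randint(1, n_users)
        required = set(rng.sample(names, rng.randint(1, len(names))))
        # Each user's cost is the cost of the categories they still lack.
        costs = [sum(CATEGORY_COST[c] for c in required - have) for have in owned]
        # Shuffle before the stable sort so ties are broken at random.
        order = list(range(n_users))
        rng.shuffle(order)
        order.sort(key=costs.__getitem__)
        # Equilibrium reward: the cost of the n_required-th cheapest user.
        rewards.append(costs[order[n_required - 1]])
        # Recruited users collect the missing data and keep it afterwards.
        for i in order[:n_required]:
            owned[i] |= required
    return rewards, owned


rewards, owned = simulate_path(months=24, n_users=200, rng=random.Random(42))
print("monthly rewards:", rewards)
print("average data categories per user:", sum(map(len, owned)) / len(owned))
```

Because recruited users keep their data, later sponsors who ask for the same categories face users with lower remaining costs.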

Scenario 1: Budget for a Research Sponsor

Imagine you are a research sponsor, and you would like to run a data trial. For your data trial, you need 100 participants, and you need each participant to have data on their phenome, exposome and genome collected. The market price (cost of all tests + premium payment for participation) of recruiting one user is estimated to be $350.
 
“Omics” data can be classified into 9 broad categories. Some of them are easy for the user to collect and enter into the system. For example, the exposome refers to a user’s surrounding environmental factors, such as air and water quality, and all required data can be inferred from the locations where this user has been. The phenome is the biometric data, such as height and body type, and can be entered at no cost. On the other hand, collecting a user’s genome data would require them to spend $100-$300 on a DNA test.
 
If you join the doc.ai platform at the launch date, when users haven’t collected any data yet, you would have to pay the market price of $350 per user to fully compensate them for their trouble. How would the user compensation change if you launch the study on the platform in six months, in a year, in two years? To answer these questions, we fix a “maturity” level of the platform, and then for every maturity level of up to 2 years we generate thousands of random paths. The results of averaging them can be observed in Figure 2:

The results are quite stunning: in six months, you would only have to offer half the market price; in a year, one third of the market price; and in two years, less than $50 per user! The reason is simply that in two years so many studies will have been conducted that the users will have already collected all possible data, so you won’t have to reimburse them for it!
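The procedure of fixing a maturity level and averaging over many random paths can be sketched as below. The numbers are illustrative assumptions and will not reproduce the figure: `COSTS` assigns hypothetical dollar costs to nine categories, with the genome test and the participation premium lumped into a single $350 entry, and the sizes of the user pool and of the random studies are invented.

```python
import random
import statistics

# Hypothetical costs for 9 categories; indices 0, 1, 2 stand for phenome,
# exposome, and genome (genome test plus premium lumped together as $350).
COSTS = [0.0, 0.0, 350.0, 50.0, 80.0, 120.0, 150.0, 200.0, 250.0]
TARGET = {0, 1, 2}  # categories required by the study being priced


def price_after(months: int, rng: random.Random,
                n_users: int = 300, n_needed: int = 100) -> float:
    """User reward for the target study after `months` of random earlier studies."""
    owned = [set() for _ in range(n_users)]
    for _ in range(months):
        required = set(rng.sample(range(len(COSTS)), rng.randint(1, 4)))
        n_recruited = rng.randint(20, 150)
        costs = [sum(COSTS[c] for c in required - have) for have in owned]
        order = list(range(n_users))
        rng.shuffle(order)                 # random tie-breaking
        order.sort(key=costs.__getitem__)  # cheapest users are recruited
        for i in order[:n_recruited]:
            owned[i] |= required
    target_costs = sorted(sum(COSTS[c] for c in TARGET - have) for have in owned)
    return target_costs[n_needed - 1]


def expected_price(months: int, n_paths: int, seed: int = 0) -> float:
    """Monte-Carlo estimate: average the price over n_paths random paths."""
    rng = random.Random(seed)
    return statistics.fmean(price_after(months, rng) for _ in range(n_paths))


for maturity in (0, 6, 12, 24):
    print(maturity, "months:", round(expected_price(maturity, n_paths=50), 2))
```

At maturity zero every path returns the full $350, since no user holds any data. At later maturities the estimate can only be lower or equal, because earlier studies have already paid for some users' data.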
 
One may think that research sponsors will be ripping users off by paying so little. To address this concern, we simulate another scenario.

Scenario 2: User Earnings

How much can the users reasonably expect to make? Suppose that the minimum payment to a participating user is set to $15.
 
As before, we have simulated many different paths of platform development for maturity levels of up to two years. Our simulation results can be seen in Figure 3:

We can see that after 6 months most users on the platform will have made over $100, and after two years a large share of users will have made upwards of $250. These earnings may seem tiny in the grand scheme of things, but two things need to be considered. First, these are monetary earnings ON TOP OF being paid to collect the data and take various medical tests. As a free bonus, the users will have collected large amounts of data about their health, including some very expensive tests, and learnt the results of the data trials they had participated in. Second, these are the average numbers over thousands of simulations, and they completely ignore competitive advantages some users may have because they are more active (and hence participate in studies more often), because they are “superstar patients” with a rare condition, or because they own some medical data that is very costly for others to acquire.
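To make the accounting concrete, here is a minimal sketch of a single study's payout with the $15 minimum payment. It rests on our reading of the text: the reward is the larger of the minimum payment and the equilibrium reward, every user whose remaining cost is covered joins, and a user's earnings are the reward minus their own collection cost. The function name `study_payout` is hypothetical.

```python
from typing import Dict, Sequence, Tuple


def study_payout(costs: Sequence[float], n_required: int,
                 min_payment: float = 15.0) -> Tuple[float, Dict[int, float]]:
    """Return the per-user reward and each joining user's net earnings.

    Assumptions: reward = max(min_payment, n_required-th lowest cost), and
    net earnings = reward - the user's own remaining collection cost.
    """
    if not 1 <= n_required <= len(costs):
        raise ValueError("n_required must be between 1 and the number of users")
    reward = max(min_payment, sorted(costs)[n_required - 1])
    net = {i: reward - cost for i, cost in enumerate(costs) if cost <= reward}
    return reward, net


# Illustrative costs: two users already hold all the data the study needs.
print(study_payout([0.0, 0.0, 200.0, 350.0], n_required=2))
print(study_payout([0.0, 0.0, 200.0, 350.0], n_required=3))
```

In the first call the minimum payment binds, so the two data-rich users each earn $15 at no cost to themselves. In the second, the marginal user is merely reimbursed, while the data-rich users keep the full reward.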

Scenario 2.5: Superstar Patient Earnings

Let us look at the potential earnings of the superstar patient who has a rare condition. His competitive edge is that at the launch date he already possesses data that for other users is very expensive, or impossible, to obtain:

We can see that the superstar patient does very well, way better than any other user. He may have to wait a while until the study that requires the data he has arrives at the platform, but it is well worth the wait.

Concluding Remarks

This has been the first attempt, to our knowledge, to apply both game theory and Monte-Carlo simulation to study a cryptoeconomy. This has been a very rewarding process because we now have a much better understanding of what our agents will want to do, and we are aware of possible scenarios of how our platform may evolve in the coming two years.

We hope that our contribution to the field of cryptoeconomics will encourage more research in this direction and provide token economy designers with ideas on how to properly incentivize the behavior of agents at the individual level, and stress-test possible scenarios of cryptoeconomy development at the aggregate level.