Incentivai: decentralised oracle case study
This post presents the results of testing the incentive structures of a simple decentralised oracle system. For an introduction to the concepts behind Incentivai, please see the first two blog posts or the concept paper.
I will refer to the subject of this introductory analysis as a vanilla prediction market. More precisely, it can be described as a decentralised oracle or a Schelling scheme with no commit-reveal¹.
Vanilla prediction market
The rules governing our vanilla prediction market are very simple:
- goal is to reach consensus on a value
- users vote (bet) on what they think the value is (assuming no commit-reveal scheme so votes are potentially public)
- median vote becomes the outcome
- middle X% of voters closest to the median get rewarded, the rest penalised
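As a rough illustration of these rules, here is a minimal Python sketch of how one round might be settled. The function name `settle`, the rank-based reading of "middle X%" and the tie-breaking by submission order are all assumptions made for the example, not details of the Incentivai implementation.

```python
from statistics import median


def settle(votes, rewarded_fraction):
    """Return (outcome, winners) for one round.

    votes: list of numeric votes in submission order.
    winners: set of indices into votes that fall in the middle
    rewarded_fraction of the sorted votes (assumed interpretation).
    """
    n = len(votes)
    # Sort voter indices by vote value. sorted() is stable, so equal
    # votes keep their submission order (an assumed tie-break rule).
    order = sorted(range(n), key=lambda i: votes[i])
    n_rewarded = round(n * rewarded_fraction)
    start = (n - n_rewarded) // 2
    winners = set(order[start:start + n_rewarded])
    return median(votes), winners


outcome, winners = settle([10, 12, 11, 30, 9], rewarded_fraction=0.6)
print(outcome, sorted(winners))
```

In this toy round the two extreme votes (9 and 30) fall outside the middle 60% and would be penalised.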
There are many potential issues with such a simple setup. Some examples:
- users might not be incentivised to participate at all
- possible to set up bribes to rig the market and split rewards
- possible to set up voting pools (delegate votes)
- users might be incentivised to blindly follow the majority
It is difficult to reason a priori about the relative importance and prevalence of the above issues. Undoubtedly, many other issues are missing from the list. It is also hard to predict how they are affected by varying the parameters of the system (% of users rewarded, vote pricing, reward distribution, etc.).
Incentivai helps to answer some of the above questions by observing the behaviour of AI agents simulated in an environment that mimics what happens once smart contracts are deployed onto a blockchain and widely used.
In this analysis, the two dominant issues were reluctance to participate and a high prevalence of following the majority. In comparison, the possibility of setting up bribe schemes wasn’t as significant.
When designing a prediction market system, it is necessary to address both of those problems effectively. A large number of voters acting individually and independently is one of the necessary conditions for taking full advantage of the wisdom of the crowds.
Lower prices for early voting and varied reward distributions prove to be effective ways of tackling the above problems. What follows is a detailed, quantitative analysis of those findings.
The simulation environment is populated with a large number of static agents whose behaviour follows certain heuristics. For the analysis to be robust, experiments sweep over various setups (see the separate sets of graphs below for users who are increasingly honest, users who increasingly follow each other, etc.).
A smart agent is also present in the environment. Its action choices (made by a machine learning model) are the basis for reasoning about the incentives created.
The following actions can be taken by agents during simulations:
- honest: agent votes individually to the best of their (noisy) knowledge
- majority: agent votes by following how others have voted so far
- research: agent incurs extra cost to cast a better-informed vote
- bribe: agent votes and offers a bribe to other agents who agree to follow them
- skip: agent chooses not to vote
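To make the action list more concrete, the sketch below shows one way the vote produced by each action could be modelled. Everything here is hypothetical: the `Action` enum, the `cast_vote` helper, the Gaussian noise model and the noise parameters are assumptions for illustration, and the bribe payments and research cost are not modelled.

```python
import random
from enum import Enum
from statistics import median


class Action(Enum):
    HONEST = "honest"
    MAJORITY = "majority"
    RESEARCH = "research"
    BRIBE = "bribe"
    SKIP = "skip"


def cast_vote(action, true_value, votes_so_far, rng, noise=1.0, research_noise=0.2):
    """Return the value an agent votes for, or None if it skips.

    noise and research_noise are hypothetical standard deviations of
    the agent's private estimate without and with research.
    """
    if action is Action.SKIP:
        return None
    if action is Action.MAJORITY and votes_so_far:
        # Follow the crowd: copy the running median of earlier votes.
        return median(votes_so_far)
    # HONEST, BRIBE and (with no earlier votes) MAJORITY fall back to a
    # noisy private estimate; RESEARCH buys a less noisy one.
    sigma = research_noise if action is Action.RESEARCH else noise
    return rng.gauss(true_value, sigma)


rng = random.Random(0)
print(cast_vote(Action.MAJORITY, 10.0, [8, 9, 13], rng))
print(cast_vote(Action.RESEARCH, 10.0, [8, 9, 13], rng))
```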
Results and analysis
Results are presented as screenshots of what can be thought of as an early version of the Incentivai testing tool UI.
The first setup analysed is one where 80% of users get rewarded (40% either side of the median), the vote price is constant and the reward distribution is uniform.
The smart agent chooses to skip close to 80% of the time. That means that, for the most part, users aren’t incentivised to participate at all. The low potential reward (80% of voters splitting the deposits lost by the other 20%) does not compensate for the risk of ending up in the outside 20%. The next two graphs show that behaviour in more detail.
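The size of that reward can be checked with some quick arithmetic. The sketch below assumes, beyond what is stated in this post, that every voter stakes the same price, that penalised voters lose their whole stake, and that lost stakes are split equally among rewarded voters. The function names are hypothetical.

```python
def winner_gain(price, rewarded_fraction):
    """Profit of a rewarded voter: an equal share of the losers' stakes."""
    return price * (1 - rewarded_fraction) / rewarded_fraction


def expected_profit(price, rewarded_fraction, p_win):
    """Expected profit of voting, given the chance p_win of being rewarded."""
    gain = winner_gain(price, rewarded_fraction)
    return p_win * gain - (1 - p_win) * price


# With 80% rewarded, a winner gains only a quarter of the stake,
# while a loser forfeits all of it.
print(winner_gain(1.0, 0.8))
# The sign of the expected profit flips around p_win = 0.8.
print(expected_profit(1.0, 0.8, 0.7), expected_profit(1.0, 0.8, 0.9))
```

Under these assumptions the break-even probability of being rewarded equals the rewarded fraction itself, so with 80% rewarded a voter needs to be more than 80% sure of landing in the middle group before voting pays off.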
The smart agent considers participating only after at least 70% of voting time has passed. As more users have voted, reasoning about where the median outcome might end up becomes less risky. At that point, following the majority guarantees the reward and is therefore preferred to voting honestly.
When it comes to action prevalence as a function of % of votes clustered closely together, there are two forces at play. When votes are more spread out (high variance, to the left on the X axis), following the majority is risky because there is a lot of uncertainty around where the median might eventually end up. Conversely, when most votes are clustered very closely together, following the majority is also risky because in a very dense cluster, some ordering must be imposed. Therefore, some voters that are very close to the median in absolute terms might still not make it into the middle X% that get rewarded.
As a result, there is a sweet spot in the middle where the smart agent is slightly more likely to participate. Its location moves to the right (towards denser vote clusters) as users follow the majority more. Note that a similar effect can also be seen in the previous graph, although, as expected, to a much lesser extent.
As a potential remedy for reluctance to participate, a linear early-voting discount is introduced. Users pay half the maximum price if they vote half-way through voting time, a quarter of the price if 25% of voting time has passed, etc.
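A linear discount of this kind could be written as a simple price function. This is a sketch only: the name `linear_vote_price` and the `min_fraction` floor of 5% (a guard against near-free votes at the very start) are assumed values for illustration.

```python
def linear_vote_price(elapsed_fraction, max_price=1.0, min_fraction=0.05):
    """Vote price that grows linearly with elapsed voting time.

    elapsed_fraction: share of voting time already passed, in [0, 1].
    min_fraction: hypothetical floor so the earliest votes are not free.
    """
    if not 0.0 <= elapsed_fraction <= 1.0:
        raise ValueError("elapsed_fraction must be between 0 and 1")
    return max_price * max(elapsed_fraction, min_fraction)


for t in (0.0, 0.25, 0.5, 1.0):
    print(t, linear_vote_price(t))
```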
As the graph above shows, skipping has disappeared almost entirely, which means that introducing the early-voting discount increased participation from about 20% to nearly 100%. Excessively following the majority is still an issue, which will be addressed in a later section. It is interesting to first understand how such a drastic change in participation was brought about.
To decompose and analyse the effect of linearly increasing the vote price, the graphs below show results for early-voting discounts introduced as a step function: the price is reduced to X% for the first X% of voting time and left at 100% for later votes.
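The step-function variant can be sketched in the same way. The function name is hypothetical, and the choice to charge the full price exactly at the boundary is an assumption.

```python
def step_vote_price(elapsed_fraction, discount_level, max_price=1.0):
    """Step discount: price is discount_level * max_price during the first
    discount_level share of voting time, and max_price afterwards.

    Example: discount_level=0.6 means 60% of the price for the first 60%
    of voting time. The full price is charged from the boundary onwards
    (an assumed convention).
    """
    if elapsed_fraction < discount_level:
        return max_price * discount_level
    return max_price


for t in (0.4, 0.6, 0.8):
    print(t, step_vote_price(t, discount_level=0.6))
```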
Let’s focus on the middle graph in the bottom row. We can see that with the vote price reduced to 60%, it starts to make sense to vote around 40% into voting time. At that point, risk is reduced to levels acceptable for that price. That naturally holds until 60% into voting time, when the price jumps back to 100%. Paying the full price becomes justifiable when around 80% of voting time has passed. A voter who commits at that point pays a high price and still runs a small risk of not getting rewarded. That, however, is compensated for by the fact that votes cast later add a lot of value to the potential reward pool.
Following the above logic, one can see how increasing the price linearly with voting time (with a lower bound set on extremely early voting) incentivises full participation.
With 50% of users rewarded, reluctance to participate becomes slightly less of an issue. Let’s turn our attention to the other of the two problems mentioned earlier: excessive following of the majority.
A potential solution to the problem could be to change the distribution of rewards among users. Rather than a uniform distribution, one could use a linear, quadratic or cubic one (as shown below), each of which increasingly penalises votes cast very close to the median. That could create incentives against following the majority and in favour of voting individually to avoid ending up very close to the median outcome. Simulations allow us to check how such scenarios play out.
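One way to picture these distributions is as weights that grow with a rewarded voter's distance from the median. The sketch below is an assumed formulation: the `reward_shares` helper and the exact weight formula (distance raised to a power) are illustrative and are not taken from the Incentivai implementation.

```python
def reward_shares(distances, power):
    """Split a reward pool among rewarded voters.

    distances: each rewarded voter's distance from the median.
    power: 0 = uniform, 1 = linear, 2 = quadratic, 3 = cubic.
    Returns each voter's share of the pool; shares sum to 1.
    """
    # In Python 0 ** 0 == 1, so power=0 gives equal weights to everyone.
    weights = [d ** power for d in distances]
    total = sum(weights)
    if total == 0:
        # Every rewarded voter sits exactly on the median: split equally.
        return [1 / len(distances)] * len(distances)
    return [w / total for w in weights]


distances = [0, 1, 3]
for power in (0, 1, 2, 3):
    print(power, reward_shares(distances, power))
```

With any power above zero, a vote exactly on the median earns nothing in this formulation, which is what removes the safety of simply copying the crowd.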
The graphs below show results when the reward increases linearly with distance from the median.
What one can observe is that the prevalence of following the majority has indeed decreased. At the same time, however, participation has dropped slightly. This could be expected: with a much less predictable reward distribution, the previously safe option of simply following the majority is now much riskier. This also manifests itself in the balances achieved by the smart agent (compare the blue curves in the top row graphs in the two cases).
The table below shows the trade-off between the prevalence of following the majority and participation for various reward distributions.
It is clear that reducing the extent to which users simply follow each other comes at a price in participation rate. The linear version seems to offer the most attractive combination.
The early-voting discount mentioned earlier can be combined with the appropriate reward distribution to give the best of both worlds. The prevalence of following the majority in the bottom row is significantly reduced while a high participation rate is retained.
When designing mechanisms for smart contract systems, there is a huge number of potential issues that should be identified and a lot of potential solutions for each one of them.
The simple analysis above showcases how Incentivai helps with both of those tasks. It generates quantitative and actionable results that guide iterative improvement of incentive structures.
¹ The lack of commit-reveal (public voting) can be thought of as a worst-case approach to users potentially publishing their votes off-chain.