Simulation of Incentive Design: What is the Most Appropriate Reward System? Part 3

Jeffrey Lim
DECON
May 31, 2019

Simulation of Incentive Design

Part 0: Why Simulation?
Part 1: The Problem of Reward System Design and Simulation Environment Problem
Part 2: Simulation Result Overview via Heatmap
Part 3: Simulation Result Analysis

In our previous post, we observed the simulation results using heatmaps as a visual aid. Heatmaps are effective at showing several dimensions of data (agents, actions, probabilities, and episodes) in a single image, but less effective at revealing subtle changes or exact numbers.

In this post, we scrutinize the simulation results in detail by looking at the numbers and graphs. In particular, we examine the differences among the proportional, exponential, and uniform methods of gain distribution.

Mechanism Analysis Result

For the analysis, we define two metrics. The first is the review ratio, which indicates what fraction of all agents wrote reviews.

However, the review ratio does not distinguish a review written with an endeavor level of 1 from one written with a level of 9, and therefore fails to capture the amount of effort invested.

This is why the second metric, the action ratio, was introduced. The action ratio is the ratio of actions actually performed to the sum of every possible action in one episode. If there were 10 agents whose actions in one episode were [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], the review ratio would be 90% (9 of the 10 agents wrote a review) while the action ratio would be 50% (45 units of effort out of a maximum of 90).

We are able to carry out a more precise result analysis by comprehensively interpreting the review and action ratios.
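As a minimal sketch (the constant `MAX_ACTION = 9` and the function names are our own assumptions, not taken from the original simulation code), the two metrics can be computed as:

```python
# MAX_ACTION is the highest endeavor level an agent can invest; an action
# of 0 means the agent wrote no review at all.
MAX_ACTION = 9

def review_ratio(actions):
    """Fraction of agents who wrote a review (action > 0)."""
    return sum(1 for a in actions if a > 0) / len(actions)

def action_ratio(actions):
    """Fraction of effort actually invested out of the maximum possible."""
    return sum(actions) / (MAX_ACTION * len(actions))

actions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(review_ratio(actions))  # 0.9
print(action_ratio(actions))  # 0.5
```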

Result of the Proportional Mechanism

[Left] review ratio, [Right] action ratio

The result of the proportional mechanism is shown above. Many agents wrote reviews, as the review ratio converges above 90%. Yet, the action ratio is only about 50%.

This is because, under the proportional mechanism, agents can reap gains by investing only a moderate amount of effort. The endeavor level (action) feeds into both the cost function and the gain function, and the expected reward is maximized at a moderate level of effort.
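A hypothetical sketch of the proportional distribution, assuming each agent's share of the pool is proportional to its endeavor level (the function name and the `reward_pool` value are illustrative assumptions, not the authors' code):

```python
# Proportional mechanism sketch: agent i's gain is
# reward_pool * endeavor_i / (sum of all endeavors).
def proportional_gains(actions, reward_pool=100.0):
    total = sum(actions)
    if total == 0:
        return [0.0] * len(actions)  # nobody reviewed, nothing to split
    return [reward_pool * a / total for a in actions]

# Shares scale linearly with effort: 0%, 10%, 20%, 30%, 40% of the pool.
gains = proportional_gains([0, 1, 2, 3, 4])
```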

In the early part of the episode, review participation spikes and plummets sharply: agents simultaneously write reviews after just learning that they can receive rewards, then give up because the reward per agent shrinks under the flood of participation. Later on, everything settles toward a stable convergence point.

When Doubling the Reward Pool

[Left] review ratio, [Right] action ratio

A high review ratio combined with a low action ratio means that agents do not feel the need to invest more effort. If the total reward is increased (if the size of the pie grows), wouldn't the larger gains drive participation?

The above shows the result when the reward pool was doubled in the proportional mechanism. It is noteworthy that the review ratio increased from 91% to 97%. Agents found a reason to participate since the pie became bigger.

What is more significant is that the action ratio rose from 48% to 67%. With a bigger reward pool, agents can receive more gain with a pinch of extra effort. The average endeavor level rose because agents figured this out and invested more effort.

Result of the Exponential Mechanism

[Left] review ratio, [Right] action ratio

Compared to the proportional mechanism’s review ratio of 91%, the exponential mechanism displays a significant drop to 65%.

This is because agents are not rewarded with enough gain for minimum endeavor. With the increased cost of writing a review, agents give up (action becomes 0), which leads to a sharp drop in the review ratio.

Meanwhile, the action ratio rises from the proportional mechanism's 48% to 58%. Despite the drastic drop in the review ratio caused by more agents deciding not to write a review (action = 0), the higher action ratio means that the agents who did write reviews invested more endeavor than before.
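The exponential mechanism can be sketched the same way, assuming (as with the proportional case) that the pool is split according to normalized weights, here the squared endeavor levels; the function name and defaults are our own illustrative choices:

```python
# Exponential mechanism sketch: weights grow as endeavor ** exponent
# (the series uses squaring), so low-effort reviews earn relatively little.
def exponential_gains(actions, reward_pool=100.0, exponent=2):
    weights = [a ** exponent for a in actions]
    total = sum(weights)
    if total == 0:
        return [0.0] * len(actions)
    return [reward_pool * w / total for w in weights]

# Effort 1 vs effort 3: weights 1 and 9, so the shares are 10% and 90%,
# far more lopsided than the 25% / 75% a proportional split would give.
gains = exponential_gains([1, 3])
```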

When Doubling the Reward Pool

[Left] review ratio, [Right] action ratio

Couldn't we compensate for the low review ratio by increasing the reward pool? Our simulation confirms this. As with the proportional mechanism, an increased reward pool leads to an increased review ratio and action ratio.

Yet, the exponential mechanism displayed a more drastic change.

By doubling the reward pool in the exponential mechanism, the review ratio rose by 21% and the action ratio by 25%. This is significant compared to the 6% and 19% increases in the proportional mechanism.

This means that from the standpoint of the service provider or system designer who funds the reward pool, choosing the exponential mechanism yields more bang for the buck.

Result of the Uniform Mechanism

[Left] review ratio, [Right] action ratio

The uniform mechanism displayed the highest review ratio: most agents were drawn to writing a review since the system guarantees a fixed reward for simply writing one. However, because the reward is the same regardless of endeavor level, we anticipated a very low action ratio. The result confirms this: all agents invested the minimum amount of endeavor, causing the action ratio to converge around 0.1. In other words, agents were only interested in receiving a reward rather than in writing a quality review.
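A hypothetical sketch of the uniform distribution (again, the function name and pool size are illustrative assumptions): every reviewer receives an equal share, so effort beyond the minimum buys nothing.

```python
# Uniform mechanism sketch: the pool is split evenly among all agents
# who wrote a review (action > 0), regardless of endeavor level.
def uniform_gains(actions, reward_pool=100.0):
    reviewers = sum(1 for a in actions if a > 0)
    if reviewers == 0:
        return [0.0] * len(actions)
    share = reward_pool / reviewers
    return [share if a > 0 else 0.0 for a in actions]

# Effort 1 and effort 9 both earn 50: no incentive to invest more.
gains = uniform_gains([0, 1, 9])
```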

When Doubling the Reward Pool

[Left] review ratio, [Right] action ratio

This problem cannot be solved by increasing the reward pool. A larger pool may attract the few remaining agents who had not written reviews, or those with low marginal utility, but little more.

Meanwhile, there is no change to the action ratio. In the proportional and exponential mechanisms, increasing the reward pool raises the action ratio because agents are incentivized to invest more to gain more. But in the uniform mechanism, only the cost to the agents grows while the per-agent reward stays the same, so the outcome is unsurprising.

Summary

[Left] review ratio, [Right] action ratio

Let's compare the graphs above on the same scale. The review ratio was highest in the uniform mechanism, followed by the proportional and exponential mechanisms. The action ratio, however, was highest in the exponential mechanism, followed by the proportional and uniform mechanisms.

We can interpret that there is a sort of trade-off between review ratio and action ratio per mechanism. System designers should keep this in mind and select the optimum mechanism that suits their desired ecosystem.

For instance, the system designer should choose the uniform mechanism if the number of reviews matters more than their quality, the exponential mechanism if quality matters most, and the proportional mechanism as a middle ground between the two.

[Left] review ratio, [Right] action ratio

Despite doubling the reward pool, the rankings of the review and action ratios do not change. However, the exponential mechanism benefits the most from a larger reward pool. From the system designer's perspective, the exponential mechanism is the most effective choice relative to the investment.

If the exponential method is the most cost-effective, could we boost its effectiveness further by raising the exponent? We conducted additional experiments using different exponents.

Exponential Mechanism Tuning

What would happen if we raised endeavor to the 3rd, 5th, 7th, or 9th power instead of squaring it in the exponential distribution method? Would the review ratio drop while the action ratio rises?
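A hypothetical illustration (not the authors' code) of why higher exponents push moderate agents out: the reward share of a high-effort agent (endeavor 9) versus a moderate one (endeavor 5) becomes increasingly lopsided as the exponent grows.

```python
# Share of the pool going to the high-effort agent in a two-agent contest,
# under weights endeavor ** exponent. The agent endeavor levels (9 and 5)
# are arbitrary illustrative values.
def high_effort_share(exponent, high=9, moderate=5):
    w_high, w_mod = high ** exponent, moderate ** exponent
    return w_high / (w_high + w_mod)

# The share grows monotonically with the exponent, squeezing out
# moderate-effort reviewers at higher powers.
for k in [2, 3, 5, 7, 9]:
    print(f"exponent {k}: high-effort share = {high_effort_share(k):.3f}")
```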

[Left] review ratio, [Right] action ratio

As the exponent rose, the review ratio dropped while the action ratio increased. Beyond the 3rd power, however, the differences were no longer significant.

Additionally, if you weight the review ratio and the action ratio equally, an exponent of 1.5 brings the most satisfying result.

Which exponent, then, yields the greatest return on investment? When the reward pool is enlarged, which exponent produces the largest jump?

When Doubling the Reward Pool

[Left] review ratio, [Right] action ratio

When the reward pool was increased, the review ratio grew the most with an exponent of 2, and the action ratio grew the most with an exponent of 1.5. Summing the two increases, exponent 2 showed the larger combined gain.

Simply put, the most effective exponent in the exponential mechanism is 2.

Conclusion

Through the series of articles addressing the ‘reward system design’ problem, we looked into which system is optimal using simulation. We ran simulations using the proportional, exponential, and uniform mechanisms of gain distribution and analyzed the results.

In our previous post, we saw the overall pattern with heatmaps. In this post, we took a closer look using numbers and graphs and found that there is a sort of trade-off between review ratio and action ratio in each mechanism.

Additionally, we learned that we can drive the review ratio and action ratio by adjusting the reward pool size. With the exception of the uniform mechanism, both ratios increased as the reward pool was enlarged. The exponential mechanism gained more than the proportional mechanism, meaning it renders more effect for the buck, which system designers should seriously consider.

In the case of the exponential mechanism, neither ratio changed significantly once the exponent was set to three or higher. If both ratios are considered equally important, the best exponent is 1.5. If the investment effect of enlarging the reward pool matters more, the best exponent is 2.

System designers and service providers should keep in mind the different mechanisms and the effects of adjusting the reward pool when selecting an appropriate system.

Written by Luke, Jeffrey @ Decon

About Decon

If you have any questions on the simulation or wish to participate in our project, please reach out to us at contact@deconlab.io.

Homepage: https://deconlab.io

Facebook: https://www.facebook.com/deconcryptolab/
