Simulation of Commission Design:
Part 2

Jeffrey Lim
Published in DECON Simulation
Jun 17, 2019 · 7 min read

Part 1: Commission System Design & Introduction of Simulation Environment

Part 2: Simulation Result Analysis

Let’s look back at our previous article ‘Simulation of Incentive Design: What is the Most Appropriate Reward System?’ There, we saw that review rates and effort levels changed according to the chosen reward distribution method.

This time, we will see whether deals are completed and how the completion time changes depending on the chosen commission point distribution method.

Simulation Result

In the simulation, we compared six commission point distribution methods (hereinafter mechanisms): random, uniform, increasing, decreasing, convex and concave types.

  1. Random: Points are distributed whenever a buyer purchases any number of goods.
  2. Uniform: Points are distributed in proportion to the amount purchased at any given time.
  3. Increasing: Buyers receive more points when they buy later.
  4. Decreasing: Buyers receive fewer points when they buy later.
  5. Convex: Buyers receive more points when they buy in the beginning or right before timeout.
  6. Concave: Buyers receive more points in the middle of a deal.
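To make the six shapes concrete, here is a minimal Python sketch. The function name `point_weights`, the specific curves (linear and quadratic) and the 0.1 offset are our own illustrative assumptions, not the curves used in the simulation. Each function call returns the share of points for each purchase, ordered from the first purchase to the last.

```python
import random


def point_weights(mechanism, n_purchases, seed=0):
    """Return normalized point shares for purchases 0..n_purchases-1 (earliest first).

    The curve shapes below are illustrative assumptions only.
    """
    # Position of each purchase within the deal: 0.0 (first) to 1.0 (last).
    if n_purchases == 1:
        positions = [0.0]
    else:
        positions = [i / (n_purchases - 1) for i in range(n_purchases)]

    if mechanism == "random":
        rng = random.Random(seed)  # no relation to the purchase position
        raw = [0.1 + rng.random() for _ in positions]
    elif mechanism == "uniform":
        raw = [1.0 for _ in positions]
    elif mechanism == "increasing":
        raw = [0.1 + t for t in positions]
    elif mechanism == "decreasing":
        raw = [1.1 - t for t in positions]
    elif mechanism == "convex":
        # High at both ends, low in the middle.
        raw = [0.1 + (2 * t - 1) ** 2 for t in positions]
    elif mechanism == "concave":
        # Low at both ends, high in the middle.
        raw = [1.1 - (2 * t - 1) ** 2 for t in positions]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    for name in ("random", "uniform", "increasing",
                 "decreasing", "convex", "concave"):
        print(name, [round(w, 3) for w in point_weights(name, 10)])
```

Normalizing the shares to sum to 1 keeps the total points per deal equal across mechanisms, so only the timing of the reward differs.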

Depending on the selected mechanism, the success of a deal and speed of deal completion change.

Each agent aims to buy at the point where it can gain the most points. Under the increasing mechanism, one might therefore expect deals to fail or to take a long time to complete, because every buyer waits for somebody else to buy first. Under the decreasing mechanism, by contrast, one would expect a fast sell-out.

We will check whether these two assumptions hold by analyzing the simulation results below. We ran 30 types of deals with different “remaining goods” and “prices”, and plotted the average results on a graph.

Results and Interpretation of Each Mechanism

Random

Unlike others, the random mechanism displays no correlation between remaining goods and points.

(Left) Deal success rate, (Right) Deal conclusion time

The left graph records the deal success rate for every 20 episodes. Because we performed 1,000 episodes, the graph shows 50 dots. The graph on the right illustrates the average conclusion time of each deal.
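The binning behind the left graph can be sketched as follows. This is a minimal illustration, assuming each episode is recorded as a simple success/failure flag; the function name `windowed_success_rate` and the randomly generated outcomes are hypothetical, not the simulation's data.

```python
import random


def windowed_success_rate(outcomes, window=20):
    """Average success flags over consecutive, non-overlapping windows."""
    rates = []
    for start in range(0, len(outcomes), window):
        chunk = outcomes[start:start + window]
        rates.append(sum(chunk) / len(chunk))  # True counts as 1
    return rates


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical outcomes for 1,000 episodes (not real results).
    outcomes = [rng.random() < 0.7 for _ in range(1000)]
    rates = windowed_success_rate(outcomes, window=20)
    print(len(rates))  # one dot per 20 episodes
```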

In the early phase of the agents’ learning, the deal conclusion time decreased as the deal success rate increased. However, as the agents learned more, the deal success rate dropped and the conclusion time grew slightly longer. This suggests that agents initially participate in deals even when they incur losses, but learn over time not to take part.

Uniform

Next is the uniform mechanism. This mechanism gives a uniform amount of points to every purchase.

Under such a mechanism, buyers would seek to buy later to minimize the time their deposit money is locked up. Because all buyers would choose this strategy, the common expectation is that the deal falls apart as everyone waits until the last minute.

(Left) Deal success rate, (Right) Deal conclusion time

However, contrary to our assumption, the success rate becomes high, surpassing 0.9. The average deal conclusion time also shortens to 500 ms.

Our interpretation is that buyers actively participate in deals because the opportunity cost of forgoing rewards by not buying outweighs the benefit of minimizing their deposit time.
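This trade-off can be illustrated with a toy payoff calculation. Everything here is an assumption made for illustration: the payoff formula, the name `buyer_payoff`, the reward of 10 points and the deposit cost of 0.01 points per millisecond are hypothetical, and only the 500 ms conclusion time comes from the result above.

```python
ABSTAIN_PAYOFF = 0.0  # assumed payoff for an agent that never buys


def buyer_payoff(reward_points, deposit_cost_per_ms, buy_time_ms, conclusion_time_ms):
    """Hypothetical payoff: points earned minus the cost of a locked deposit."""
    locked_ms = max(conclusion_time_ms - buy_time_ms, 0)
    return reward_points - deposit_cost_per_ms * locked_ms


if __name__ == "__main__":
    # Under the uniform mechanism the reward does not depend on timing.
    early = buyer_payoff(10.0, 0.01, buy_time_ms=0, conclusion_time_ms=500)
    late = buyer_payoff(10.0, 0.01, buy_time_ms=400, conclusion_time_ms=500)
    print(early, late, ABSTAIN_PAYOFF)
```

With these numbers, buying late is better than buying early, but both beat abstaining. If every agent waits and the deal fails, nobody earns the reward, so even an early purchase can be the better choice as long as the reward exceeds the deposit cost.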

Decreasing

Third is the decreasing mechanism. As the remaining goods decrease, so do the distributed points.

In such circumstances, buyers would join early to obtain more points. However, because the available points are minimal toward the end, the deal could end up unsuccessful. Let’s see how the decreasing mechanism impacts success rate and conclusion time.

(Left) Deal success rate, (Right) Deal conclusion time

You can see that the deal success rate is significantly lower than with the other mechanisms. Our understanding is that buyers do not join the deal late because there is little to gain, which leaves the deal unsuccessful.

Additionally, the agents learn that the deal is bound to fail, and lean towards not incurring the cost of depositing money in the first place.

Consequently, nobody participates in deals, and more deals remain inconclusive until the deal time runs almost to timeout, which lengthens the conclusion time.

Increasing

Fourth, the increasing mechanism sees an increase in receivable points as the remaining goods decrease.

In this mechanism, buyers will all seek to join later to gain more points. However, because minimal points are given at the beginning of the deal, a low participation rate could jeopardize the deal’s success.

(Left) Deal success rate, (Right) Deal conclusion time

The increasing mechanism shows overall higher deal success rates and shorter conclusion times than the decreasing mechanism, but has less impressive results compared to the random and uniform mechanisms.

In the beginning, agents seeking gains actively join in, leading to high success rates and short conclusion times. However, agents learn over time that participating early is not advantageous, and deals start to fail.

A bottleneck section that delays purchases exists in the early part of a deal. Even so, the increasing mechanism has a higher success rate than the decreasing mechanism because this early bottleneck is more likely to be overcome than a late one.

Convex

Fifth, the convex mechanism displays a drop and then a rebound in receivable points as the remaining goods decrease. This mechanism would be used when attracting buyers at the beginning and the end is the priority.

Based on the above cases, this mechanism will only be successful when the points given in the middle of the deal are not too small. Below are the deal success rate and average conclusion time graphs.

(Left) Deal success rate, (Right) Deal conclusion time

The convex mechanism has a success rate lower than the increasing mechanism, but higher than the decreasing mechanism. This is because the bottleneck exists in the middle, and the possibility of mitigation is also medium.

Concave

Lastly, the concave mechanism provides the most points in the middle and the fewest at the beginning and the end.

(Left) Deal success rate, (Right) Deal conclusion time

From the analysis of the other mechanisms, we know that to boost the success rate, the bottleneck section (with the lowest reward) has to be mitigated. The concave mechanism has two bottleneck sections, at the beginning and at the end. Therefore, its success rate is comparatively lower and its conclusion time comparatively longer than the others’.
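One way to picture a bottleneck section is as a stretch of purchases whose reward falls well below the average. The sketch below finds such stretches. The helper `bottleneck_segments`, the cutoff of half the mean reward, and the example point schedules are all hypothetical choices for illustration, not values from the simulation.

```python
def bottleneck_segments(points, threshold_ratio=0.5):
    """Return (start, end) index ranges where points fall below
    threshold_ratio times the mean. The threshold is an assumption."""
    cutoff = threshold_ratio * sum(points) / len(points)
    segments = []
    start = None
    for i, p in enumerate(points):
        if p < cutoff and start is None:
            start = i  # a low-reward stretch begins
        elif p >= cutoff and start is not None:
            segments.append((start, i - 1))  # the stretch ended at i - 1
            start = None
    if start is not None:
        segments.append((start, len(points) - 1))
    return segments


if __name__ == "__main__":
    # Hypothetical points for the 1st..10th purchase of a deal.
    schedules = {
        "uniform": [5] * 10,
        "increasing": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "decreasing": [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
        "convex": [10, 6, 3, 1, 1, 1, 1, 3, 6, 10],
        "concave": [1, 3, 6, 10, 10, 10, 10, 6, 3, 1],
    }
    for name, points in schedules.items():
        print(name, bottleneck_segments(points))
```

With these example schedules, the uniform mechanism has no bottleneck, the increasing and decreasing mechanisms each have one (early and late respectively), the convex mechanism has one in the middle, and the concave mechanism has two.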

Summary

The graphs below compare the deal success rates and deal conclusion times of the mechanisms.

(Left) Deal success rate, (Right) Deal conclusion time

We can see that the success rate ranking from highest to lowest is uniform, random, increasing, convex, decreasing and concave. The conclusion time ranking is the reverse of that. In other words, to decrease the conclusion time, the deviation in points has to be minimized. When a bottleneck section exists, having it early in the deal is comparatively advantageous for purchasing.

The table below displays the standard deviation of commission points. It lists the standard deviation of the commission point ratio when 10 agents buy 10 goods one after another. Through this, we can assess how unevenly points are distributed.
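A standard deviation of this kind can be computed as in the sketch below. It assumes the population standard deviation (the article does not say whether population or sample was used), and the function name `point_ratio_std` and the two example schedules are hypothetical, so the printed numbers are not the values in the table.

```python
from statistics import pstdev


def point_ratio_std(points):
    """Population standard deviation of each purchase's share of the total points."""
    total = sum(points)
    ratios = [p / total for p in points]
    return pstdev(ratios)


if __name__ == "__main__":
    # Hypothetical points received by 10 agents buying one good each, in order.
    uniform_points = [1] * 10
    increasing_points = list(range(1, 11))
    print(point_ratio_std(uniform_points))
    print(point_ratio_std(increasing_points))
```

A perfectly flat schedule gives a standard deviation of zero, and the value grows as the points become more concentrated on particular purchases.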

You can see that the uniform and random mechanisms have the lowest standard deviation. Such a flat reward distribution is the key to achieving high success rates and short conclusion times.

Conclusion

We have seen how deal success rates and conclusion times change depending on the reward mechanism.

The uniform mechanism is the most efficient from the perspective of the designer because it has a high success rate and a short conclusion time. This is contrary to the result from our previous article ‘Simulation of Incentive Design: What is the Most Appropriate Reward System?’

The reason is that, while the inactivity of other agents did not affect an agent’s reward in our previous simulation, this time an agent’s reward is affected by the activity or inactivity of others.

With such a change in agent interaction, the appropriate reward system also changes. The system designer must fully understand the characteristics of the system before designing a suitable reward system.

Written by Luke, Jeffrey @ Decon
