Keeping Honest: Solving the Oracle Problem

Smart contracts have become a new paradigm for writing programs on blockchains. However, like all computer programs, smart contracts are only as useful as the inputs they receive from, and the outputs they send to, the outside world.

An example of such a smart contract is World Cup betting. Let's say we open up a betting pool for England vs Croatia in the World Cup. Everyone sends their Ether into the betting smart contract, which records everyone's bets. This smart contract then requires the result of the match from the outside world, because it has no knowledge of the game. The creators of the smart contract could opt to use the official FIFA website for match results. When a result is announced by the official FIFA website, the smart contract is triggered to release funds to the winners of the bet.
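The settlement logic described above can be sketched in a few lines. This is an illustrative Python model, not Solidity; the class and method names are hypothetical, and the oracle-supplied result is just a function argument:

```python
# Minimal sketch of the betting pool's settlement logic.
# The `winning_team` argument is the oracle input this post is about.

class BettingPool:
    def __init__(self):
        self.bets = {}  # bettor -> (team, amount)

    def place_bet(self, bettor, team, amount):
        self.bets[bettor] = (team, amount)

    def settle(self, winning_team):
        """Pay the whole pot to winners, pro rata to their stakes."""
        pot = sum(amount for _, amount in self.bets.values())
        winners = {b: amt for b, (team, amt) in self.bets.items()
                   if team == winning_team}
        staked = sum(winners.values())
        if staked == 0:
            return {}
        return {b: pot * amt / staked for b, amt in winners.items()}

pool = BettingPool()
pool.place_bet("alice", "England", 10)
pool.place_bet("bob", "Croatia", 10)
payouts = pool.settle("Croatia")  # whoever controls this input controls the money
```

Everything here is mechanical except the argument to `settle`: the contract itself has no way to know who actually won.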

In this scenario, the financial outcomes of the participants hinge on the data source for the match's result, and therein lies the problem. That data source is a trusted third party, which could behave dishonestly for its own financial gain. For example, the owners of the data source could take a bet on either side of the game and then announce whichever result favors them. This means they are able to rig the bet in their favor every time.

If we aren't able to rely on an outside data source, could we instead rely on the creators of the smart contract to report the results correctly? We could, except that this rolls us back to a centralized reporting system, which defeats the purpose of using a smart contract in the first place. This is the Oracle Problem in smart contracts.

The Oracle Problem

The Oracle Problem is this: no external data from the outside world can be brought into a decentralized system without trusting a third party as an oracle. Once you have an oracle dictating the data inputs into your decentralized system, there is little to no benefit to your application being decentralized.

In my opinion, the Oracle Problem is the single largest problem to be solved in any programmable, decentralized system. Any system that manages to solve the problem in a trustless and secure manner will reap the benefits for years to come.

To solve this problem, instead of relying on trusted third parties as oracles, many projects use methods that analyze the submissions and behavior of the network's participants. This blog post explores some of these methods of arriving at a correct data input.

Schelling Point

Schelling points are the focal points that a majority of a system's participants arrive at without coordination. An example of a Schelling point at work: ask 100 random people on the street what comes to mind when "San Francisco" is mentioned, and you will find that a good majority of them answer "Golden Gate Bridge".

In a more concrete example, Vitalik Buterin introduced SchellingCoin. In this scheme, all participants submit what they think is the correct input, but only the median of the submissions is accepted as correct. For example, if we want to get the current price of Apple (AAPL) into the system, we would query the whole network of participants to input what they think the current Apple stock price is.

In the figure above, only submissions of Apple's stock price that fall between the 1st and 3rd quartiles are paid for reporting, while the other reporters are not paid.
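The payout rule described above can be sketched directly. This is a simplified model of the SchellingCoin idea, assuming the system takes the median as the answer and pays only reporters inside the interquartile range; the function name and sample prices are made up:

```python
# Sketch of a SchellingCoin-style settlement: accept the median,
# pay only reporters whose submissions fall between Q1 and Q3.
import statistics

def settle_schelling(submissions):
    """submissions: {reporter: reported_price}.
    Returns (accepted median, set of reporters who get paid)."""
    prices = sorted(submissions.values())
    q1, _, q3 = statistics.quantiles(prices, n=4)
    median = statistics.median(prices)
    paid = {r for r, p in submissions.items() if q1 <= p <= q3}
    return median, paid

reports = {"a": 207.0, "b": 207.5, "c": 208.0, "d": 206.5, "e": 150.0}
median, paid = settle_schelling(reports)
# the outlier "e" lands outside the interquartile range and goes unpaid
```

Note that honest reporters near the edges of the distribution can also miss the payout band; the scheme pays for agreement, not for truth per se.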

Why is the Schelling point so effective in arriving at the correct answer? There is a second-order effect here at play called the Keynesian Beauty Contest. Participants are incentivized to correctly guess what the other participants are guessing, since “fitting in” is the best strategy to play. In the absence of communication between participants, participants will go with their best bet to fit in — the truth.

However, this scheme also has some weaknesses. First, accepting submissions between the 1st and 3rd quartiles means that if the majority of participants go rogue and report that Apple's stock price is $1.00, that will be accepted as the correct answer. The scheme is thus vulnerable to Sybil attacks: should an adversary take control of the majority of the system's submissions, they can set the median of the data and essentially manipulate the market to their liking.

Second, for the system to work, the incentive to collude with other participants has to be less than the incentive to report correctly. Bitcoin solves this problem by making it more profitable to mine new coins honestly than to collude with other miners. However, in a Schelling point system, it is unclear that the profits of always reporting correctly outweigh those of undermining the reporting system. This has to be evaluated on a case-by-case basis, which I do in the Augur forking example below.

Disputes & Forking

Prediction markets like Augur and Gnosis aim to solve the Oracle Problem, because a wrongly reported outcome hurts the financial well-being of honest market participants. To do so, they run dispute rounds in which participants can open a dispute about the reported outcome of a market. These dispute rounds go on until a certain threshold of votes is reached, at which point the network forks.

For example, take a market with only two outcomes: will Donald Trump become the President of the United States? After the election, the reporter correctly reports that Trump has indeed been elected as the POTUS. However, some market participants choose to go rogue and open a dispute with sufficient votes. The network forks into two: one fork in which Donald Trump is correctly reported as the elected President, and another in which he is not.

These forks are totally disjoint, and participants are not able to transfer tokens between them. In the event of a fork, token holders have to choose which fork to migrate their tokens into: in this example, the Donald Trump fork or the Hillary Clinton fork. This forking event is final and has drastic consequences for token holders should they fail to choose the fork the consensus wants.

Here we rehash the Keynesian Beauty Contest: how can I, as a token holder, choose the fork where my tokens will have the highest value? The fork that the majority of token holders migrate their tokens into will naturally be the most valuable one.

There are also several factors to consider when migrating to a new fork. Will this fork have more market activity than the other? In prediction markets, traders are unlikely to trade in a fork founded on untruth, and subsequent trades would be marred by negative sentiment over the event. This in turn causes market makers not to create markets, since trading volumes are low. Hence, token holders are incentivized to choose the fork where the truth is the law of the land.

Probabilistic Correctness

Probabilistic correctness is a method of arriving at the correct answer most of the time. One way of achieving it is redundancy.

A common use case for achieving probabilistic correctness through redundancy is outsourcing computation tasks to untrusted computers. Let's say an organization wants to delegate a scientific computation. The system randomly assigns k participants, or nodes, to perform the computation redundantly, and the final outputs are aggregated by the system.

For our example, we set k = 10, so 10 participants perform the task redundantly. The system first identifies each participant's output and groups identical outputs together. Let's say the outputs are bit arrays, and the 10 participants provide only 3 distinct outputs. They are labeled Group A, B, and C.

GroupA = [0, 0, 0]
GroupB = [0, 0, 1]
GroupC = [0, 1, 1]

The system then counts the submissions in each group and sorts the groups by count. Recall that we have a total of 10 participants. In our example, 5 participants submit Group A's output, 4 submit Group B's, and 1 submits Group C's.

GroupA count = 5
GroupB count = 4
GroupC count = 1

Using these group counts, the system concludes that Group A is the majority (50% of the submissions) and hence a close-enough approximation of the truth. Here, the system must decide its cut-off point: what percentage majority needs to be achieved before an answer is accepted as correct? It could be 10% or 50%, depending on the sparsity of the output data; the higher the variance of the data, the lower the majority cut-off point.
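The grouping-and-counting step above is simple to sketch. This is an illustrative aggregator, assuming outputs are hashable values and the cut-off is a fraction of total submissions:

```python
# Sketch of the redundancy scheme: group identical outputs, count
# each group, and accept the largest group only if it clears a
# chosen majority cut-off.
from collections import Counter

def aggregate(outputs, cutoff=0.5):
    """outputs: one result per participant (hashable, e.g. tuples).
    Returns the majority output, or None if no group reaches the cut-off."""
    counts = Counter(outputs)
    best, best_count = counts.most_common(1)[0]
    if best_count / len(outputs) >= cutoff:
        return best
    return None

# the 10 submissions from the example: 5x Group A, 4x Group B, 1x Group C
submissions = [(0, 0, 0)] * 5 + [(0, 0, 1)] * 4 + [(0, 1, 1)]
result = aggregate(submissions)  # Group A clears the 50% cut-off
```

With a stricter cut-off (say 60%), the same submissions would yield no accepted answer, and the system would presumably re-run or expand the task.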

Note that in this system, k, the number of participants, is a parameter tuned by the buyer (the person outsourcing the computation). To increase the probability that the system returns the correct output, the buyer can increase k. This makes the outputs noisier and more varied, but it also amplifies the Schelling point, which is performing the computation correctly. To achieve a higher probability of correctness, the buyer has to pay more.
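Why increasing k helps can be made concrete with a small probability calculation. Assuming each participant is independently honest with some probability p > 0.5 (an assumption, not something the scheme guarantees), the chance of an honest majority grows with k:

```python
# Probability that strictly more than half of k independent
# participants report honestly, each with honesty probability p.
# This is a binomial tail sum (cf. the Condorcet jury theorem).
from math import comb

def p_honest_majority(k, p):
    return sum(comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range(k // 2 + 1, k + 1))

# with 70%-honest participants, redundancy sharpens correctness:
# p_honest_majority(5, 0.7) is about 0.84, and it keeps rising with k
```

The same formula also shows the flip side discussed next: if p drops below 0.5, adding participants makes the wrong answer *more* likely to win.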

However, this relies on there being more honest than dishonest participants in the system. Randomly selecting participants who do not know each other to perform a computation may thwart small collusion efforts, but it is not sufficient to stop larger organizations from forming cartels in the system.

Spot checks

Spot checks are less of a scheme or system and more a method of encouraging honesty in the system as a whole. Spot checks are honeypots randomly set up within an outsourced task's answers.

For example, the system randomly selects participants to input closing stock prices of a few companies (AAPL, GOOG, FB, AMZN) on the 8th of August 2018.

Companies = ['AAPL', 'GOOG', 'FB', 'AMZN']
KnownClosingPriceIndex = 'GOOG'
KnownClosingPrice = 1268.44

The system already knows, or has decided beforehand, that Google's (GOOG) closing stock price is 1268.44. This known answer has to be correct.

Next, the participants input the closing stock prices for the 4 companies. Below is an example of a participant's submission:

Answers = {
    'AAPL': 207.21,
    'GOOG': 1270.55,
    'FB': 186.76,
    'AMZN': 1886.44
}

Should a participant answer differently for Google's closing stock price, that is, anything other than 1268.44, they are penalized, and their stake could even be slashed. This fear of getting their stake slashed keeps a percentage of the participants honest.
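The check itself is a single comparison against the honeypot entry. This sketch follows the example above; the tolerance constant is an assumption added to allow for rounding in submitted prices:

```python
# Sketch of the spot-check step: compare each participant's answer
# at the honeypot index against the known closing price. Anyone who
# fails would have their stake slashed by the system.

KNOWN_INDEX = "GOOG"
KNOWN_PRICE = 1268.44
TOLERANCE = 0.01  # assumed rounding allowance, not part of the original scheme

def passes_spot_check(answers):
    """answers: {ticker: submitted_closing_price}."""
    return abs(answers.get(KNOWN_INDEX, 0.0) - KNOWN_PRICE) <= TOLERANCE

honest  = {"AAPL": 207.21, "GOOG": 1268.44, "FB": 186.76, "AMZN": 1886.44}
cheater = {"AAPL": 207.21, "GOOG": 1270.55, "FB": 186.76, "AMZN": 1886.44}
# the cheater's submission (the example above) fails the check
```

Note the check only inspects the honeypot field; the participant's other three answers are taken on faith, which is exactly why the honeypot must be hard to identify.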

However, this is not sufficient to ensure honesty in the system. Spot checks are easily gamed by spotting patterns in how the system chooses its honeypots. They are most effective when the submission is large enough that the honeypot is hard to spot. This also puts a strain on the buyer, who has to supply honeypot answers to the network every single time they need a task completed.

Concluding Thoughts

An examination of honesty schemes across various coins and systems shows that there is no proven method for keeping oracles completely honest.

The largest caveat of these schemes is that they only achieve some probabilistic certainty about whether an answer is true or false; they do not provide guarantees the way proof systems do. This is because answering a question like "Who won the Presidential Election in 2016?" is not trivial and cannot be cryptographically proven.

Thus, blockchains and networks are better off relying on centralized oracle systems for general use cases. For the weather or stock prices, there are existing trusted data feeds that can be utilized with Oraclize on the Ethereum blockchain.

For niche use cases, system designers will have to tread carefully with these experiments. But if they do well enough, the landscape of subjective oracle systems could be revolutionized.

Thank you Julian Koh for providing feedback on this post.