Introducing Oracul: Decentralized Oracle Data Feed Solution for Ethereum

While building Ether Wager (Decentralized Contract for Difference Exchange, http://eth-wager.surge.sh/ — Kovan testnet), I faced a problem of choosing an oracle for assets price feeds. Fortunately, we have Oraclize, which brings the data from a consumer-specified API to a smart contract. But which API to select? Am I in position to choose a particular one? And, more importantly, can there be just one centralised service which all the traders would agree on?

In fact, this issue was faced by a number of Ethereum projects. Shapeshift built a custom centralised price feed for their exchange Prism. MakerDAO implemented an oracle controlled by the team’s multisig for their first stablecoin Sai. The oracle is often the only centralised unit of such systems.

Prediction markets, such as Augur and Gnosis, can theoretically be used as price feed providers. In practice, however, it’s unlikely to be feasible soon due to the overhead added by the generic nature of these projects.

SchellingCoin

In 2014 Vitalik Buterin described a SchellingCoin concept, addressing this very issue. The basic idea is that every participant reports a value, and the median is taken as the oracle result. All the reporters should provide a deposit along with the report, and the deposits are redistributed in favour of the values that are closer to the median. Thus, it’s in the reporters’ interest to post values that are similar to what others report. The only obvious common point (Schelling Point) is the actual true value, so the median is expected to be close to the real asset price.

The Oracul system is based on the same concept.

Domain-Specific Token

Oracul system makes use of a domain-specific token (ORC). It acts as voting power when reporting values and as stake size when distributing commission. The deposits are also nominated in ORC.

One of the limitations of such system is the possibility of collusion attacks. While it is technically possible, it doesn’t seem very attractive due to the fact that performing it would require heavy investment in ORC, and the attack itself would result in significant depreciation of the token. Thus, the arguments that back PoS (i.e. “fake altruism”) apply here as well. For similar reasons, having domain-specific tokens encourages users not only to report similar values (to stay closer to the median), but values that are objectively true.

Penalties and Rewards

In the original SchellingCoin proposal the reports that fall outside of the middle 50 percentile are penalised in favour of the reports that are closer to the median. In practice, however, more often than not all the reporters are benign and report the values that are true to the best of their knowledge. In such case, penalising the half of the participants is suboptimal.

In Oracul we choose a fixed Δ that represents a spread tolerance range for the reported value. If Δ == 0, every report except the median is penalised proportionally to its distance from the median. If Δ > 0, all the reports within this range are considered valid and get share of penalties of the reports that fall outside of this range.

For the values that have a natural, exact true value we can choose a very low Δ. For others (e.g. ETH/USD that can be different depending on exchange), we can take a larger Δ. For example, for Δ == 5%and median == 100, all the reports between 95 and 105 are considered valid and get their shares of the commission as well as the deposits of other reports. Still, the closer the reported value is to the median, the bigger share of the penalty and commission it is eligible for.

Reporting Pools

An ordinary token holder is not supposed to have enough time and resources to manually report the price every round. We need a way to make it easy for the participants to report the price in a reliable and convenient way. To achieve that, on-chain reporting pools can be built. A reporting pool is a smart contract that reports values that are computed based on a transparent, agreed in advance algorithm. For the ETH/USD oracle there will be reporting pools linked to the major exchanges. The implementation is straightforward: the pool resolves the value from an exchange API (e.g. with Oraclize) and forwards it to Oracul. Token holders deposit their stakes to one or multiple reporting pools. Later they can withdraw their stack along with their share of commission and the earned/lost tokens, in accordance to the pool’s performance.

The reporting pools are expected to form the overwhelming majority of the voting power. In practice, there are not many sources of truth for the values reported. Therefore, there is no need to spend extra gas sending the same value multiple times. We build a reporting pool for every exchange and let token holders decide which exchange price do they want to report. Essentially, we use the wisdom of crowd to make this hard decision of choosing a particular API to get the price feed from.

Some of the reporting pools might report aggregated values (e.g. from Coinmarketcap, providing an average price from different exchanges, weighted by volume). Many token holders are likely to choose such a reporting pool, which is reasonable, but makes the corresponding website’s API a single point of failure. Therefore, alternative reporting pools will be introduced: they should report the same (or very similar) value, but it should be computed without relying on the same website API.

Having reporting pools also benefits transparency: it is clear which share of voting power belongs to which pool (i.e. which exchange API / algorithm). If it’s apparent that too much value is concentrated in a single reporting pool, the token holders are supposed to shift their tokens to another pool that reports, perhaps, the same values but acquires them in a different way. Otherwise the holders risk depreciation of their tokens.

Implementation

On-chain implementation of such a system is challenging. It’s crucial to keep the gas costs low, making sure that the block gas limit is never exceeded. To achieve that, all the operations in current implementation are at most O(ln(n)) where n is the number of different values reported in active round.

Reporting Round Structure

  • Bets are placed along with the deposits in ORC;
  • median is resolved;
  • penalties and multipliers are computed for every participant (see below);
  • once all of them are computed, they are distributed along with commission in accordance to the multipliers;

The round structure described above consists of a few actions, although they can be performed within at most two requests per reporter per round. This is achieved by performing the previous round finalisation and reporting of a new value within a single smart contract invocation. To claim the commission and the penalties collected a new call is needed, as it can only be done when all the multipliers and commissions are finalised.

Data Structure

We need a data structure that stores the votes along with their weights. It should have the following properties:

  • The weighted median can be computed in O(ln(n))
  • New votes can be added in O(ln(n))

The data structure that is used in the prototype is a Weighted Order Statistic Tree. It meets the complexity requirements. It is implemented on chain. A Red-Black tree is chosen as base data structure (the rebalance itself is not implemented yet). In addition to the colour, every node stores the size of the subtree rooted in this node. In a classical Order Statistics Tree this number represents the number of nodes, but we use the sum of weights of nodes, so that we can efficiently compute the weighted median.

Computing Penalties and Rewards

For the reports that fall outside of the spread tolerance range the penalty is computed as the difference between the reported value and the closest edge of the spread tolerance range relative to the median.

The rewards are distributed proportionally to the distance of their report to the median. First a reward multiplier is assigned to each report depending on this distance:

  • The reports that fall exactly on the median get maximum share of the reward pool (1x)
  • The other reports get lower share. The ones that are exactly in the middle between the median and the edge of spread tolerance range only get half the rewards (0.5x)
  • The reports on the edges of the spread tolerance range don’t get any rewards — 0x. Please note that they are also not penalized, as described above.

Then the total reward pool is distributed among this reporters according to multipliers set.

In the following example Δ == 5% and the final median is 100, so the spread tolerance range is [95,105]. The commission accumulated is 1.

The total penalty is 1 * 0.07 + 1 * 0.05 + 1 * 0.45 = 0.57, the total multiplier is 0.2x + 0.6x + 1x + 0.8x + 0.6x = 3.2x. Thus, the total payout for the reporter with multiplier 0.2x is 0.57 / 3.2 * 0.2 = 0.035625.

Current State

I’m building a prototype of the described system: http://oracul.surge.sh/ (Rinkeby testnet, metamask recommended). It supports the following features:

  • Value Reporting and storage on a Weighted Order Statistics Tree, implemented on-chain;
  • Computing and distributing penalties;
  • Reporting Pools for major exchanges;
  • Constant Reporting Pool (always yields the same value);
  • Round and Shareholder explorer (details on every report for every round);
  • Visualisations: Data structure, Shareholders, Historic values.

Key features that are not implemented yet:

  • Tree balancing (Red-Black tree implementation);
  • Spread tolerance range is currently locked at zero;
  • There is no commission handling.

Conclusion

I firmly believe that a decentralised oracle solution will greatly benefit the Ethereum ecosystem and I hope that the Oracul project has the potential to become one. The prototype is not finished yet, but I hope it lets the reader to understand the proposed approach better and get an idea how it’s going to look like in practice. I’m especially interested in the comments regarding the theoretical foundation of the project. Are you convinced with the spread tolerance range idea? Do you like the reporting pools concept? Can you perhaps see some attack vectors? I’m looking forward to the feedback from the community!