Published in API3

Staking & oracles

This is the fifth post in our series, “Getting APIs on the Blockchain”.

This article might seem a bit out of place in our series at first, so let me provide some context. Previously we discussed the significance of APIs and the problem of connecting existing APIs to the blockchain. We then began to introduce some of the components of our solution by defining first-party oracles (and, in particular, comparing them with third-party oracles).

Now, why are we discussing staking? As we introduce the components of our solution to the API Connectivity Problem, we eventually need to explain our chosen security model. In particular, we move away from oracle-level staking as an approach to security. This article gives a relatively short and informal argument as to why. (We refer you to our whitepaper for a more formal discussion.)

Visual representations of the Bohr model of the atom (left) and the quantum model of the atom (right). This is one of many examples in the sciences of the jump in complexity that occurs when we go from modelling a discrete and deterministic system to a continuous and stochastic one. (Photo Credit: dani3315/Shutterstock)

Staking

Although I am sure anyone reading this far into this article is probably familiar with staking, let me offer a quick refresher. The concept of staking originated with Proof of Stake¹, which offered an alternative to the Proof of Work mechanism that fuelled Bitcoin’s Satoshi consensus algorithm.²

Pardon my hand-waviness: in general, Proof of [X] is part of a mechanism for selecting a miner to validate a number of transactions and then add them, in the form of a “block”, to the blockchain. With Proof of Work, the miner shows they did the necessary, computationally intensive work to create a new block up front (and, importantly, were the first to do so). In contrast, Proof of Stake offers a “skin in the game” approach to selecting the next block creator: participants (usually called validators rather than miners) put up tokens (their “stake”) and are usually selected based on some function of the size of their stake. Importantly, this stake is “slashed” if the validator behaves maliciously and does not create an honest block.

Of course, there are plenty of details omitted in the above definitions, but I think this is sufficient for our purposes here. The important point is that staking in crypto/blockchain systems is used as a mechanism to punish malicious behaviour (and thus, conversely, to incentivize honest behaviour).

Staking has since expanded in meaning (it isn’t just used in Proof of Stake blockchain networks), but its meaning in context can usually be extrapolated from the original definition. Namely, staking offers a mechanism to punish dishonest (or lazy, or risky) behaviour via direct financial loss and to reward honest, system-beneficial behaviour via direct financial gain.
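To make the “skin in the game” idea concrete, here is a minimal sketch in Python. It assumes the simplest possible selection function (probability proportional to stake) and a slashing rule that burns a fixed fraction of the stake. The names, stake sizes, and the 50% penalty are all hypothetical, and block creators are called validators here; real Proof of Stake protocols differ in many details.

```python
import random

def select_validator(stakes, rng):
    """Pick a block creator with probability proportional to its stake (assumed rule)."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def slash(stakes, validator, fraction):
    """Remove a fraction of a misbehaving validator's stake; return the amount removed."""
    penalty = stakes[validator] * fraction
    stakes[validator] -= penalty
    return penalty

# Hypothetical stakes, in tokens.
stakes = {"alice": 60.0, "bob": 30.0, "carol": 10.0}

# Over many rounds, alice should be selected roughly 60% of the time.
rng = random.Random(0)
picks = [select_validator(stakes, rng) for _ in range(10_000)]
alice_share = picks.count("alice") / len(picks)

# bob is caught producing a dishonest block and loses half of his stake.
burned = slash(stakes, "bob", 0.5)
```

The point of the sketch is that both the selection odds and the penalty are functions of the stake, so the larger a participant's influence, the more they stand to lose.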

A quick note on consensus

Photo: Rubberball/Mike Kemp via Getty Images

Certain malicious behaviour causes “splits” in the blockchain.³ Consensus arises through the a priori agreement — between participants in the network — that the longest blockchain is the correct one. The correctness of the longest chain is substantiated by the amount of work (Proof of Work) or amount of “skin in the game” (Proof of Stake) used to create it.

For completeness: this also gets at why double-spends are prevented (under certain conditions) in a blockchain: only one valid chain wins out in the end.
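As a rough illustration of the two paragraphs above, the sketch below resolves a split by picking the branch with the greatest total weight, where “weight” stands in for work or stake. This is a simplification I am assuming for illustration: the block layout and transaction strings are hypothetical, and real clients use more involved fork-choice rules.

```python
def total_weight(chain):
    """Sum of per-block weights (a stand-in for accumulated work or stake)."""
    return sum(block["weight"] for block in chain)

def fork_choice(chains):
    """Return the branch with the greatest total weight."""
    return max(chains, key=total_weight)

# Two branches share a common first block and then diverge.
# Each branch contains one side of an attempted double-spend.
shared = [{"weight": 1, "txs": ["coinbase"]}]
branch_a = shared + [
    {"weight": 1, "txs": ["alice pays bob"]},
    {"weight": 1, "txs": []},
]
branch_b = shared + [{"weight": 1, "txs": ["alice pays carol"]}]

winner = fork_choice([branch_a, branch_b])
confirmed = [tx for block in winner for tx in block["txs"]]
```

Only the transactions on the winning branch end up in `confirmed`, so just one of the two conflicting payments survives.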

Quantum leap into continuous numbers

Detail of The Garden of Earthly Delights (1515) by Hieronymus Bosch, left side pixelated by author.

Decentralized oracle networks stretch the concept of consensus. We should be wary of considering oracles “agreeing” on a value (via some aggregation method) as remotely similar to consensus on the validity of transactions (as is the case with existing blockchain-based consensus algorithms).

There is a leap in complexity when we go from dealing with discrete data (often, and even more simplistically: binary decisions) to dealing with real-valued, continuous data that doesn’t natively live on the blockchain.

Let’s consider an example.

Detecting malicious behaviour in a price feed

Price feeds are currently the most widely-used example of off-chain data consumed on-chain. Consider the common architecture of a price feed served by n oracles with some aggregation contract.
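For readers who have not seen this architecture, here is a minimal sketch of the aggregation step. It assumes the aggregation contract takes the median of the n responses, which is a common choice but only one of several possible aggregation methods; the oracle names and prices are made up.

```python
from statistics import median

def aggregate(responses):
    """Combine per-oracle responses into one reported value (median assumed)."""
    return median(responses.values())

# Hypothetical ETH-USD responses from n = 5 oracles.
responses = {
    "oracle_1": 1820.10,
    "oracle_2": 1820.45,
    "oracle_3": 1819.90,
    "oracle_4": 1821.00,
    "oracle_5": 1820.30,
}

price = aggregate(responses)  # the middle value of the five, 1820.30
```

Note that nothing in this step refers to a “true” price: the contract only ever sees the responses themselves.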

What defines malicious behaviour in such a system? One could say: misreporting the price. But what does that look like? If the data doesn’t live natively on the blockchain, how do we know there was a misreport?

If we attempt to detect malicious behaviour at the time of aggregation, the only information we can use is the oracle responses themselves. That is, we can only compare the oracles’ responses to each other: since they are reporting off-chain data, we cannot compare their responses to any on-chain “gold standard”.

So this leaves us with “outlier detection”. However, critically: an outlier does not imply a misbehaving node.

To quote from a previous blog post:

Oracles have an incentive to gather cheap and easily accessible data because nothing is enforcing or incentivizing them to do otherwise (since, again, they are not enforced in any way to report their sources). This creates something of a Schelling point around cheap and easily available data. Further, this makes staking difficult, if not impossible, in such systems because now high-quality, curated data sources become outliers.
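The scenario in the quote can be sketched as follows. The outlier rule used here (flag anything more than three median absolute deviations from the median) is an assumption chosen for illustration, not a rule taken from any particular oracle network, and all prices, including the “true” price, are hypothetical.

```python
from statistics import median

def flag_outliers(responses, threshold=3.0):
    """Flag responses further than `threshold` MADs from the median (assumed rule)."""
    values = list(responses.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return {name for name, v in responses.items() if abs(v - center) > threshold * mad}

# Four oracles read the same cheap, easily available source; one pays for curated data.
responses = {
    "cheap_1": 1820.0,
    "cheap_2": 1820.2,
    "cheap_3": 1819.9,
    "cheap_4": 1820.1,
    "curated": 1823.5,
}
true_price = 1823.4  # unknowable on-chain; included only to make the point

flagged = flag_outliers(responses)
most_accurate = min(responses, key=lambda name: abs(responses[name] - true_price))
```

Here the only flagged oracle is also the most accurate one. A staking scheme that slashes outliers would punish exactly the behaviour we want to encourage.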

Also, here are distributions of some recent pre-aggregated results in an ETH-USD price feed. Which oracles are misbehaving or malicious? Which responses are worthy of a high “reputation score”? The answer is: we don’t have enough information to tell.

Code used to generate these plots can be found here.

Further, all outlier detection techniques are ultimately a function of the spread of the data, so even with an outlier detection mechanism in place, a malicious oracle can still skew responses and remain undetected. One can imagine a potential attacker abusing the “fog of randomness” (the expectation of a certain amount of error, randomness, and variation with real-valued samples) in order to attack the system, i.e. skew the aggregate result. This is loosely related to a recent body of work called stealth attacks in cyber-physical systems.
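A small sketch of hiding in the “fog of randomness”, under the same assumptions as before: median aggregation and a hypothetical outlier rule based on the median absolute deviation. One oracle replaces its truthful reading with a value near the top of the honest spread.

```python
from statistics import median

def flag_outliers(responses, threshold=3.0):
    """Flag responses further than `threshold` MADs from the median (assumed rule)."""
    values = list(responses.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return {name for name, v in responses.items() if abs(v - center) > threshold * mad}

# Hypothetical honest responses, with ordinary variation between them.
honest = {"o1": 1819.8, "o2": 1820.0, "o3": 1820.2, "o4": 1820.4}

truthful = {**honest, "attacker": 1820.1}  # what the attacker actually observed
skewed = {**honest, "attacker": 1820.5}    # what the attacker chooses to report

truthful_price = median(truthful.values())  # 1820.1
skewed_price = median(skewed.values())      # 1820.2
flagged = flag_outliers(skewed)             # empty: the misreport is not an outlier
```

The aggregate moves upward while the detector stays silent, because the misreport sits comfortably inside the variation the detector has to tolerate.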

Further, one should be careful before considering such “minor” misreports as insignificant. Consider the case where the ETH price is close to triggering a large liquidation: a slight skew upwards or downwards in the reported price may trigger a liquidation that has a disproportionate effect.

The linearity and continuity in the mechanics of a system decide how sensitive it is to errors in input data.
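To see how discontinuous this can be, consider a hypothetical lending position that is liquidated when its collateral is worth less than 1.5 times its debt. The position size, the debt, and the 1.5 ratio are all made-up numbers for illustration.

```python
def is_liquidatable(collateral_eth, debt_usd, price, min_ratio=1.5):
    """A position can be liquidated once collateral value drops below min_ratio * debt."""
    return collateral_eth * price < min_ratio * debt_usd

collateral_eth = 100.0
debt_usd = 121_336.0  # liquidation price works out to 1.5 * 121336 / 100 = 1820.04

honest_price = 1820.10
skewed_price = 1820.00  # a downward skew of ten cents

safe_before = not is_liquidatable(collateral_eth, debt_usd, honest_price)
liquidated_after = is_liquidatable(collateral_eth, debt_usd, skewed_price)
relative_skew = (honest_price - skewed_price) / honest_price  # under 0.01%
```

A misreport of less than one hundredth of a percent flips the outcome for the entire position, which is why the size of the skew alone says little about its impact.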

There is also the much worse case where we have an attacker (or group of colluding attackers) that control a majority of nodes, in which case they can control the aggregate result, resulting in honest nodes having their stakes slashed.
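The majority case can be sketched the same way. As before, median aggregation and the deviation-based outlier rule are assumptions for illustration, and the oracle names and prices are hypothetical.

```python
from statistics import median

def flag_outliers(responses, threshold=3.0):
    """Flag responses further than `threshold` MADs from the median (assumed rule)."""
    values = list(responses.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return {name for name, v in responses.items() if abs(v - center) > threshold * mad}

# Three colluding oracles out of five report an inflated price
# with a little noise, so that they resemble independent honest reporters.
responses = {
    "honest_1": 1820.0,
    "honest_2": 1820.2,
    "colluder_1": 1899.9,
    "colluder_2": 1900.0,
    "colluder_3": 1900.1,
}

reported_price = median(responses.values())  # 1899.9, set by the colluders
to_be_slashed = flag_outliers(responses)     # the two honest oracles
```

Once the colluders hold a majority, they define what “normal” looks like, and any slashing rule based on agreement turns against the honest minority.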

This is not to suggest that it’s impossible to detect any malicious behaviour at response time. It’s just that there is a lot of nuance and complexity here, and getting nodes to arrive at a “consensus” regarding real-valued, continuous data is fundamentally different from arriving at consensus about a binary value (or a small set of discrete values), which is what consensus algorithms have dealt with up until this point in computer science history.

A time-based framework for detection

We can consider two possibilities for detecting malicious/poor behaviour in an oracle-serviced data feed: we can detect such behaviour (1) in the present or (2) in the future. Or, more finely-grained, we can choose to detect malicious behaviour at some point t ≥ 0, where t = 0 is the special case mentioned above. (Note: here we briefly ignore some of the nuance regarding “time” on the blockchain; it’s not yet relevant for the high-level discussion here.)

For the reasons mentioned above — in particular the point that outlier ≠ malicious — we believe that detecting malicious oracle behaviour at t=0 is ineffective, except in a few constrained scenarios.

In order to accommodate all of the rich data types that can be brought on-chain, punishing malicious behaviour requires a bespoke approach that can only be applied after the incident has occurred. That is, we take the t>0 approach to detection.

Conclusion and next blog post

When we go from decentralized consensus about a few discrete values to some sort of decentralized consensus on continuous, real values, we make a quantum leap in complexity and we aren’t really dealing with the same problem anymore. Such a discontinuous leap necessitates an entirely new approach.

Next week we will discuss Quantifiable Security and what it means for data feeds to be insured.

[See the next article in our series here.]

Footnotes

[1]: https://www.peercoin.net/whitepapers/peercoin-paper.pdf

[2]: https://bitcoin.org/bitcoin.pdf

[3]: Note that non-malicious behaviour can sometimes lead to these splits in the blockchain as well, due to general asynchrony in the network.

API3 is leading the movement from legacy third-party oracle networks to first-party oracle solutions that deliver more security, efficiency, regulatory compliance, and simplicity. Learn more: api3.org.

Saša Milić

crypto/blockchain researcher
