Blockchain Oracles: A Developer’s Perspective

Published in The Witnet Oracle Blog · 8 min read · Nov 14, 2023

A smart contract cannot natively read real-world data from the Internet. This inability of smart-contract-capable blockchains to access external data in a reliable and trustworthy manner has long been referred to in the blockchain literature as the “Oracle Problem”.

Although easily stated, the Oracle Problem has proven to have neither a simple nor a unique solution. Indeed, many different approaches have been developed over the past few years. These solutions mainly differ in how they tackle, or prioritize, factors such as the nature of the external data to be retrieved, the actual usage of such data, or the level of trust that end users are required to place in the solution.

Oracle solutions also differ in other factors and technical capabilities that non-expert users may not easily grasp, but that are highly relevant to smart contract developers. These are explained in more detail later in this article.

As a result, developers arriving in the world of Web3, decentralized blockchains and smart contracts may struggle to understand the differences and subtleties among the oracle solutions available today.

And while some developers may recklessly buy into the hype and adopt the first solution that comes to hand, others will prefer to understand which oracle technology best suits their requirements in terms of trustworthiness, liveness, finality, programmability and even long-term operational costs.

A simplified taxonomy

From a smart contract developer’s point of view, there are multiple layers of trust involved in determining whether external data brought on-chain from the outside world should be considered “valid”, or at least “good enough” for the circumstances and the time window in which a contract will use it:

Trustworthiness of the data producer: the entity that captures data from the real world.
Trustworthiness of the data publisher: the entity that publishes the data on the Internet.
Trustworthiness of the data reporter: the entity that retrieves data from the Internet at a certain moment in time, potentially transforms it according to some given rules, and ultimately pushes the result on-chain, into the context of a smart contract.
Reliability of the data integrity protocol: many solutions rely on some sort of off-chain consensus mechanism devised to detect, mitigate or even prevent corruption or censorship of the reported data.
Reliability of the blockchain consensus protocol: the protocol securing the chain on which the recipient smart contract lives.

“Trustworthiness” of natural or legal persons refers to how far they can be relied on to be honest and truthful in their role (i.e. producing, publishing or reporting data from the real world). “Reliability” of processes, or protocols, refers to how consistently they behave in the expected way.

In general terms, the trustworthiness of any smart contract that relies on data from the real world can be at most the product of the five quality factors listed above, so a single weak layer caps the whole.
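As a rough illustration of that upper bound, the sketch below multiplies five scores between 0 and 1, one per layer. The scores, their names, and the very idea of putting a number on trust are assumptions made here for illustration only; nothing in this article proposes a way to actually measure them.

```python
from math import prod

# Hypothetical scores in [0, 1], one per trust layer listed above.
# The numbers are invented for illustration; they measure nothing real.
layers = {
    "data_producer": 0.99,
    "data_publisher": 0.99,
    "data_reporter": 0.95,
    "data_integrity_protocol": 0.90,
    "blockchain_consensus": 0.999,
}


def trust_upper_bound(scores: dict[str, float]) -> float:
    """Return the product of all layer scores (an upper bound, not a measurement)."""
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {score}")
    return prod(scores.values())


bound = trust_upper_bound(layers)
weakest = min(layers.values())
print(f"upper bound: {bound:.3f}, weakest layer: {weakest:.3f}")
```

Because every factor is at most 1, the product can never exceed the weakest layer, which is why strengthening the sources alone does little if the integrity protocol or the chain itself is weak.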

So a smart contract relying on a single source would normally be less trustworthy than one relying on multiple sources. But it wouldn’t really matter how trustworthy those sources are if the underlying data integrity protocol (i.e. the oracle solution being used) doesn’t implement reliable mechanisms to protect against reporters who fake data or refuse to report it, or if the underlying blockchain is run by just a few permissioned nodes.

First-Party Oracles

Oracle solutions in which the same entity plays the role of both data publisher and data reporter are known as “First-Party Oracles”. In these solutions, the publisher sometimes also plays the role of data producer, but not necessarily. The data integrity protocol normally relies directly on the native consensus protocol of the smart contract chain where the external data is to be reported. All that any on-chain data consumer needs to check is whether the data was reported from some pre-authorized externally owned address.
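That single check can be modeled off-chain in a few lines. The following is a toy Python model, not contract code and not the interface of any real FPO product: the class name, method names and addresses are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FirstPartyFeed:
    """Toy model of a feed that only accepts pre-authorized reporter addresses."""

    authorized_reporters: set[str]
    latest: dict[str, int] = field(default_factory=dict)

    def report(self, sender: str, key: str, value: int) -> None:
        # The only integrity check performed: is the sender on the allowlist?
        if sender not in self.authorized_reporters:
            raise PermissionError(f"{sender} is not an authorized reporter")
        self.latest[key] = value

    def read(self, key: str) -> int:
        # Consumers get whatever the authorized reporter last wrote.
        return self.latest[key]


# Hypothetical addresses and feed key, for illustration only.
feed = FirstPartyFeed(authorized_reporters={"0xPublisher"})
feed.report("0xPublisher", "ETH/USD", 2050)
print(feed.read("ETH/USD"))
```

Note what the model does not check: whether the value is correct. Everything beyond the sender's identity is left to trust in the publisher.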

Some FPO solutions, like API3 or Pyth, mainly provide the middleware and infrastructure required for entities (potentially including you) to become First-Party Oracles on as many smart contract blockchains as they choose. They also provide standardized interfaces that let smart contract developers retrieve data allegedly reported by any of these data-producing entities.

Should there exist an FPO providing exactly the data that your smart contract needs, and you are certain that your potential user base will have no objection to trusting this FPO, congratulations! You found the perfect solution for your use case!

In most use-case scenarios, though, you may encounter several major drawbacks in FPO-based solutions:

Most probably, you won’t find an FPO providing exactly the data that your application requires.
In some cases you may find an FPO providing a piece of data from which you can infer the one your smart contract actually requires, but in a format that forces you to perform complex and expensive on-chain computations.
When no suitable FPO is found, you can always become a First-Party Oracle yourself by running and maintaining your own off-chain infrastructure. That only works as long as your users trust not only the data you provide, but also that you have the resources to keep reporting up-to-date data promptly and over the long term.

Third-Party Oracles

Third-Party Oracle solutions are those where the identities of the data publisher and the data reporter differ. In TPO solutions, the data consensus protocol usually takes place outside the execution environment of the smart blockchain where the data is expected to be reported.

For some applications, what the TPO solution reports is not an “objective” data value published on the Internet at a particular moment in time, but a “subjective” interpretation made by a limited population of third parties (e.g. humans, or potentially even AI-driven bots).

In other oracle solutions, the data being retrieved can actually be traced to public URLs (i.e. it is objective), but the consensus protocol may still require humans to pass judgement on the legitimacy of a previously reported value. In both of these cases, the solutions are known as “Subjective Third-Party Oracles”.

Oracle solutions where the reported data can be both derived from the source and reported into the context of a smart contract according to some given rules, with no intervention or judgement from human-like entities, are instead known as “Objective Third-Party Oracles”.

Subjective Third-Party Oracle Solutions

STPO solutions can be the only real option for particular application domains, like the on-chain resolution of disputes based on digital facts that are open to multiple interpretations. The Kleros framework is a clear example of a solution supporting this sort of use case.

Other design approaches let the application builder publish somewhere (either on-chain or off-chain) a description of the data that the smart contracts will need, but not where to fetch it from. Potential “reporters” may then take up the challenge of periodically reporting such data in exchange for some token fee.

In most oracle solutions of this kind, reporters may be penalized for reporting fake data, but usually not for refraining from reporting any data at all, or for failing to do so for whatever reason. Examples of this kind of STPO include Tellor, UMA and DIA, among many others.
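The asymmetry between faking and staying silent can be sketched as a toy settlement round. This is a simplified model written for this article's argument, not the actual mechanics of Tellor, UMA, DIA or any other project; the stake-and-slash scheme, names and amounts are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Reporter:
    name: str
    stake: int


def settle_round(
    reporters: list[Reporter],
    reports: dict[str, int],
    judged_fake: set[str],
    slash_amount: int,
) -> list[str]:
    """Slash reporters whose report was judged fake; return the names that stayed silent."""
    silent = []
    for reporter in reporters:
        if reporter.name not in reports:
            # No report at all: nothing to dispute, so no penalty is applied.
            silent.append(reporter.name)
        elif reporter.name in judged_fake:
            # Report disputed and judged fake: part of the stake is lost.
            reporter.stake = max(0, reporter.stake - slash_amount)
    return silent


# Hypothetical round: "bob" reports a fake value, "carol" reports nothing.
reporters = [Reporter("alice", 100), Reporter("bob", 100), Reporter("carol", 100)]
silent = settle_round(reporters, {"alice": 2050, "bob": 9999}, {"bob"}, slash_amount=40)
print(silent, [r.stake for r in reporters])
```

In this model the silent reporter keeps the full stake, which is exactly why liveness is hard to guarantee under such incentive schemes.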

The major drawbacks of STPOs are known to be:

Time-to-market: it may take some time between the moment a new data description is conceived and the moment a reliable population of active and committed reporters has “grown” around it.
Time-to-finality: in the order of days if data provenance is disputed.
Liveness of data provisioning: there may be no actual guarantees that enlisted reporters will keep providing customized data.
Subjectivity of “truth”: as it can ultimately depend on flawed human judgement, or misinterpretation.
PUSH-only solutions (in most cases): meaning that (a) your smart contract must trust and rely on off-chain workflows that are in charge of reporting data updates according to some established rules that most probably you won’t be able to control, and yet (b) your smart contract will have no means to PULL a new data update of its own accord.
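A consequence of point (b) is that a consumer of a PUSH-only feed cannot ask for a fresher value. One common defensive pattern is to check the age of the last pushed value and refuse to act on stale data. The sketch below models that check in Python; the type, function name and thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PushedValue:
    value: int
    updated_at: int  # Unix timestamp (seconds) of the last pushed update


def read_fresh(latest: PushedValue, now: int, max_age: int) -> int:
    """Return the last pushed value, or fail if it is older than max_age seconds."""
    age = now - latest.updated_at
    if age > max_age:
        # A PUSH-only consumer cannot trigger an update; it can only give up.
        raise RuntimeError(f"stale value: last update was {age}s ago")
    return latest.value


latest = PushedValue(value=2050, updated_at=1_700_000_000)
print(read_fresh(latest, now=1_700_000_030, max_age=60))
```

With a PULL-capable oracle, the failure branch could instead request a new update; with a PUSH-only one, waiting for the off-chain workflow is the only option.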

Objective Third-Party Oracle Solutions

OTPO solutions are more versatile, in that they let application builders decide on the actual sources from which to derive data, and on how to combine data read from multiple sources.

Data retrievals in OTPOs are deterministically specified as a combination of HTTP requests over public URLs, mainly referring to public REST API endpoints or digital assets published on the WWW. Some OTPOs also allow relying on private or premium data providers, at the cost of some degree of centralization at the data integrity layer.
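To make “deterministically specified” more concrete, here is a minimal sketch of a declarative request that names its sources, says where in each JSON response the value lives, and states how to combine the results. The spec format, URLs and field names are invented for illustration and do not follow any particular oracle's request language; the HTTP layer is replaced by canned responses so the example runs offline.

```python
from statistics import median
from typing import Callable

# Hypothetical request spec: three made-up sources and an aggregation rule.
REQUEST = {
    "sources": [
        {"url": "https://api.example-a.test/price", "path": ["data", "price"]},
        {"url": "https://api.example-b.test/ticker", "path": ["last"]},
        {"url": "https://api.example-c.test/quote", "path": ["quote", "usd"]},
    ],
    "aggregate": "median",
}


def extract(payload: dict, path: list[str]) -> float:
    """Walk a JSON-like dict along `path` and return the value found as a float."""
    node = payload
    for key in path:
        node = node[key]
    return float(node)


def resolve(request: dict, fetch_json: Callable[[str], dict]) -> float:
    """Fetch every source, extract one value from each, then aggregate them."""
    values = [extract(fetch_json(s["url"]), s["path"]) for s in request["sources"]]
    if request["aggregate"] == "median":
        return median(values)
    raise ValueError(f"unsupported aggregation: {request['aggregate']}")


# Canned responses standing in for real HTTP calls (the third one is an outlier).
CANNED = {
    "https://api.example-a.test/price": {"data": {"price": "100.0"}},
    "https://api.example-b.test/ticker": {"last": 101.0},
    "https://api.example-c.test/quote": {"quote": {"usd": 250.0}},
}

result = resolve(REQUEST, CANNED.__getitem__)
print(result)
```

Using a median rather than a mean is a design choice made in this sketch: with three sources, one outlier cannot move the result on its own.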

Generally speaking, almost all OTPOs can provide unpredictable and non-malleable randomness to smart contracts.

But there are three main features that not all Third-Party Oracles provide, the lack of which could turn out to be quite an impediment for an increasing number of Web3 application domains:

The ability to pull data updates based on smart contract rules, named by some as PULL/PUSH oracles (as opposed to PUSH-only oracles that don’t support this feature). An example of this could be an ERC-721 contract that wants to verify that some WWW asset actually exists before a new NFT gets minted.
The ability to parameterize data sources, where data providers remain the same but the specs of the actual data to retrieve are determined by smart contract rules. An example would be a smart contract that needs to verify the status of a pull request in a given GitHub repository (i.e. the parameterized element here would be the URL containing the ID of a PR that is only known at runtime, so the smart contract requesting the data needs to be able to interpolate that variable into the request).
On-chain data traceability: enabling your smart contract to provide probabilistic certainty about the integrity of the data (where the data actually came from at the time it was reported, the off-chain computations applied to it at that moment, or both).
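The second and third features can be sketched together. Below, a URL template modeled on the shape of GitHub's REST endpoint for pull requests is filled in at runtime, and a digest binds the exact request to the reported result so that either can later be checked against it. The function names, the request format, the owner and repository names and the digest scheme are assumptions for illustration, not the mechanism of any specific oracle.

```python
import hashlib
import json
from string import Template

# The data provider stays fixed; only the template parameters vary per request.
PR_URL_TEMPLATE = Template("https://api.github.com/repos/$owner/$repo/pulls/$number")


def build_request(owner: str, repo: str, number: int) -> dict:
    """Interpolate runtime parameters into the fixed request template."""
    url = PR_URL_TEMPLATE.substitute(owner=owner, repo=repo, number=number)
    return {"url": url, "path": ["state"]}


def trace_digest(request: dict, result: str) -> str:
    """Hash the request together with its result into one traceability record."""
    record = json.dumps({"request": request, "result": result}, sort_keys=True)
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


# Hypothetical repository and PR number, known only at runtime.
request = build_request("some-org", "some-repo", 42)
digest = trace_digest(request, "closed")
print(request["url"])
print(digest)
```

Anyone holding the same request and result can recompute the digest and compare it with the stored one, which is the kind of evidence a smart contract could expose about where its data came from.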

There are many other differentiating traits that may, or may not, be relevant to a smart contract developer choosing which oracle solution to adopt. To help with that decision, the following table compares the most relevant Objective Third-Party Oracles according to the DefiLlama Oracles Observatory (as of November 2023):

[Comparison table, hosted on Airtable (airtable.com)]

In terms of “trustability” alone, the following chart summarizes the linked table above:

Summary and final tips

When building any sort of “decentralized” application that relies on data from the real world, and deciding which blockchain oracle solution best suits your goals, there are basically three major questions to answer:

What are the most trustworthy sources that can provide all required data?
Do you need your smart contract to initiate the request for data updates?
Whom will users be obliged to trust when using your smart contracts?

At the end of the day, for an ever-increasing population of crypto natives, it is this last question that should matter the most. Don’t you think?