Ethereum, Google Cloud, and iExec: The Infrastructure for Decentralized Data-Driven Applications
In this blog post, I’ll briefly demonstrate how a smart contract platform (Ethereum) can interoperate with Google Cloud (BigQuery) via confidential computing middleware (iExec). At a high level, Ethereum Dapps (i.e. smart contract applications) run tasks on iExec, which retrieves data from BigQuery and processes it. Any of BigQuery’s public datasets can be accessed in this decentralized way: an iExec worker in a worker pool processes the data, and a smart contract is updated with the result.
If you’d just like to play with the demo instead of reading the technical ramblings of someone with very little sleep, here is the URL:
The Dapp is straightforward since it’s coded in Python. It uses the Google Cloud Library and takes up to 10 user arguments. The input gets filtered to prevent any errors and then BigQuery is queried with a custom SQL statement based on the input. The result is saved as a CSV file and is uploaded to IPFS. The IPFS URI is then returned on-chain for the user to access and view. While it’s possible to encrypt the results for the requester, no encryption is needed in this demo.
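The flow described above (filter the arguments, build the SQL, save a CSV) can be sketched roughly as follows. This is a minimal illustration, not the Dapp's actual source: the table name, the column names, and the symbol format are all assumptions, and the IPFS upload is left out. The run_query function uses the google-cloud-bigquery client and needs that package plus credentials, so it is imported lazily and the other helpers work offline.

```python
import csv
import io
import re

MAX_ARGS = 10  # the post says the Dapp takes up to 10 user arguments
SYMBOL_RE = re.compile(r"^[A-Za-z0-9]{1,10}$")  # assumed shape of a coin symbol

# Hypothetical table name; the post does not give the real one.
TABLE = "my-project.coins.daily_prices"


def filter_args(raw_args):
    """Keep at most MAX_ARGS well-formed symbols, upper-cased, without duplicates."""
    clean = []
    for arg in raw_args[:MAX_ARGS]:
        symbol = arg.strip().upper()
        if SYMBOL_RE.match(symbol) and symbol not in clean:
            clean.append(symbol)
    return clean


def build_query():
    """Return SQL that receives the symbols as a query parameter, not pasted text."""
    return (
        f"SELECT symbol, date, price FROM `{TABLE}` "
        "WHERE symbol IN UNNEST(@symbols) ORDER BY date, symbol"
    )


def rows_to_csv(header, rows):
    """Serialize result rows to CSV text, ready to be written to a file."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def run_query(symbols):
    """Query BigQuery; requires google-cloud-bigquery and valid credentials."""
    from google.cloud import bigquery  # lazy import so the helpers above run offline

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ArrayQueryParameter("symbols", "STRING", symbols)]
    )
    result = client.query(build_query(), job_config=config).result()
    return [tuple(row.values()) for row in result]
```

Passing the symbols as a query parameter rather than concatenating them into the SQL string is one way to make the filtering step less fragile; whether the demo does this is not stated in the post.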
One unique thing about this Dapp is that it needs to run within a trusted execution environment (TEE) in order to securely read an encrypted API key that grants access to a BigQuery dataset. You may be wondering how you can share something as secret as an API key in a public, decentralized way. The short answer is with Intel SGX.
An iExec dataset is just another term for an encrypted ZIP file. In this case, a Google Cloud API key with read-only access to a private BigQuery dataset is encrypted and then uploaded to any file hosting service. The key to the encrypted file is transferred to the iExec Secret Management Service (SMS) over a Transport Layer Security (TLS) channel. When a requester (you) sends an on-chain order that triggers the secure application to start, the secret of the iExec dataset provider (me) is written into a temporary session and sent over TLS to a dedicated Intel SGX attestation service enclave. That enclave is responsible for communicating with the final application enclave, which runs on an iExec worker with an SGX-capable Intel processor in the TEE worker pool that executes the task. If the application enclave is proven to be legitimate, it receives the secret, i.e. the decryption key for the ZIP file that holds the Google Cloud API key.
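The core idea, a secret that is only released to an application that proves its identity, can be modeled in a few lines. The sketch below is a toy: ToySecretStore and fingerprint are hypothetical names, a plain SHA-256 hash stands in for an enclave measurement, and nothing here performs real SGX attestation or talks to the iExec SMS.

```python
import hashlib
import hmac


class ToySecretStore:
    """Stand-in for the SMS: holds a dataset key and the app fingerprint allowed to get it."""

    def __init__(self):
        self._secrets = {}

    def push_secret(self, dataset_id, key, allowed_app_fingerprint):
        """Dataset provider registers the decryption key and the trusted app."""
        self._secrets[dataset_id] = (key, allowed_app_fingerprint)

    def release(self, dataset_id, reported_fingerprint):
        """Return the key only if the reported fingerprint matches the trusted one."""
        key, allowed = self._secrets[dataset_id]
        # Constant-time comparison; real attestation verifies a signed quote instead.
        if hmac.compare_digest(allowed, reported_fingerprint):
            return key
        return None


def fingerprint(app_code: bytes) -> str:
    """Hash of the application code, standing in for an enclave measurement."""
    return hashlib.sha256(app_code).hexdigest()
```

In this model, an application whose code differs by even one byte reports a different fingerprint and gets nothing back, which is the property the real attestation flow is meant to provide.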
Technically the entire historical coin price dataset could be stored as the iExec dataset but that may require constant maintenance and possibly massive files being uploaded every update. By using BigQuery as a dataset provider the entire process is streamlined for the least amount of friction by simply using a tiny API key as the iExec encrypted dataset. Since the dataset is on the iExec marketplace, the creator can set any price to use it and can limit which Dapps or worker pools have access to it.
The BigQuery private dataset is first populated with historical coin data directly from CoinMarketCap’s API then updated every day with new data.
The neat thing about BigQuery is that all customers get 10 GB of storage and up to 1 TB worth of queries a month, completely free of charge. This works out to be around 100,000 free queries a month for this Dapp. Every result also includes a receipt that's generated within the Dapp that gives a readout of the query details. If you’re confused about what BigQuery is, just think of it as a massive database of general data you can access online if you have the right key.
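The 100,000 figure can be checked with quick arithmetic. This rough sketch assumes each query is billed at about 10 MB of processed data; that per-query size is an assumption made here to reproduce the number, since the post does not state it.

```python
FREE_QUERY_BYTES_PER_MONTH = 10**12    # 1 TB of free queries a month (from the post)
ASSUMED_BYTES_PER_QUERY = 10 * 10**6   # 10 MB billed per query (assumption)


def free_queries_per_month(bytes_per_query=ASSUMED_BYTES_PER_QUERY):
    """Number of whole queries that fit in the monthly free allowance."""
    return FREE_QUERY_BYTES_PER_MONTH // bytes_per_query


print(free_queries_per_month())  # 100000
```

A query that scans more data per run would lower the count proportionally, so the figure is best read as an upper bound for small queries.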
The front-end was probably the most work out of everything. It was written mainly with D3.js to build the charts and tables. Without the front-end, I could've just linked to the IPFS upload, but that would be boring.
The bar chart race is pretty straightforward. Click play and watch the race unfold. The gains chart is a simplified version of an analysis tool I wrote to compare the returns of various coins. The blue date/line represents the past, the white is the present, and the future is green. Clicking anywhere on the chart moves the past to the present. Within every coin info box, you’ll see the current slider price in white, the gains from the past date to the present in blue, and the gains from the present to the future date in green. Just move the white bar around and you’ll get a general idea of how it works.
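The two numbers shown in each coin info box can be thought of as simple percentage returns. The sketch below assumes that definition (the post does not give the formula) and uses a hypothetical function name; the real front-end is written in JavaScript with D3.js, so this is only an illustration of the arithmetic.

```python
def gains(past_price, present_price, future_price):
    """Return (past-to-present, present-to-future) gains as percentages."""
    if past_price <= 0 or present_price <= 0:
        raise ValueError("prices must be positive")
    past_to_present = (present_price - past_price) / past_price * 100
    present_to_future = (future_price - present_price) / present_price * 100
    return past_to_present, present_to_future


# Blue (past) = 100, white (present) = 150, green (future) = 300
print(gains(100, 150, 300))  # (50.0, 100.0)
```

Moving the white bar changes present_price, which is why both numbers update at once as the slider moves.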
By leveraging BigQuery with current iExec technology, anyone can easily update an on-chain smart contract with off-chain data in a secure way. This interoperability technique that iExec facilitates may lead developers to create hybrid applications that take the best of what smart contract platforms and decentralized cloud computing platforms have to offer. There are over 200 free BigQuery datasets with terabytes of data just begging to be analyzed and deployed in use cases ranging from off-chain ML training to on-chain prediction marketplaces. There is also good news for people who are looking into BigQuery for off-chain confidential cloud computing but are concerned about high Ethereum gas prices and are waiting for layer 2 scaling: iExec has its own sidechain with no gas fees.
Hopefully, this basic demonstration opens your imagination to a multitude of possibilities that may change the way people interact with data-driven smart contracts on Ethereum. This is decentralized, privacy-preserving technology that is currently deployed on Ethereum mainnet and ready to be used by enterprises.
Note: I received no compensation from any project or company mentioned in this blog post for this demonstration. Everything was coded and prepared in my free time. All information is for educational purposes only.