Nansen.ai
Published in

Nansen.ai

Ethereum 2.0 ETL and Medalla Data in Google BigQuery

In this article we’ll cover our Ethereum 2.0 ETL tools for exporting Ethereum 2.0 blockchain data, the public Medalla dataset in BigQuery, and a Nansen.ai dashboard we built with some interesting charts and tables.

The article is broken down into three parts:

  1. A quickstart guide for Ethereum 2.0 ETL. The tools allow you to export beacon blocks, attestations, deposits, slashings, voluntary exits, validators, and validator committees.
  2. Some sample queries for BigQuery.
  3. Visualisations in Nansen.

Now, let’s go through the details.

Ethereum 2.0 ETL Quickstart Guide

The easiest way to get started is to request access to a Medalla node on Infura. Scroll to the bottom of this page until you see “apply now and register your interest” link. Click on that link and fill out the form to request access.

Install ethereum2-etl:

pip install ethereum2-etl

Export beacon blocks, attestations, deposits, proposer slashings, attester slashings, and voluntary exits:

ethereum2etl export_beacon_blocks --start-block 0 --end-block 200 \
--output-dir output --output-format json --rate-limit 5 \
--provider-uri https://projectid:secret@medalla.infura.io

Make sure to replace projectid and secret with your Infura project id and secret. The output will be written to the ./output directory.

For the--output-format option you can use either json or csv. JSON output is useful for data storages that support nested and repeated structures, such as Google BigQuery and AWS Athena.

You can find the json and csv schema in the documentation: JSON Schema, CSV Schema, Reference. Here is an example beacon_blocks.json schema:

Other commands — export_beacon_validators and export_beacon_committees — allow you to export validators and committees respectively.

To schedule daily exports of the data and loads into Google BigQuery you can use our Ethereum 2.0 ETL Airflow repository. Or you can access our public datasets instead.

Medalla BigQuery Dataset

Continue reading this article on research.nansen.ai.

--

--

--

Nansen analyzes 50M+ labeled Ethereum wallets and their activity. So you can separate the signal from the noise in blockchain data.

Recommended from Medium

About Firefly

Understanding Ethereum

Waves Coin Price Predictions 2020 & Beyond

How Blockchain Technology Works

OpenSea FAQ

Smart contracts will make ransomware more profitable, part 1

Shroom Community Update #1

New finance project — Goldfinch

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Evgeny Medvedev

Evgeny Medvedev

Creator of https://github.com/blockchain-etl, Co-founder of https://d5.ai and https://nansen.ai, Google Cloud GDE, AWS Certified Solutions Architect

More from Medium

Triggering a Cloud Function at a specified time using Cloud Scheduler in GCP.

Building a Data Lake on Google Cloud Platform

Mercado Libre Goes Big(Query), Plus Cloud Moneyball and Plugging a $300B Retail Search Black Hole

How can engineering interns contribute to a data science team?

Students in class coding on their computers