Ethereum 2.0 ETL and Medalla Data in Google BigQuery

Evgeny Medvedev
Nansen.ai
Published in
2 min readOct 16, 2020

--

In this article we’ll cover our Ethereum 2.0 ETL tools for exporting Ethereum 2.0 blockchain data, the public Medalla dataset in BigQuery, and a Nansen.ai dashboard we built with some interesting charts and tables.

The article is broken down into three parts:

  1. A quickstart guide for Ethereum 2.0 ETL. The tools allow you to export beacon blocks, attestations, deposits, slashings, voluntary exits, validators, and validator committees.
  2. Some sample queries for BigQuery.
  3. Visualisations in Nansen.

Now, let’s go through the details.

Ethereum 2.0 ETL Quickstart Guide

The easiest way to get started is to request access to a Medalla node on Infura. Scroll to the bottom of this page until you see “apply now and register your interest” link. Click on that link and fill out the form to request access.

Install ethereum2-etl:

pip install ethereum2-etl

Export beacon blocks, attestations, deposits, proposer slashings, attester slashings, and voluntary exits:

ethereum2etl export_beacon_blocks --start-block 0 --end-block 200 \
--output-dir output --output-format json --rate-limit 5 \
--provider-uri https://projectid:secret@medalla.infura.io

Make sure to replace projectid and secret with your Infura project id and secret. The output will be written to the ./output directory.

For the--output-format option you can use either json or csv. JSON output is useful for data storages that support nested and repeated structures, such as Google BigQuery and AWS Athena.

You can find the json and csv schema in the documentation: JSON Schema, CSV Schema, Reference. Here is an example beacon_blocks.json schema:

Other commands — export_beacon_validators and export_beacon_committees — allow you to export validators and committees respectively.

To schedule daily exports of the data and loads into Google BigQuery you can use our Ethereum 2.0 ETL Airflow repository. Or you can access our public datasets instead.

Medalla BigQuery Dataset

Continue reading this article on research.nansen.ai.

--

--