Indexing/dashboarding billions of bitcoin transactions with Elasticsearch and Kibana

Fabien Stepho
Jan 29, 2018 · 4 min read
https://www.object-ive.com

Bitcoin transactions have become very expensive since the end of 2017, with fees rising as high as $55 as demand increased considerably. So I wanted to analyze what was happening on the Bitcoin network.
This story describes how I proceeded to analyze billions of bitcoin transactions.

The tools and frameworks I wanted to use for this are, apart from the cloud platform, all open source:

Bcoin — Javascript bitcoin library for node.js
Elasticsearch — RESTful, distributed search & analytics
Kibana — Tool to explore, visualize, and discover data
Docker — Open platform for developers and sysadmins to build, ship, and run distributed applications
Microsoft Azure — Cloud platform

Here are the steps I went through to accomplish my goal:

Step 1 — Installing the cloud infrastructure
Step 2 — Running and syncing a full bitcoin node with the complete blockchain data
Step 3 — Extracting bitcoin data from the local blockchain
Step 4 — Indexing the data into Elasticsearch
Step 5 — Visualizing and analysing data

The first step was to set up the platform. A virtual machine with at least 500 GB of disk space is required to hold the blockchain data and the Elasticsearch indexes.

[Image: overview of the Azure cloud portal interface]

Docker is a must-have for packaging and running isolated containers.
My Docker Compose definition consists mainly of the bitcoin node container, the Elasticsearch container and the Kibana container:

Docker compose definition file

In this definition, the data is not stored inside the containers but outside, on the VM host, through Docker's “volumes” feature. As a result, the generated data is not lost when containers are restarted or deleted.

Docker images and containers are built and started with the command docker-compose up. You can list the running containers on the host VM with the command docker ps.

[Image: Running Docker containers]

I chose to run the open source bcoin library.
The project is hosted on GitHub: https://github.com/bcoin-org/bcoin. I first tried the Bitcore library, but I ran into performance issues while syncing.
Syncing the blockchain data was slow and lasted several days, so I decided to switch to the bcoin library, on top of which I could also develop my own Node.js code (steps 3 and 4).

[Image: Bitcoin node synchronizing data]

Thanks to the bcoin API, I wrote some code to retrieve bitcoin transaction data from the running full node.
The API function I used is client.getBlock(height). Once I had extracted the data, I pushed it to the Elasticsearch indexes.

Node js Bcoin code
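
The extraction step can be sketched roughly as follows. This is not the code from the repository, only a minimal illustration of the idea: fetch one block per height with client.getBlock(height), then flatten it into three lists of documents. The block shape (txs, inputs, outputs, prevout, coin), the document field names, the all-zero hash test for coinbase inputs and the helper names flattenBlock and extractRange are all assumptions made for the example.

```javascript
// Sketch only: the block shape used here is an assumption modelled on the
// JSON a bcoin node returns, not a verified schema.
const NULL_HASH = '0'.repeat(64); // assumed prevout hash of a coinbase input

// Turn one block into three flat lists, one per Elasticsearch index.
function flattenBlock(block) {
  const transactions = [];
  const inputs = [];
  const outputs = [];

  for (const tx of block.txs) {
    // A transaction is treated as coinbase when all its inputs spend the null hash.
    const coinbase = tx.inputs.length > 0 &&
      tx.inputs.every((input) => input.prevout.hash === NULL_HASH);
    // Total amount moved by the transaction = sum of its output values.
    const amount = tx.outputs.reduce((sum, output) => sum + output.value, 0);
    const common = { blockHeight: block.height, blockHash: block.hash, txHash: tx.hash };

    transactions.push({ ...common, fee: tx.fee, amount, coinbase });

    tx.inputs.forEach((input, index) => {
      inputs.push({
        ...common,
        index,
        coinbase,
        // Coinbase inputs have no previous coin, hence the fallbacks.
        address: input.coin ? input.coin.address : null,
        amount: input.coin ? input.coin.value : 0,
      });
    });

    tx.outputs.forEach((output, index) => {
      outputs.push({ ...common, index, address: output.address || null, amount: output.value });
    });
  }

  return { transactions, inputs, outputs };
}

// Hypothetical driver: `client` is any object whose getBlock(height)
// returns a Promise of a block; `onBlock` receives the flattened lists.
async function extractRange(client, fromHeight, toHeight, onBlock) {
  for (let height = fromHeight; height <= toHeight; height++) {
    const block = await client.getBlock(height);
    await onBlock(flattenBlock(block));
  }
}
```

Keeping flattenBlock free of any network call makes it easy to check against a hand-written block before pointing extractRange at a real node.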

I created three indexes to hold the data: one for transaction outputs, one for transaction inputs and one for the transactions themselves. Each index contains fields such as address, block height, block hash, fee, amount and coinbase.

[Image: Elasticsearch indexes]
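
To make the three-index layout more concrete, here is a small sketch of what the field definitions could look like. The field list comes from the description above, but the exact field names, the chosen Elasticsearch types and the helper buildProperties are assumptions, not the mappings actually used. Amounts and fees are assumed to be integer satoshis. Where this properties object is nested inside the index creation request depends on the Elasticsearch version.

```javascript
// Assumed field types; amounts are taken to be integer satoshis.
const commonProperties = {
  blockHeight: { type: 'long' },
  blockHash: { type: 'keyword' }, // exact-match identifiers, never analyzed as text
  txHash: { type: 'keyword' },
  amount: { type: 'long' },
  coinbase: { type: 'boolean' },
};

// Returns the "properties" section for one of the three hypothetical index kinds.
function buildProperties(kind) {
  if (kind === 'transaction') {
    return { ...commonProperties, fee: { type: 'long' } };
  }
  if (kind === 'input' || kind === 'output') {
    return {
      ...commonProperties,
      address: { type: 'keyword' },
      index: { type: 'integer' }, // position of the input/output in its transaction
    };
  }
  throw new Error(`unknown index kind: ${kind}`);
}
```

Declaring addresses and hashes as keyword fields keeps them usable for exact filtering and for "top N" aggregations in Kibana.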

The code that pushes the data into Elasticsearch uses the JavaScript Elasticsearch client and its bulk indexing feature. The bulk API lets me send multiple index requests in a single call. This is particularly useful since I need to index a lot of transactions, which can be queued up and indexed in batches of thousands (100,000 in my case).

Here is the code for indexing:

Elasticsearch bulk indexing
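
As a rough illustration of the batching idea (again a sketch, not the code from the repository), the documents can be cut into fixed-size batches and each batch turned into a bulk request body, which alternates one action line with one document. The helper names chunk, toBulkBody and indexAll are made up for this example, and client stands for any object exposing a bulk({ body }) method that resolves to the bulk response, as the JavaScript Elasticsearch client is assumed to do here. Elasticsearch 6.x and earlier are also expected to want a _type in the action line, which is why it is an optional argument.

```javascript
// Split a list into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build a bulk body: one "index" action line followed by the document itself.
function toBulkBody(indexName, docs, type) {
  const body = [];
  for (const doc of docs) {
    const action = { _index: indexName };
    if (type) action._type = type; // only for versions that still use types
    body.push({ index: action });
    body.push(doc);
  }
  return body;
}

// Hypothetical driver: sends the batches one after the other.
async function indexAll(client, indexName, docs, batchSize = 100000) {
  for (const batch of chunk(docs, batchSize)) {
    const response = await client.bulk({ body: toBulkBody(indexName, batch) });
    // Assumption: the resolved value carries the bulk API's `errors` flag.
    if (response && response.errors) {
      throw new Error(`bulk request reported errors for index ${indexName}`);
    }
  }
}
```

Sending batches sequentially keeps memory use and load on the cluster predictable; the batch size is a trade-off between request size and the number of round trips.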

Once it was running, I waited several hours until all 500,000 blocks were fully indexed:

[Image: Bulk indexing processing result]

Once the data was indexed, I pointed Kibana, the visualization tool of the Elastic Stack, at the indexes:

[Image: Discovering transaction data with Kibana]

I defined several visualizations and created a simple dashboard with metrics like “Total transaction fees”, “Number of unspent transaction outputs”, “Miner distribution” and “Most popular addresses”.

[Image: Kibana dashboard]
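
Kibana builds its queries itself, but it can help to see what two of these metrics roughly correspond to in Elasticsearch terms. The sketch below builds a sum aggregation for “Total transaction fees” and a terms aggregation for “Most popular addresses”. The field names fee and address and the function names are assumptions for the example; the request bodies use the standard aggregation syntax and would be sent to the search endpoint of the relevant index.

```javascript
// Sum of the (assumed) `fee` field over all transaction documents.
function totalFeesQuery() {
  return {
    size: 0, // no hits needed, only the aggregation result
    aggs: { total_fees: { sum: { field: 'fee' } } },
  };
}

// The `limit` most frequent values of the (assumed) `address` field.
function topAddressesQuery(limit = 10) {
  return {
    size: 0,
    aggs: { top_addresses: { terms: { field: 'address', size: limit } } },
  };
}
```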

With these five steps done, I can now do things like:

  • Transaction monitoring
  • Complex querying
  • Exploring anomalies with machine learning
  • Analyzing transaction relationships with Graph

My next story will cover these use cases. Stay tuned!

Note: The source code is freely available in my GitHub repository: https://github.com/fstepho/bcoin-es

