Bitcoin transactions became very expensive toward the end of 2017, with fees rising as high as $55 as demand increased considerably. So I wanted to analyze what was happening on the bitcoin network. This story describes how I went about analyzing billions of bitcoin transactions.
The tools and frameworks I used for this are all open source and include Docker, bcoin, Node.js, Elasticsearch and Kibana.
Here are the steps I followed to accomplish my goal:
Step 1 — Installing the cloud infrastructure
Step 2 — Running and syncing a full bitcoin node with the complete blockchain data
Step 3 — Extracting bitcoin data from the local blockchain
Step 4 — Indexing the data into Elasticsearch
Step 5 — Visualizing and analysing data
Step 1 — Installing and defining the cloud infrastructure
The first step was to set up the platform. A virtual machine with at least 500 GB of disk space is required to hold the blockchain data and the Elasticsearch indexes.
Here is an overview of the Azure cloud portal interface:
Docker is a must-have for packaging and running isolated containers. My Docker Compose definition consists mainly of three containers: the bitcoin node, Elasticsearch and Kibana.
The data is not stored inside the containers but outside, on the VM host, through Docker's “volumes” feature. That way the generated data is not lost when containers are restarted or deleted.
Docker images and containers are built and started with the command: docker-compose up. You can list the running containers on the host VM with the command: docker ps
Step 2 — Running and syncing a full bitcoin node with the complete blockchain data
I chose to run the bcoin open source library. The project is hosted on GitHub: https://github.com/bcoin-org/bcoin. I first tried the Bitcore library, but I ran into performance issues while syncing: syncing the blockchain data was slow and took several days. So I decided to switch to bcoin, against which I can write code in Node.js (steps 3 and 4).
Step 3 — Extracting bitcoin data from the local blockchain
Using the bcoin API, I wrote some code to retrieve bitcoin transaction data from the running full node. The API function I used is client.getBlock(height). Once I had extracted the data, I pushed it into an Elasticsearch index.
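To give an idea of what such an extraction loop can look like, here is a minimal sketch, not the exact code from the repository. It assumes only that client exposes the getBlock(height) call mentioned above. The name fetchBlocks and the choice to stop when the call returns nothing are illustrative assumptions.

```javascript
// Sketch only: walks a range of block heights and yields each block.
// `client` is any object with an async getBlock(height) method.
// Assumption: a missing block is reported as null/undefined, which we
// treat as "nothing more to read".
async function* fetchBlocks(client, startHeight, endHeight) {
  for (let height = startHeight; height <= endHeight; height++) {
    const block = await client.getBlock(height);
    if (!block) {
      break; // no block at this height, stop iterating
    }
    yield block;
  }
}
```

Blocks can then be consumed one at a time with a for await loop, which keeps memory usage flat even over hundreds of thousands of heights. With a real node, client would be the bcoin node client set up with the node's connection details, which are left out here.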
Step 4 — Indexing the data into Elasticsearch
I created three indexes to hold the data: one for transaction outputs, one for transaction inputs and one for the transactions themselves. Each index contains fields such as address, block height, block hash, fee, amount and coinbase.
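As an illustration of this split, the sketch below flattens one block into three lists of documents, one per index. The block shape (hash, height, txs, inputs with prevout and coin, outputs with value and address) and the document field names are assumptions made for the example, loosely modelled on the JSON that bcoin returns. They are not the mappings from the repository. The coinbase test relies on the protocol rule that a coinbase input references an all-zero previous transaction hash.

```javascript
// Hypothetical input shape:
// { hash, height, txs: [{ hash, fee, inputs: [{ prevout: { hash, index }, coin }],
//                         outputs: [{ value, address }] }] }
const NULL_HASH = '0'.repeat(64);

// A coinbase transaction has a single input pointing at the null hash.
function isCoinbase(tx) {
  return tx.inputs.length === 1 && tx.inputs[0].prevout.hash === NULL_HASH;
}

// Flattens a block into documents for the three indexes.
function blockToDocuments(block) {
  const base = { blockHeight: block.height, blockHash: block.hash };
  const docs = { transactions: [], inputs: [], outputs: [] };

  for (const tx of block.txs) {
    const coinbase = isCoinbase(tx);
    docs.transactions.push({ ...base, txHash: tx.hash, fee: tx.fee, coinbase });

    tx.inputs.forEach((input, index) => {
      if (coinbase) return; // a coinbase input spends no previous output
      docs.inputs.push({
        ...base,
        txHash: tx.hash,
        index,
        // `coin` describes the output being spent; it may be unavailable.
        address: input.coin ? input.coin.address : null,
        amount: input.coin ? input.coin.value : null,
      });
    });

    tx.outputs.forEach((output, index) => {
      docs.outputs.push({
        ...base,
        txHash: tx.hash,
        index,
        address: output.address,
        amount: output.value,
        coinbase,
      });
    });
  }
  return docs;
}
```

Copying the block height and block hash onto every document costs some disk space, but it lets each index be queried on its own without joins, which Elasticsearch does not do well.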
Here is the code for indexing:
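The following is a simplified sketch of one way to prepare documents for indexing, not the repository code itself. It assumes the documents are sent through Elasticsearch's bulk endpoint, which expects newline-delimited JSON: one action line, then one document line, with a final newline. The helper name toBulkBody, the index name and the id scheme are hypothetical. The action line is written for recent Elasticsearch versions; older releases also expect a _type entry.

```javascript
// Builds the newline-delimited JSON payload expected by the _bulk endpoint.
// indexName: target index; docs: array of plain objects;
// idOf: function returning a stable document id, so that re-running the
// indexer overwrites documents instead of duplicating them.
function toBulkBody(indexName, docs, idOf) {
  const lines = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: indexName, _id: idOf(doc) } }));
    lines.push(JSON.stringify(doc));
  }
  // The bulk API requires the payload to end with a newline.
  return lines.length > 0 ? lines.join('\n') + '\n' : '';
}
```

Sending a few thousand documents per request rather than one at a time greatly reduces the number of HTTP round trips, which matters when the chain holds hundreds of thousands of blocks.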
Once it was running, I waited several hours until all 500,000 blocks were fully indexed:
Step 5 — Visualizing and analysing data
Once the data was indexed, I pointed Kibana, from the Elastic Stack, at the indexes:
I defined several visualizations and created a simple dashboard with metrics like “Total transaction fees”, “Number of unspent transaction outputs”, “miners repartition” and “Most popular addresses”.
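Kibana builds these visualizations through its interface, but under the hood they come down to Elasticsearch aggregations. The sketch below shows what the request bodies for three of these metrics could look like. The field names fee, address and coinbase are assumptions, and address is assumed to be mapped as a keyword field so that it can be used in a terms aggregation.

```javascript
// "Total transaction fees": sum of the fee field over all transactions.
function totalFeesQuery() {
  return { size: 0, aggs: { total_fees: { sum: { field: 'fee' } } } };
}

// "Most popular addresses": addresses appearing in the most documents.
function topAddressesQuery(limit = 10) {
  return {
    size: 0,
    aggs: { top_addresses: { terms: { field: 'address', size: limit } } },
  };
}

// Distribution of miners: same aggregation, restricted to coinbase outputs.
function minerDistributionQuery(limit = 10) {
  return {
    size: 0,
    query: { term: { coinbase: true } },
    aggs: { miners: { terms: { field: 'address', size: limit } } },
  };
}
```

Setting size to 0 asks Elasticsearch to return only the aggregation results and no individual documents, which keeps the responses small.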
With these five steps done, I can now do things like:
- Monitoring transactions
- Running complex queries
- Exploring anomalies with machine learning
- Analyzing transaction relationships with Graph
My next story will cover these use cases. Stay tuned!
Note: the source code is freely available in my GitHub repository: https://github.com/fstepho/bcoin-es
You can follow me on Twitter for cryptocurrency news: Fabien Stepho (@fstepho), IT Architect, Paris, France.
You can follow me on GitHub for cryptocurrency development ideas.