Is Bitcoin still the king? A deep dive into the recent crypto landscape (2018–2021)

Gabriel Dutra
Coinmonks
10 min read · Apr 20, 2022

This project will dive into crypto data to compare different coins and make an overall analysis of the current landscape of the crypto market.

image source: https://everydaycryptonews.com/top-5-cryptocurrencies-for-april/

Cryptocurrencies have gained a lot of popularity in recent years, from countries such as El Salvador adopting Bitcoin as an official currency to new technologies like the Metaverse building their financial backbone on Ethereum and Solana. This growth in usability led to many different currencies being created, and what was once a market ruled by Bitcoin is now a diverse landscape with multiple coins available. In this article, I will investigate how relevant other cryptos are compared to Bitcoin, and whether we can expect other coins to become dominant in the near future.

To answer these questions, I will be utilizing the G-Research Crypto Dataset, available at:
https://www.kaggle.com/competitions/g-research-crypto-forecasting/data.

The data was gathered by the company G-Research, a financial research firm in the UK, and it was made publicly available on Kaggle through a machine learning competition, where teams would use the data to train ML models to predict crypto prices in the future. While the data is optimized for training models, I believe that it can be used to produce great visualizations about crypto and how different coins correlate to each other.

The main dataset (train.csv) contains 24,236,806 rows and 10 columns. Each row holds one minute of data for one coin (there are 14 coins in total), and each column contains a piece of information about that minute.

This dataset contains the opening and closing prices of 14 different cryptocurrencies, including Bitcoin, Ethereum, Litecoin, Monero, and others. It also records how many units and trades were made in a given minute, which will allow us to estimate how much money is moved by each currency and how much each coin is used.

I will create several visualizations in Python to analyze different aspects of all the available coins. This article walks through the Python notebook that contains all the code used to extract relevant information from the data and plot it. All the figures were combined into a final slide that tells the story of the current crypto landscape.

The notebook can be found here:
https://colab.research.google.com/drive/1GocXzh6UpK8ORhIkBVoTzBPC31VIirjU?usp=sharing

Initial Setup

Here, we are simply importing the required libraries, as well as reading the data as a csv file through pandas.

import matplotlib.pyplot as plt
import squarify
import numpy as np
import pandas as pd

plt.style.use('fivethirtyeight')
train = pd.read_csv('train.csv')

The dataset contains 10 columns. The following is the description of each column as written on Kaggle:

  • timestamp - A timestamp for the minute covered by the row.
  • Asset_ID - An ID code for the cryptoasset.
  • Count - The number of trades that took place this minute.
  • Open - The USD price at the beginning of the minute.
  • High - The highest USD price during the minute.
  • Low - The lowest USD price during the minute.
  • Close - The USD price at the end of the minute.
  • Volume - The number of cryptoasset units traded during the minute.
  • VWAP - The volume weighted average price for the minute.
  • Target - 15 minute residualized returns. See the 'Prediction and Evaluation' section of this notebook for details of how the target is calculated.

For this project, we will focus on the columns timestamp, Count, Open, Close, and Volume.
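
Since only a handful of columns are needed, one option is to ask pandas to load just those. The sketch below is an illustration rather than part of the original notebook: it reads a tiny made-up stand-in for train.csv, and it keeps Asset_ID as well, on the assumption that the rows still need to be split per coin afterwards.

```python
import io
import pandas as pd

# Tiny stand-in for train.csv with the same column names (values are made up).
sample_csv = io.StringIO(
    "timestamp,Asset_ID,Count,Open,High,Low,Close,Volume,VWAP,Target\n"
    "1514764860,1,229,13835.19,14013.80,13666.11,13850.18,31.55,13827.06,-0.0146\n"
    "1514764860,6,173,738.30,746.00,732.51,738.51,335.99,738.84,-0.0048\n"
)

# Asset_ID is kept as well, because it is needed to split the rows per coin.
wanted = ["timestamp", "Asset_ID", "Count", "Open", "Close", "Volume"]
subset = pd.read_csv(sample_csv, usecols=wanted)

print(subset.columns.tolist())
```

With the real file, the same usecols argument would go into the pd.read_csv('train.csv') call. Code that selects columns by position rather than by name would then need adjusting.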

Notice that the data from all coins is combined in the 'train.csv' file. To separate it, we assign a variable to the rows corresponding to each asset ID. We can do that using the following:

#Assigns each variable to its respective asset
bin = train[train["Asset_ID"]==0].set_index("timestamp") #Binance Coin
btc = train[train["Asset_ID"]==1].set_index("timestamp") #Bitcoin
btcC = train[train["Asset_ID"]==2].set_index("timestamp") #Bitcoin Cash
ada = train[train["Asset_ID"]==3].set_index("timestamp") #Cardano
doge = train[train["Asset_ID"]==4].set_index("timestamp") #Dogecoin
eos = train[train["Asset_ID"]==5].set_index("timestamp") #Eos
eth = train[train["Asset_ID"]==6].set_index("timestamp") #Ethereum
ethC = train[train["Asset_ID"]==7].set_index("timestamp") #Ethereum Classic
iota = train[train["Asset_ID"]==8].set_index("timestamp") #Iota
lite = train[train["Asset_ID"]==9].set_index("timestamp") #Litecoin
mkr = train[train["Asset_ID"]==10].set_index("timestamp") #Maker
mon = train[train["Asset_ID"]==11].set_index("timestamp") #Monero
ste = train[train["Asset_ID"]==12].set_index("timestamp") #Stellar
tron = train[train["Asset_ID"]==13].set_index("timestamp") #Tron

We can then create a list that holds all the assets, along with lists for their names and abbreviations. This will be useful for creating the plots.

asset_list = [btc, eth, bin, btcC, ada, doge, eos, ethC, iota, lite, mkr, mon, ste, tron]
asset_names = ["Bitcoin", "Ethereum", "Binance", "Bitcoin Cash", "Cardano", "Dogecoin", "Eos", "Ethereum Classic", "Iota", "Litecoin", "Maker", "Monero", "Stellar", "Tron"]
asset_names_abr = ["BTC", "Eth", "Bin", "BtcC", "Ada", "Doge", "Eos", "EthC", "Iota", "Lite", "Mkr", "Mon", "Ste", "Tron"]
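
As an alternative to writing one filter per coin, the same split can be sketched with a single groupby. This is only an illustration on made-up rows, not the notebook's code; the names toy, assets_by_id and btc_toy are hypothetical.

```python
import pandas as pd

# Made-up rows in the same layout as train.csv (only a few columns shown).
toy = pd.DataFrame({
    "timestamp": [60, 60, 120, 120],
    "Asset_ID": [1, 6, 1, 6],
    "Close": [100.0, 10.0, 101.0, 11.0],
})

# One DataFrame per asset, keyed by Asset_ID and indexed by timestamp.
assets_by_id = {
    asset_id: frame.set_index("timestamp")
    for asset_id, frame in toy.groupby("Asset_ID")
}

btc_toy = assets_by_id[1]  # Asset_ID 1 is Bitcoin in the mapping used above
print(btc_toy["Close"].tolist())
```

A dictionary like this keeps each asset reachable by its ID, which can be convenient when looping over all coins.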

Data Visualization

With the setup out of the way, we can start creating some plots to analyze our data.

Analyzing the trend for all coins

A good idea is to see how similar the trends of all the coins are. We can get that by doing one line plot with all the assets. To do that, we first need to normalize their prices. The following function will handle that:

def normalize(data):
    #Divides every value by the maximum, so each series peaks at 1
    return np.array(data / data.max(), dtype='float')
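
To see why this normalization makes coins comparable, here is a small sketch with two made-up price series on very different scales. The function name normalize_by_max and the numbers are illustrative assumptions, not values from the dataset.

```python
import numpy as np
import pandas as pd

def normalize_by_max(series):
    # Same idea as normalize(): divide every price by the series maximum.
    return np.array(series / series.max(), dtype="float")

# Two made-up coins on very different price scales.
expensive = pd.Series([20000.0, 40000.0, 30000.0])
cheap = pd.Series([0.05, 0.10, 0.075])

print(normalize_by_max(expensive))
print(normalize_by_max(cheap))
```

Both series end up with the same shape and a peak of 1, so they can share one y-axis. Because each coin is scaled by its own all-time high within the data, the plot compares trends, not price levels.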

We can now create a combined plot of all the coins using matplotlib:

f = plt.figure(figsize = (6,4))
#Creates a list of colors with grey shading
greys = plt.get_cmap('Greys')
colors = iter(greys(np.linspace(0.2,0.5,13)))
#Creates arrays with the dates and its specific timestamp
#The date to timestamp calculator can be found at:
dates = ['2018-01', '2019-01', '2020-01', '2021-01', '2021-09']
ticks = [1514764860, 1546300800, 1577836800, 1609459200, 1632182400]
#Plots the normalized price trend of each coin
counter = 1
for i in asset_list[1:]:
    plt.plot(i.index, normalize(i['Close']), c = next(colors),
             label = asset_names[counter])
    counter += 1
plt.xticks(ticks, dates)
plt.plot(btc.index, normalize(btc['Close']), c = 'Orange', label = "Bitcoin")
plt.legend(bbox_to_anchor=(1.33, 1), loc='upper right', borderaxespad=0)
plt.xlabel('Time')
plt.ylabel('Normalized price')
plt.grid(False)
plt.show()

This should result in this figure:

Line plot of all the 14 coins available in the dataset

We can clearly see how the plots converge after 2020. More interesting still, many coins saw their prices plunge throughout 2018. This is probably because most of these cryptos had their 'public release' around the beginning of 2018; my theory is that their prices started falling once the hype faded.

You can easily obtain individual line graphs for each asset by putting the last 9 lines of the code above inside the for loop. You will see how their trends are very similar.

Analyzing how much each coin was used

The 'Count' column can give us valuable insight into how much each coin is being used, as it holds the number of trades in a given minute.
We can add up its values to find how many trades happened per asset, and make a bar plot out of it.

The following code sums all the trade count values:

#Creates an array with the sum of all total trades for each asset
total_trades = []
for i in asset_list:
    total_trades.append(np.sum(i['Count']))

And below is how we can plot it:

fig = plt.figure(figsize = (6, 3))
ax = fig.add_axes([0, 0, 1, 1])
#Creates a list of colors to be used in the barplot
colors = ['orange', 'lightblue']
for i in range(0, 12):
    colors.append('lightgray')
#Plots the data
ax.bar(asset_names_abr, total_trades, color=colors)
plt.axhline(y=np.average(total_trades), label = 'Average trade quantity', c = 'gray', ls=':')
plt.ylabel('Total trades (in billions)')
plt.xlabel('Asset')
plt.legend()
plt.grid(False)
plt.show()

The code produces this picture:

Bar plot of the trade count of each asset

This plot is very straightforward, showing how Bitcoin and Ethereum were by far the most used coins.

Variance Analysis

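We can easily manipulate the data to obtain the average variance of each coin over a given timeframe. Here, 'variance' means the average absolute percent change in price between the start and end of each period, not the statistical variance. I will analyze the average daily variance, as it is the most relevant from a short-term investment perspective. For this, I will use the function below: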

def get_average_variance(data, tf):
    #Column 2 is 'Open' once 'timestamp' has become the index
    variance = []
    for i in range(0, len(data) - tf, tf):
        #Absolute change between two rows that are tf rows apart, in percent
        variance.append((abs(data.iloc[i, 2] - data.iloc[i + tf, 2])
                         / data.iloc[i + tf, 2]) * 100)
    return np.average(variance)

#Gets daily average variance (in %) from all coins
asset_variance = []
for i in asset_list:
    asset_variance.append(get_average_variance(i, 1440))

To change the timeframe, change the ‘tf’ parameter. Calling the function with 1440 as a parameter will give me the daily variance, as 1 day has 1440 minutes.
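
The same calculation can be sketched without an explicit loop by sampling every tf-th opening price with NumPy slicing. This is an illustrative rewrite under the same assumption as the loop (consecutive rows are exactly one minute apart); the function name average_abs_change and the prices are made up.

```python
import numpy as np

def average_abs_change(open_prices, tf):
    # Take every tf-th opening price, then compare each sample with the next one.
    sampled = np.asarray(open_prices, dtype=float)[::tf]
    earlier, later = sampled[:-1], sampled[1:]
    # Absolute change relative to the later price, in percent (as in the article).
    return np.mean(np.abs(earlier - later) / later * 100)

# Made-up prices with tf=2 so the arithmetic is easy to follow:
# the samples are 100, 110 and 99, giving changes of 10/110 and 11/99.
toy_open = [100.0, 105.0, 110.0, 104.0, 99.0, 98.0]
print(average_abs_change(toy_open, tf=2))
```

On the full dataset a vectorized version like this should be much faster than calling iloc inside a Python loop, though it has not been benchmarked here.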

Finally, we can plot the data:

fig = plt.figure(figsize = (6, 3))
ax = fig.add_axes([0, 0, 1, 1])
#Creates color map for the plot
#Manually color the relevant assets
colors = ['orange', 'lightgrey']
for i in range(0, 12):
    if i == 8:
        colors.append('red') #Maker
    else:
        colors.append('lightgray')
#Plots data
ax.bar(asset_names_abr, asset_variance, color=colors)
plt.axhline(y=np.average(asset_variance), label = 'Average daily variance', c = 'gray', ls=':')
plt.ylabel('Daily variance (in %)')
plt.xlabel('Asset')
plt.legend()
plt.grid(False)
plt.show()

Average daily variance per coin.

Another straightforward plot, showing that Bitcoin is the least volatile coin, while Maker is the most volatile one. The other assets all have a similar variance, which is expected, as they all follow a similar price trend.

Total money traded

By multiplying the columns ‘Open’ and ‘Volume’, we can get how much money was moved on a given coin.

def get_total_transaction_values(data):
    total_money = np.sum(data['Open'] * data['Volume'])
    return total_money

#Creates a list of money traded per coin
total_money = []
for i in asset_list:
    total_money.append(get_total_transaction_values(i))
print(total_money)
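
Multiplying 'Open' by 'Volume' prices every unit traded in a minute at that minute's opening price, so it is an approximation. Since the dataset also provides VWAP, the volume weighted average price for the minute, a possible refinement is to multiply VWAP by Volume instead. The sketch below compares the two on made-up bars; it is not part of the original notebook, and the article's totals were computed with 'Open'.

```python
import pandas as pd

# Made-up minute bars for one coin.
bars = pd.DataFrame({
    "Open":   [100.0, 102.0],
    "VWAP":   [101.0, 103.0],
    "Volume": [2.0, 4.0],
})

# The article's estimate: opening price times units traded.
by_open = (bars["Open"] * bars["Volume"]).sum()

# Alternative: average traded price for the minute times units traded.
by_vwap = (bars["VWAP"] * bars["Volume"]).sum()

print(by_open, by_vwap)
```

Whether the two estimates differ meaningfully on the real data is not something this article tests.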

If we sort the values of 'total_money', we will see that significantly more money moved through Bitcoin and Ethereum than through any of the other coins. To make our visualization clearer, we can group the other assets and plot only three values:

total = []
total.append(total_money[0])
total.append(total_money[1])
total.append(sum(total_money[2:]))
#Plots area graph of the data
plt.rc('font', size=14)
squarify.plot(sizes=total,
              label=['Bitcoin \n$3.03 trillion', 'Ethereum \n$1.77 trillion', 'Others \n$2.27 trillion'],
              color = ['Orange', 'Lightblue', 'Lightgray'])
plt.axis('off')
plt.show()

Area plot of the total money moved through different crypto assets.

On the plot, we can see that Bitcoin alone moved more money than any other coin, and more than the twelve smaller coins combined, showing just how popular and important it still is.

Usage Growth from 2018 to 2020

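For our last plot, I want to see how much the usage of each coin grew from 2018 to 2020. To do that, I will sum the 'Count' column for each period and convert the difference into a percent value. Note that, as written, the code's 2020 window runs from the start of 2019 to the start of 2021, and the difference is divided by the later count, so the result can never exceed 100%.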

We can do that with the following:

def get_count_sum_2018(data):
    #timestamp at 2019/01/01, 8:00:00 PM
    df = data.loc[data.index < 1546390800]
    return np.sum(df['Count'])

def get_count_sum_2020(data):
    #between timestamp at 2019/01/01, 8:00:00 PM
    #and timestamp at 2021/01/01, 8:00:00 PM
    df = data.loc[(data.index < 1609549200) & (data.index > 1546390800)]
    return np.sum(df['Count'])

count_2018 = []
for i in asset_list:
    count_2018.append(get_count_sum_2018(i))
count_2020 = []
for i in asset_list:
    count_2020.append(get_count_sum_2020(i))
percent_difference = {}
for i in range(len(count_2020)):
    #Difference as a percent of the later count
    percent_difference[asset_names[i]] = ((count_2020[i] - count_2018[i]) / count_2020[i]) * 100
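
The percent value above divides the increase by the later count. For comparison, the conventional definition of percent growth divides by the earlier count. The sketch below puts the two side by side on made-up trade counts; the function names are hypothetical, and the article's plot uses the first measure.

```python
def share_of_later_period(earlier, later):
    # The article's formula: the increase divided by the LATER count.
    return (later - earlier) / later * 100

def growth_over_earlier_period(earlier, later):
    # Conventional percent growth: the increase divided by the EARLIER count.
    return (later - earlier) / earlier * 100

# Made-up trade counts, purely for illustration.
earlier_count, later_count = 1_000, 50_000

print(share_of_later_period(earlier_count, later_count))
print(growth_over_earlier_period(earlier_count, later_count))
```

For these numbers the first measure gives about 98%, while the conventional one gives 4,900%. The first measure approaches 100% for any coin whose later usage dwarfs its earlier usage.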

Notice how I'm using timestamps to define the range of the data. To get the timestamp value that corresponds to a date, I can run code similar to this:

import time
timestamp = time.mktime(time.strptime('2019-01-01 20:00:00', '%Y-%m-%d %H:%M:%S'))
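
One caveat is that time.mktime interprets the date in the machine's local time zone, so the same string can produce different numbers on different computers. The cutoffs used above are 25 hours after midnight UTC, which matches 8 PM in a UTC-5 zone. A zone-independent sketch, using a hypothetical helper named utc_timestamp, could look like this:

```python
from datetime import datetime, timezone

def utc_timestamp(year, month, day, hour=0):
    # Attaching tzinfo=UTC makes the result independent of the machine's time zone.
    moment = datetime(year, month, day, hour, tzinfo=timezone.utc)
    return int(moment.timestamp())

print(utc_timestamp(2019, 1, 1))
print(utc_timestamp(2021, 1, 1))
```

These two calls return 1546300800 and 1609459200, the same values used as axis ticks for 2019-01 and 2021-01 in the first plot.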

We can now plot the percent values:

colors = []
for i in range(14):
    if i == 6:
        colors.append('orange')
    elif i == 12 or i == 13:
        colors.append('red')
    else:
        colors.append('lightgray')
#Sorts the values for plotting
percent_difference = dict(sorted(percent_difference.items(), key=lambda x:x[1]))
fig = plt.figure(figsize = (3, 8))
plt.barh(list(percent_difference.keys()), list(percent_difference.values()), color=colors)
plt.axvline(x=np.average(list(percent_difference.values())), label = 'Average usage growth', c = 'gray', ls=':')
plt.legend(bbox_to_anchor=(1, 1.05), loc='upper right', borderaxespad=0)
plt.grid(False)
plt.xlabel('Percent usage growth')
plt.ylabel('Asset')
plt.show()

This results in:

Here, we can see that Bitcoin is far from being the coin with the highest growth in usage. Both Dogecoin and Maker experienced a growth of almost 100% in usage. Nonetheless, every asset in the plot grew by more than 50%.

Combining all plots

To conclude this project, we can save all plots into one slide, add some titles and some tweaks so that all plots tell one story. This is the final result:

Combined plot with added titles.

Design Choices

I tried to leverage my design to highlight only the important information on the screen, as well as trying to remove as much clutter as possible.
Bitcoin is easily associated with the color yellow, which is a recurring color throughout the combined plot. I decided to distinguish Ethereum as well, as it is a notable ‘competitor’ to Bitcoin, and definitely another outlier when compared to the other coins in the data. Moreover, I used red to highlight other outliers that aren’t Bitcoin or Ethereum, and light grey to display the remaining data.

When it comes to the titles, I made sure to color them accordingly, while also making any important information bold. Furthermore, both the title and subtitle draw relevant conclusions about their respective plots.

Conclusion

When we analyze all the plots together, it is clear that Bitcoin is still an important reference in the world of crypto, and so far no other currency has been able to take its spot. Being by far the most used crypto and the most stable one, Bitcoin is still king, and by the looks of it, it will remain so for the near future.
