Bitcoin and financial markets: analysing relationships using Python — Part 1

Anna Grigoryeva-Trier
Mar 9, 2022


Crypto markets, while still only a fraction of global financial markets, have grown rapidly in the last couple of years. Investing in cryptocurrencies has become somewhat mainstream among retail investors in many countries. A growing number of institutional investors are also turning to crypto to diversify their portfolios and benefit from potentially high, if speculative, returns.

In this context, crypto assets can be considered on equal footing with other financial asset classes when designing a portfolio. Here, I apply data science tools to analyse performance and uncover potential relationships between crypto and traditional financial assets. In Part 1, I look at returns, volatility and correlations. In Part 2, I fit some models to the data.

Selecting data

As input for the analysis, I use price data from Yahoo Finance for the period from Jan 1, 2020 to the time of writing. I have chosen Bitcoin (BTC-USD) and Ethereum’s Ether (ETH-USD) to represent the crypto markets because they are the biggest crypto assets by market capitalisation to date.

I also use selected indices and ETFs characterising equity, fixed income and commodity markets:

  • S&P 500 (^GSPC) — The Standard & Poor’s 500 market index tracks the performance of 500 big companies trading on the US market. It is commonly used to represent the overall US equity market and as a benchmark for active investment strategies.
  • SSE Composite Index (000001.SS) is a market index tracking the performance of stocks traded on the Shanghai Stock Exchange. The majority of companies included in the index are Chinese.
  • Crude oil (CL=F) is the largest (and an extremely volatile) commodity, usually significantly affected by geopolitical, social and economic shocks on both the demand and supply sides.
  • Gold (GC=F) is another large commodity, traditionally sought after in times of financial turmoil.
  • To represent the performance of the technology sector I have selected the Vanguard Information Technology ETF (VGT). It consists of equities in the electronics and computer industries and manufacturers of the latest tech. Most cryptocurrencies are built on blockchain technology, which in turn relies heavily on the latest computer technology.
  • The energy sector is represented by the Vanguard Energy Index Fund (VDE), which includes companies in the oil, gas, and coal sectors.
  • The fixed income asset class is represented by the iShares 20+ Year Treasury Bond ETF (TLT).
  • I have also included the Goldman Sachs Hedge Industry VIP ETF (GVIP), which tracks hedge fund performance. Hedge funds generally represent more active trading strategies and elaborate risk management policies, trying to outperform the market. Hedge funds are among the first financial market players turning to crypto to gain competitive advantages.
  • Finally, I include ETFs for the video gaming industry and the cybersecurity sector: the VanEck Vectors Video Gaming and eSports ETF (ESPO) and the First Trust NASDAQ Cybersecurity ETF (CIBR) respectively. Developments in these sectors are somewhat aligned with developments in blockchain technologies, and many companies in them have demonstrated significant growth in recent years.

This list is not exhaustive and could potentially include many more assets, indices and ETFs. My main motivation was to cover major asset classes and to try to pinpoint specific sectors and assets that may have the strongest relationships with the crypto markets.

Python Code

I use the DataReader function from the pandas-datareader package (a companion to pandas), which extracts data from various Internet sources, including Yahoo Finance. I create a list of tickers corresponding to the required data. Next, I create a dataframe with the daily prices. I use the ‘Adj Close’ column because it is the price adjusted for splits and dividend distributions. To obtain daily returns, I calculate the percentage change of the daily prices using the pct_change() method.
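The code embedded in the original post is not reproduced here, so what follows is a minimal sketch of the steps just described, not the author's exact script. It assumes the pandas-datareader package and its "yahoo" source, which has not always been reliable. The helper names download_prices and daily_returns and the toy price table are illustrative only.

```python
import pandas as pd

# Tickers listed in the "Selecting data" section.
TICKERS = ["BTC-USD", "ETH-USD", "^GSPC", "000001.SS", "CL=F", "GC=F",
           "VGT", "VDE", "TLT", "GVIP", "ESPO", "CIBR"]


def download_prices(tickers, start="2020-01-01", end=None):
    """Fetch adjusted closing prices (hypothetical helper, needs network access)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from pandas_datareader import data as pdr

    raw = pdr.DataReader(tickers, "yahoo", start=start, end=end)
    return raw["Adj Close"]  # one column per ticker


def daily_returns(prices):
    """Percentage change between consecutive rows, without the empty first row."""
    return prices.pct_change().dropna(how="all")


# Toy prices standing in for a real download (made-up numbers).
toy_prices = pd.DataFrame(
    {"A": [100.0, 110.0, 99.0], "B": [50.0, 50.0, 55.0]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)
toy_returns = daily_returns(toy_prices)  # A: +10%, -10%; B: 0%, +10%
```

With network access, daily_returns(download_prices(TICKERS)) would produce the returns table used in the rest of the analysis.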

Plotting data

Visualisation is a powerful tool in data analysis. It allows one to get a feel for the data, discover whether any modifications are needed, and select the best models.

Plotting daily returns results in a very busy plot, so I plot average daily returns for each month.

Fig.1. Average daily returns per month

Here, two periods of extreme return volatility stand out. In April 2020, violent fluctuations in the oil price coincided with the beginning of the Covid-19 pandemic and OPEC price wars. The price shock was caused by oversupply and rapidly dropping demand. The price of US oil futures actually dropped to negative values, the lowest on record.

The second shock begins in February 2022 with the conflict in Ukraine. A visible decoupling of returns across asset classes can be clearly observed: equities and bonds fall while commodities rise (with an extreme jump in crude oil in particular), a pattern usually associated with geopolitical uncertainty.

To smooth the effects of these price shocks, I plot the monthly median returns, which are less sensitive to outliers. I also cut the timeline off at February 23, 2022. While the behaviour of crypto and conventional assets at the time of global geopolitical shocks is a highly relevant and interesting topic, here I try to look at the relationships between crypto and traditional assets during times of perceived geopolitical ‘normality’. Moreover, two different ‘regimes’ might confuse simple linear models trying to work with the time series.

Fig.2. Median daily returns per month (01/01/2020–23/02/2022)

The plotted median returns point to significant volatility of oil prices in 2020, as well as high volatility of Bitcoin and Ethereum returns, particularly during 2021.

Python Code

Converting daily return data to monthly data is easily done with the pandas resample() method followed by the required summary function, mean() and median() in this case.

To plot the data, I use the matplotlib.pyplot package and its plot() function.
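As a rough sketch of these two steps (again, not the author's original code), the example below resamples synthetic daily returns to monthly mean and median values. The random data, the column names and the plot_monthly helper are assumptions made for illustration. The "MS" (month start) frequency alias is used because it behaves the same across pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns standing in for the real data (made-up numbers).
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", "2020-03-31", freq="D")
returns = pd.DataFrame(
    rng.normal(0.0, 0.02, size=(len(dates), 2)),
    index=dates,
    columns=["Asset A", "Asset B"],
)

# Group the daily rows by calendar month and summarise each group.
monthly_mean = returns.resample("MS").mean()
monthly_median = returns.resample("MS").median()


def plot_monthly(frame, title):
    """Draw one line per asset (hypothetical helper, imports matplotlib lazily)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 5))
    for column in frame.columns:
        ax.plot(frame.index, frame[column], label=column)
    ax.axhline(0, color="grey", linewidth=0.8)  # zero-return reference line
    ax.set_title(title)
    ax.legend()
    return fig
```

Calling plot_monthly(monthly_median, "Median daily returns per month") followed by plt.show() would display a chart in the style of Fig.2.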

Returns and volatility

Asset return and volatility (risk) are major factors in portfolio construction. Most investors want to maximise their returns while limiting the risk.

Daily returns provide more data points for analysis. However, annualised returns are usually preferred by investors and give a better basis for comparing assets. To annualise the daily returns of the assets, I calculate mean returns and multiply them by 253 (the average number of trading days in a year).

A common measure of return volatility, or risk, is the standard deviation. Standard deviation is calculated as the square root of the variance and has the advantage of being expressed in the same unit as the mean.

Using standard deviation as a measure of risk, we can calculate risk-adjusted performance measures, for instance the Sharpe ratio. Here, the Sharpe ratio is calculated as annualised returns divided by standard deviation. This is a simplified form: the standard definition first subtracts the risk-free rate from the returns. An asset with a higher Sharpe ratio is said to have better risk-adjusted performance. Note that the Sharpe ratio penalises both positive and negative deviations, while only negative ones are truly undesirable for investors. Still, it is a widely used measure in investment analysis due to its simplicity and intuitive interpretation.

During the analysed period, Bitcoin and Ethereum significantly outperformed conventional assets based on their Sharpe ratio. The best performers among traditional financial assets in the analysis were video games, cybersecurity and tech sector equities, helped by the adoption of the “Stay-at-home” Covid measures.

Python Code

Calculating mean returns and standard deviations is a straightforward task with corresponding pandas methods. Sharpe ratios can be calculated directly.
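A minimal sketch of these calculations is shown below. The text does not say whether the standard deviation was annualised. This sketch assumes it was and scales it by the square root of 253, which is the usual convention. It also leaves out the risk-free rate, as the text does. The function name risk_return_table and the toy returns are illustrative.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 253  # figure used in the text


def risk_return_table(daily, periods=TRADING_DAYS):
    """Annualised return, annualised volatility and simplified Sharpe ratio per asset."""
    ann_return = daily.mean() * periods
    # Assumption: volatility scales with the square root of time.
    ann_volatility = daily.std() * np.sqrt(periods)
    return pd.DataFrame({
        "ann_return": ann_return,
        "ann_volatility": ann_volatility,
        "sharpe": ann_return / ann_volatility,  # no risk-free rate subtracted
    })


# Two toy assets with the same mean daily return but different risk (made-up numbers).
toy = pd.DataFrame({
    "A": [0.01, -0.01, 0.02, 0.00],
    "B": [0.03, -0.04, 0.05, -0.02],
})
table = risk_return_table(toy)
```

Both toy assets have the same annualised return, but asset A fluctuates less, so it ends up with the higher Sharpe ratio.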

Looking for correlations

Correlations are the foundation of portfolio diversification. In short, the most diversified portfolio is a portfolio of uncorrelated assets. Highly correlated assets tend to move together, either in the same direction (if one goes up, the other is also likely to rise) or in opposite directions (if one goes up, the other tends to fall).

Correlation is usually measured with a correlation coefficient, which can take values from -1 to 1. Values close to 1 indicate positively correlated assets (returns tend to move in the same direction), values close to -1 indicate negatively correlated assets (returns tend to move in opposite directions), and values close to 0 indicate uncorrelated assets.

A convenient way to visualise correlations is a heatmap.
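To make the two extremes concrete, here is a small illustration with made-up numbers. A series that is a positive multiple of another has a coefficient of 1, and its mirror image has a coefficient of -1. The variable names are illustrative.

```python
import pandas as pd

# Made-up daily returns for one asset.
a = pd.Series([0.01, -0.02, 0.03, 0.00, -0.01])

same_direction = 2 * a   # always moves with a, only twice as far
opposite = -a            # always moves against a

r_same = a.corr(same_direction)   # Pearson coefficient, approximately 1
r_opposite = a.corr(opposite)     # approximately -1
```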

Fig.3. Correlation matrix

Visual analysis of the correlation heatmap points to some noteworthy observations:

  • Two representatives of the crypto markets — Bitcoin and Ether — are highly positively correlated, which can be expected. Conventional assets most closely correlated to crypto assets include Tech, Hedge Funds, Video Games, and Cybersecurity.
  • Another highly positively correlated ‘cluster’ includes the Tech, Video Games and Cybersecurity sectors, Hedge Funds, and the S&P 500. The three sectors are closely linked through the demand side. The high correlation with Hedge Funds might be explained by heavy hedge fund investment in these sectors, and the S&P 500 is driven by tech companies to a large extent. Notably, the correlation of these sectors with the SSE Composite is much lower.
  • Treasury Bonds are mostly negatively correlated with all other assets, except gold.

Scatter plots are another common tool to visualise correlations between variables. They are used to detect possible linear relationships. Here, I use Bitcoin as one of the variables in each scatter plot.

Fig.4. Scatter plots

The scatter plots point to a linear relationship between Bitcoin and Ether. Interestingly, the correlation seems to be higher in the area of negative returns, which points towards co-skewness, i.e. higher correlation in tail scenarios than under normal market circumstances.

Further, possible relationships can be suspected between Bitcoin and Hedge Funds, Bitcoin and Video Games, and Bitcoin and Tech. The scatter plot ‘Bitcoin and Bitcoin’ illustrates what a correlation coefficient of 1 looks like.

Python Code

The correlation matrix can be calculated in Python using the pandas corr() method. The heatmap can be plotted using the heatmap() function of the seaborn package.
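A sketch of both calls follows, using synthetic returns in place of the real data. The column names, the random numbers and the plot_heatmap helper are assumptions, and the colour settings are one reasonable choice rather than the ones used for Fig.3.

```python
import numpy as np
import pandas as pd

# Synthetic returns: Y is built to follow X closely, Z is independent (made-up data).
rng = np.random.default_rng(1)
base = rng.normal(0.0, 0.02, 500)
returns = pd.DataFrame({
    "X": base,
    "Y": base + rng.normal(0.0, 0.005, 500),
    "Z": rng.normal(0.0, 0.02, 500),
})

# Pairwise Pearson correlation coefficients (the pandas default).
corr = returns.corr()


def plot_heatmap(corr_matrix):
    """Annotated heatmap (hypothetical helper, imports the plotting libraries lazily)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(8, 6))
    # Fix the colour scale to the full [-1, 1] range so colours are comparable.
    sns.heatmap(corr_matrix, annot=True, fmt=".2f", cmap="coolwarm",
                vmin=-1, vmax=1, ax=ax)
    return fig
```

In the resulting matrix the diagonal is 1, the X and Y pair is close to 1, and the X and Z pair is close to 0.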

Scatter plots are made using the subplots() function of matplotlib.pyplot and the scatter() method of the resulting axes.
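One possible way to lay out such a grid is sketched below, with Bitcoin on the x-axis of every panel. The helper names grid_shape and scatter_against, the three-column layout and the marker settings are assumptions, not the author's code.

```python
import math


def grid_shape(n_panels, ncols=3):
    """Number of rows and columns needed to fit n_panels plots."""
    return math.ceil(n_panels / ncols), ncols


def scatter_against(returns, reference="BTC-USD", ncols=3):
    """Scatter every column of a returns DataFrame against the reference column."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for drawing

    n_panels = returns.shape[1]
    nrows, ncols = grid_shape(n_panels, ncols)
    fig, axes = plt.subplots(nrows, ncols, figsize=(4 * ncols, 4 * nrows),
                             squeeze=False)
    flat = axes.ravel()
    for ax, column in zip(flat, returns.columns):
        ax.scatter(returns[reference], returns[column], s=5, alpha=0.5)
        ax.set_title(f"{reference} and {column}")
    # Hide any unused panels in the last row.
    for ax in flat[n_panels:]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig
```

Because the reference column is plotted against itself as well, one panel shows a perfect straight line, like the ‘Bitcoin and Bitcoin’ plot in Fig.4.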

Continue to Part 2.
