Learn To Code By Calculating Bitcoin’s Correlation to Stocks & Gold

Get started with Python — one of the easiest to learn and most powerful programming languages.

David (Dudu) Azaraf
Published in Coinmonks
11 min read · Jan 25, 2021

If you’re like me, you’re probably sitting at home in the freezing cold, wondering whether you actually had a social life pre-Corona or whether it was all an illusion. Instead of watching Netflix and baking yet another sourdough loaf, why not use this as an opportunity to learn useful skills that can turbo-charge your career?

Learning how to write code is one skill that could come in very useful across a number of disciplines. Writing code is not only for software engineers. Whether you work in marketing, sales, analysis or any other role, knowing how to write code can give you a much-needed edge in a crowded marketplace. It changed my life, and hopefully it can do the same for yours.

In this article, we’ll learn the basics of Python by calculating Bitcoin’s correlation to the stock market and gold. For those who live a TL;DR kind of life, head to the end of the article for instructions on how to run the script straight up.

On to the code we go.

By the end of this article, you should be able to:

  1. Install and set up your Python work environment.
  2. Understand what the correlation between two assets means for your portfolio.
  3. Download historical data for Bitcoin, the S&P500 and Gold.
  4. Prepare the data for analysis using Pandas — a powerful package for all things data.
  5. And finally, calculate the correlation between the S&P500 and Bitcoin and interpret the results.

Basic programming skills introduced in this tutorial include:

  • Installing packages using pip and importing them into your program
  • Writing and using functions
  • Working with various data types (ordinary variables, lists, dictionaries, Pandas Dataframes)
  • Working with APIs

Install and set up your Python work environment

There are many ways to download and set up Python, but the most user-friendly way is to use Anaconda. By installing Anaconda, you get Python plus a whole lot of useful packages and programs.

You can find an Anaconda installer for your operating system over here.

While you wait for Anaconda to download, enjoy some hilarious deep fake banter courtesy of Sassy Justice.

Packages: Make your life easier by allowing you to perform a whole lot of complex tasks with a few easy commands.

IDE or Integrated Development Environment. This is the editor we’ll use to write our code.

Once Anaconda has finished downloading, open up Spyder. This is where the magic happens. What makes Spyder special, especially for beginners, is the variable explorer on the top right. It makes it easy to follow along with your code and see how each line changes the variables you’re using.

Next up, it’s time to install one more package that we’ll need during this tutorial. Most of the packages we’ll be using come pre-packaged with Anaconda, except for one: yfinance. We’ll be using yfinance to download stock data later, but for now we just need to install it.

We do so via the command line, or the terminal if you’re a Mac user. If you’re unfamiliar with the command line, think of it as a way to move around your computer and do things with text instead of a mouse. In our case, we’ll be typing in one command only: pip install yfinance

The first word in the command, ‘pip’, is the name of the program being run, and it is followed by a number of arguments and sub-arguments. Pip is the Pythonic way of installing packages. Anytime you’re writing Python code and you need a package that isn’t currently installed on your computer, use pip. (There are other ways to install packages, such as conda, but covering the ins and outs of conda vs pip is beyond our scope.)
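If you want to confirm that the install worked before moving on, a small check like the one below can help. It is only a sketch and not part of the original script: the helper name is_installed is made up for illustration, and it relies on nothing but Python’s standard importlib module.

```python
from importlib.util import find_spec


def is_installed(package_name):
    """Return True if Python can find the package, False otherwise."""
    # find_spec returns None when a top-level package cannot be located.
    return find_spec(package_name) is not None


# 'is_installed' is a hypothetical helper, not part of pip or yfinance.
for name in ["pandas", "yfinance"]:
    if is_installed(name):
        print(name, "-> installed")
    else:
        print(name, "-> missing, try: pip install", name)
```

Run it in Spyder after the pip command has finished. If yfinance still shows up as missing, pip may have installed it into a different Python installation than the one Spyder uses.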

To learn more about the command line, head over to Codecademy’s excellent tutorial here.

Understand what the correlation between Bitcoin and stocks/gold means

We interrupt your programming (pun very much intended) to bring you some finance background about what you’re about to do.

If there’s one thing most people know about investing, it’s that you shouldn’t put all your eggs in one basket. By diversifying across asset classes, you can lower the risk of your entire portfolio. There is even a theory about this, called Modern Portfolio Theory. According to MPT, however, diversification only works to reduce risk on one important condition.

The assets must be uncorrelated with each other. Correlation between two assets measures the extent to which the prices of Asset A and Asset B move together. In order to reap the fruits of diversification, the correlations between assets in a portfolio need to be low enough that a move in the price of Asset A tells you little about how Asset B will move.

When it comes to Bitcoin, whether or not it is uncorrelated with traditional assets like stocks and gold will determine whether it is an attractive component of a portfolio on the basis of MPT.

Correlation values range between -1 and 1. A correlation coefficient of 1 between two assets means that they move in perfect lockstep, while a coefficient of -1 means they move in exactly opposite directions. Any number between -0.2 and 0.2 means there is very little correlation between the two assets.
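To make those numbers concrete, here is a tiny sketch using made-up prices for three imaginary assets. Nothing in it comes from real market data, and the helper is_weak (with its 0.2 cut-off taken from the rule of thumb above) is a name invented for this illustration.

```python
import pandas as pd

# Made-up prices for three imaginary assets over five days.
prices = pd.DataFrame({
    "asset_a": [10, 11, 12, 13, 14],
    "asset_b": [20, 22, 24, 26, 28],   # rises whenever asset_a rises
    "asset_c": [50, 48, 46, 44, 42],   # falls whenever asset_a rises
})

# Series.corr computes the (Pearson) correlation coefficient.
corr_ab = prices["asset_a"].corr(prices["asset_b"])
corr_ac = prices["asset_a"].corr(prices["asset_c"])

print(round(corr_ab, 2))   # perfect lockstep
print(round(corr_ac, 2))   # exactly opposite directions


def is_weak(corr, threshold=0.2):
    """Hypothetical helper: True when the correlation is close to zero."""
    return abs(corr) < threshold


print(is_weak(0.04))   # very little correlation
print(is_weak(-0.25))  # outside the -0.2 to 0.2 band
```

Real assets almost never hit exactly 1 or -1. Those extremes only show up here because the toy prices were built to move in straight lines.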

So is Bitcoin correlated to stocks and gold? Stick with the program and work it out for yourself!

Download historical data for Bitcoin, the S&P500 and Gold

Before we get down to the meat of our script, let’s first hook ourselves up with an API key for Cryptocompare.

For the moment, the free version available here will suffice. Go ahead and sign up and activate your account. Once you’ve activated your account click on ‘Create an API key’, select any of the options and give your key a name.

Voilà, you’re the proud owner of an API key. We’ll come back to this a little later.

Without further ado, let’s go get our data. This is the part where we write some ACTUAL Python code. Get excited!

Firstly, we need to import the packages we’ll be using.

Throughout the course and your life, when you want to actually run the code, all you have to do is press F5.

Let’s look at some of the little tricks we’ve used here.

  1. Creating an alias, e.g. import pandas as pd: Throughout our code, whenever we need to use a package we explicitly call it by name. An alias gives a package a second name that can be easier to type. In particular, aliasing pandas as ‘pd’ is a common practice amongst Pythonistas.
  2. Importing a module, sub-package or function from within a larger package, e.g. from urllib.request import urlopen: Sometimes we don’t want to import an entire package when we only need one function. Such is the case with urlopen, the function that we use to read from the internet. We import it from the request subpackage within the larger urllib package.
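Put together, the import section of a script like ours might look something like the sketch below. The exact list of packages is an assumption based on what this tutorial uses, so treat it as an illustration of the two tricks rather than a copy of the original script.

```python
# Standard-library modules: these ship with Python itself.
import json                          # parse the text an API sends back
from datetime import datetime        # work with dates and timestamps
from urllib.request import urlopen   # import one function from a subpackage

# Third-party package that comes with Anaconda.
import pandas as pd                  # 'pd' is the conventional alias

# yfinance is the package installed with pip earlier; 'yf' is a common alias.
# Uncomment the next line once the install has finished.
# import yfinance as yf

# Thanks to the alias, we can write pd.DataFrame instead of pandas.DataFrame.
empty_table = pd.DataFrame()
print(type(empty_table))
```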

Notice how the code changes color after every hashtag symbol? That’s because, in Python, hashtags (#) are used to write comments in your code. Comments help explain what the code does, both to yourself and to other programmers. When the program runs, everything in a comment is skipped over.

Next, we’ll pull historical price data for Bitcoin, stocks and gold. For Bitcoin, we’ll be making an API call using the key we created earlier. As for stocks and gold, the yfinance package makes our lives significantly easier.

Let’s talk about API requests for a second.

Take a look at the btc_url variable. We’ll break this down into components and generalize them to nearly all APIs.

  1. Root Endpoint, “https://min-api.cryptocompare.com/data/”: This is the base address from which you’ll be requesting data.
  2. Path, “v2/histoday”: Most APIs have more than one type of data that you can access. In order to get what you want, you need to specify a path. Think of a path as an automatic answering machine that asks you to press 1 for English, 2 for French, 3 for Spanish and so on. In our case, we specify that we’re looking for daily historical data. The v2 indicates that this is part of Cryptocompare’s second API version. The path and the root endpoint together are simply referred to as the ‘API endpoint’.
  3. Query Parameters, everything after the ? sign: So we’ve entered our endpoint to search for daily historical data. But how can we specify what assets we want to search for, along with other conditions for our search? To do that, we use query parameters. In our case, we specify that we’re looking for the historical price of Bitcoin (fsym=BTC) priced in US dollars (tsym=USD) and limit our search to 2,000 entries (limit=2000). We use the & sign to join our query parameters together.
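The three components above can be assembled in code. The sketch below builds a btc_url from the pieces named in the list and then splits it apart again, using only Python’s standard urllib.parse module. The variable names other than btc_url are made up for this example, and the authentication key is left out for now.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

root_endpoint = "https://min-api.cryptocompare.com/data/"
path = "v2/histoday"

# Query parameters as a dictionary: one key per setting.
query_parameters = {
    "fsym": "BTC",     # the asset we want prices for
    "tsym": "USD",     # the currency those prices are quoted in
    "limit": 2000,     # how many daily entries to ask for
}

# urlencode joins the parameters with '&' for us.
btc_url = root_endpoint + path + "?" + urlencode(query_parameters)
print(btc_url)

# Going the other way: split a URL back into its components.
parts = urlsplit(btc_url)
print(parts.netloc)            # the domain
print(parts.path)              # the path on that domain
print(parse_qs(parts.query))   # the query parameters as a dictionary
```

Keeping the parameters in a dictionary makes it easy to swap BTC for another asset later without retyping the whole URL.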

Data and bandwidth are scarce resources, and as such most APIs don’t let you pull information unless you sign up and authenticate yourself. In our case, we authenticate ourselves using a query parameter in our API calls. Other APIs may have other authentication methods.
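Here is a rough sketch of what authenticating with a query parameter and reading the response can look like. Several things in it are assumptions rather than facts from this article: the parameter name api_key, the idea that the daily rows sit under "Data" and then "Data" again in the response, and all three helper function names. Check a real response from the API before relying on any of them.

```python
import json
from urllib.request import urlopen

API_KEY = "YOUR_API_KEY_HERE"  # placeholder: paste your own key here


def add_api_key(url, api_key):
    """Append the key as one more query parameter (assumed name: api_key)."""
    separator = "&" if "?" in url else "?"
    return url + separator + "api_key=" + api_key


def fetch_json(url):
    """Request the URL and turn the JSON text that comes back into a dict."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_daily_rows(payload):
    """Pull the list of daily entries out of the response.

    Assumes the rows sit under payload["Data"]["Data"].
    """
    return payload["Data"]["Data"]


btc_url = ("https://min-api.cryptocompare.com/data/v2/histoday"
           "?fsym=BTC&tsym=USD&limit=2000")
authenticated_url = add_api_key(btc_url, API_KEY)

# Uncomment once you have a real key and an internet connection:
# rows = extract_daily_rows(fetch_json(authenticated_url))
# print(len(rows), "daily entries downloaded")
```

The network call is commented out on purpose so the snippet runs without a key. Everything above it can be tested offline.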

Prepare the data for analysis using Pandas

In the previous section, we used the Cryptocompare API to download historical price data for Bitcoin, and the yfinance package to get data about the S&P500 and gold. However, before we can calculate the correlation between them, we need to transform the data from the format we received into one that is easy to analyze.

This is when we bring out the big guns, or the big Pandas. Pandas is a data science library that gives you a much wider range of options when doing analysis than is natively available in Python. During the course of this section, we’ll work with Pandas Dataframes, a data structure that gives you maximum flexibility with your data. I like to think of it as Microsoft Excel for the Big Bois & Girls.

While most of the code in this section is pretty straightforward, let’s take a look at line 16. In this line, we want to convert the timestamp of each row from a Unix timestamp (a count of seconds) into a human-readable date. We do that using the strftime function. But how do we apply that function to every single date in the dataframe? To do that, we use lambda functions: anonymous functions that, when used together with the apply function, can do calculations and transformations across an entire dataframe.
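A minimal sketch of that lambda-plus-apply pattern is shown below. The two rows are made up, and the column names time and close are assumptions about what the API returns, so this is an illustration of the technique rather than the script’s actual line 16.

```python
from datetime import datetime, timezone

import pandas as pd

# Two made-up rows: a Unix timestamp (seconds since 1 Jan 1970)
# and a placeholder closing price.
btc_df = pd.DataFrame({
    "time": [1611100800, 1611187200],
    "close": [100.0, 110.0],
})

# The lambda receives one timestamp at a time; apply runs it on every row.
btc_df["date"] = btc_df["time"].apply(
    lambda ts: datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
)

print(btc_df[["date", "close"]])
```

The lambda is just a function without a name. Writing a regular function with def and passing its name to apply would do exactly the same job.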

At the end of this section, we have three pandas dataframes in the same format, one each for Bitcoin, the S&P500 and gold. Each dataframe has two columns: the date (formatted as 2021-01-20) and the price. In the next section, we’ll write a function to combine dataframes together and calculate the correlation between different assets.
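For the stock and gold data, getting to that two-column shape could look roughly like this. The sketch assumes the downloaded table has dates as its index and a Close column, which is the shape yfinance’s Ticker.history() typically returns. The helper names, the tickers SPY and GLD, and the stand-in prices are all made up for illustration.

```python
import pandas as pd


def to_price_frame(history):
    """Reduce a daily price table to two columns: date and price.

    Assumes 'history' has a DatetimeIndex and a 'Close' column.
    """
    return pd.DataFrame({
        "date": history.index.strftime("%Y-%m-%d"),
        "price": history["Close"].to_numpy(),
    })


def download_history(ticker, period="5y"):
    """Hypothetical helper: download daily prices with yfinance."""
    import yfinance as yf  # imported here so the rest runs without it
    return yf.Ticker(ticker).history(period=period)


# Stand-in data so the example runs without an internet connection.
sample_history = pd.DataFrame(
    {"Close": [100.0, 101.5]},
    index=pd.to_datetime(["2021-01-19", "2021-01-20"]),
)
sp500_df = to_price_frame(sample_history)
print(sp500_df)

# With yfinance installed, something like this should work
# (SPY and GLD are assumed tickers for stocks and gold):
# sp500_df = to_price_frame(download_history("SPY"))
# gold_df = to_price_frame(download_history("GLD"))
```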

Notice the variable explorer showing a 2-column dataframe for each of our assets. Also, be aware that unlike Bitcoin, the S&P500 and gold aren’t traded on weekends. We’ll have to take this into account in the next step.

We’re almost there, let’s finish off strong.

Calculate the Correlation Between Bitcoin, S&P500 & Gold, and Interpret the Results

It’s time to grow up. Up until now, we’ve been writing code that, albeit functional, isn’t entirely efficient. Think about the fact that we had to format the dates for each one of our dataframes separately.

One way to escape repetition and level up our code is through functions. Think of a function as a mini-program that takes some inputs, runs it through calculations, and finally returns some output. Our calculate_correlation function takes as input two dataframes, one for each asset we want to compare, and calculates the correlation between them.

A couple of interesting things are happening in line 3.

  • We used the .iloc[] indexer to select the second and third columns in the joint_df dataframe. Selection is a critically important part of working with data in Pandas, and iloc reigns supreme.
  • We calculated the correlation using the .corr() function over a rolling 30-day period.
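The sketch below is one possible reading of such a function, not the original script. It assumes each dataframe has a date column and a price column, merges them so only shared dates remain (which takes care of the weekend problem), and then applies .iloc[] and a rolling 30-row .corr() as described above. The demo data is made up.

```python
import pandas as pd


def calculate_correlation(df_a, df_b, window=30):
    """Rolling correlation between two date/price dataframes.

    Assumes each dataframe has a 'date' column and a 'price' column.
    """
    # An inner merge keeps only dates present in both dataframes, which
    # drops the weekends on which stocks and gold do not trade.
    joint_df = pd.merge(df_a, df_b, on="date", how="inner",
                        suffixes=("_a", "_b"))

    # Column 0 is the date; columns 1 and 2 are the two price series.
    prices_a = joint_df.iloc[:, 1]
    prices_b = joint_df.iloc[:, 2]

    # Correlation over a sliding window of the last 'window' shared rows.
    correlation = prices_a.rolling(window).corr(prices_b)
    correlation.index = joint_df["date"]
    return correlation


# Made-up demo data: 60 days for asset_a, with asset_b skipping some days.
dates = pd.date_range("2021-01-01", periods=60).strftime("%Y-%m-%d")
asset_a = pd.DataFrame({"date": dates, "price": range(100, 160)})
asset_b = asset_a.iloc[[i for i in range(60) if i % 7 != 0]].copy()
asset_b["price"] = asset_b["price"] * 2 + 5   # moves in step with asset_a

result = calculate_correlation(asset_a, asset_b)
print(result.tail())
```

The first 29 values come out empty (NaN) because a 30-row window needs 30 rows before it can produce a number. Since we wrote the logic once as a function, comparing any other pair of assets is a single extra line.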

Well well, what do we have here?

On the 21st of Jan 2021, the initial publication date of this piece:

The correlation between Bitcoin and stocks is 0.23
The correlation between Bitcoin and gold is -0.25
The correlation between stocks and gold is -0.13

So it seems like there is a slight correlation between Bitcoin and stocks (positive) and between Bitcoin and gold (negative). Bear in mind that these values tend to wax and wane with market movements. In fact, just a month ago the correlation between Bitcoin and gold was a measly 0.04.

Poor gold. It’s had a down month while Bitcoin and stocks continue to climb, as seen by the negative correlation it has to both Bitcoin and stocks.

Note on learning to code: This by no means aims to be a comprehensive Python resource. Rather, it is a useful starting point for anyone curious about coding who wants a push to get started. Learning through doing is the most effective way to learn how to code, and by the end of the tutorial you’ll have a valuable tool at your disposal that can help you make investment decisions.

If you want to take the next step towards Python mastery, I recommend getting your start on Codecademy.

TL;DR But I Wanna Run The Code Section

So you’re the instant gratification type and you need your script now? Me too!

Here’s what you gotta do to run the script yourself and calculate the correlation between assets straight away.

  1. Download Anaconda — the Python suite that’s been referred to as the ‘Thor Hammer’ for all things data.
  2. Open your command line/terminal and type in pip install yfinance
  3. Download the full code for the script over here.
  4. Open the script using Spyder — the Python IDE (Integrated Development Environment) that comes with Anaconda.
  5. Press F5 to run the script.
  6. Head to the variable explorer and click on the correlation_btc_spy variable. This will give you the correlation between Bitcoin & stocks over time. Then click on the correlation_btc_gold variable to get the correlation between Bitcoin & gold.
  7. Go back and read the ENTIRE tutorial so that you’ll be able to adapt the code and find the correlation between any two assets.
  8. Drink beer, you deserve one.

Destroying Your Developer Envy

I am not a developer. Throughout my career in tech, that point has been made abundantly clear to me. Developers, especially ‘ninja developers’ who can master numerous coding disciplines at once, usually sit at the top of the totem pole in any tech company. Additionally, whenever I’ve gone looking for a job there seemed to be 5 developer-related positions for every 1 non-technical gig.

You can see why my developer envy used to be quite strong, and I’m sure there are many of you out there who relate to the frustration of being second-class citizens in your respective fields.

It was through projects like this that I was able to overcome my developer envy and position myself more competitively in today’s saturated job market.

I’m STILL not a developer, nor am I likely to ever be a developer. My skillset and passions lie elsewhere.

But by learning to code I was able to imbue my content writing, business analytics and research capabilities with an edge that many others may not have. Automating your processes through code will free you up to extract even more value and make even more impact with the hours in your day.

I hope to keep up this educational series as a way of empowering both myself and the next wave of technically literate professionals.

David (Dudu) Azaraf
Coinmonks

Crypto chassid musing on Torah, technology and the intersection that lies between. Ancient wisdom📜 for a futuristic generation 🤖.