Crypto portfolio optimization with Python and Tensorflow — Matrix calculus approach

Photo by Jonathan Mast on Unsplash

Breathe deep because today we are going to dive into the land of portfolio optimization. We will use fancy tools around the Python ecosystem, Financial Risk Modeling and a bit of Machine Learning to build a crypto portfolio optimizer.

Ready?

Tooling

If you are not already familiar with Python, do yourself a favor and follow a good tutorial to get up and running.

Although it is not strictly necessary, I recommend having a look at Jupyter Notebooks. They just bring the Read-Eval-Print Loop to a whole new level, by combining the capabilities of a text editor, an interactive console and a visualization dashboard.

Jupyter Notebook example (from Toyplot)

The downside is that the whole environment has to be installed locally and Python versions and some of its libraries may not always play well together.

Fortunately, Google introduced Colab in Google I/O 2018. It is an online flavor of Jupyter, integrated in Google Drive and featuring Tensorflow support for CPU and GPU execution!

Libraries

So let’s create a blank notebook and start with our optimizer. First, we need to import the following libraries:

import json
import requests
import pandas as pd
import numpy as np
import tensorflow as tf
# import matplotlib.pyplot as plt

json and requests are self-explanatory. pandas is very useful for working with data series and data analysis. numpy is the Swiss Army knife of scientific computing; we will mainly use it to work with matrices. tensorflow will be used to optimize the weights of our portfolio, and matplotlib (commented out above until we need it) lets us visualize data in our notebook.

Fetching historic data

To start working with data, we need to retrieve it from an API endpoint. In our case, we will use CryptoCompare. Copy this code into Colab and run it. Let’s get some data and arrange it:

Note that in addition to fetching and parsing JSON data, we are using Pandas to turn the time column into a Python date and index by it. This allows us to filter rows by any boolean criteria and deal with date conditions very easily.
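As a rough sketch of that step, the snippet below parses a payload into a date-indexed data frame. The endpoint URL, the fsym/tsym/limit parameters, the "Data" key and the field names are assumptions about the CryptoCompare API, and parse_history and fetch_history are hypothetical helper names, so check the official documentation before relying on any of them. The example itself runs offline on a made-up payload.

```python
import pandas as pd

# Assumed endpoint; verify it against the CryptoCompare documentation.
HISTODAY_URL = "https://min-api.cryptocompare.com/data/histoday"


def parse_history(payload):
    """Turn a decoded JSON payload into a data frame indexed by date."""
    frame = pd.DataFrame(payload["Data"])
    # 'time' is assumed to be a Unix timestamp expressed in seconds.
    frame["time"] = pd.to_datetime(frame["time"], unit="s")
    return frame.set_index("time")


def fetch_history(coin, currency="USD", days=365):
    """Download the daily history of one coin (performs a network request)."""
    import requests  # imported here so the offline example runs without it

    params = {"fsym": coin, "tsym": currency, "limit": days}
    response = requests.get(HISTODAY_URL, params=params, timeout=30)
    response.raise_for_status()
    return parse_history(response.json())


# Offline example with a made-up payload in the assumed shape.
sample_payload = {
    "Data": [
        {"time": 1514764800, "open": 100.0, "close": 110.0},
        {"time": 1514851200, "open": 110.0, "close": 99.0},
        {"time": 1514937600, "open": 99.0, "close": 108.9},
    ]
}
sample_history = parse_history(sample_payload)

# Boolean filtering on the date index.
recent = sample_history[sample_history.index >= pd.Timestamp("2018-01-02")]
```

With network access, something like coin_history = {coin: fetch_history(coin) for coin in coins} would fill a dictionary of data frames, assuming the endpoint behaves as described.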

In a REPL environment, variables are displayed by just typing them. In the code above, we populated the coin_history dictionary with Pandas data frames. So let’s add a new code block in Colab, type an expression like coin_history['BTC'] and hit Enter:

Et voilà, here is an excerpt of the BTC history, nicely printed to our web console. Now that we are comfortable with Colab, let’s step into the financial yard.

Returns

Next, we need to add a couple of columns with values derived from the current columns: the (daily) return and the excess return.

The return of an asset is the ratio of the price variation to the initial price.

The excess return of a given observation is the difference between the actual return and the average return over the whole period.

Normally, we would need to write a loop and compute the values for every individual row. However, Pandas lets us operate on whole columns with one-liners like the ones below:

Several things just happened here:

  • We added the return and excess_return columns to every row of our historic data frames
  • We stored the average return of every asset
  • We stored the cumulative return of every price history we have
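A minimal sketch of those three steps for a single coin, using made-up prices. The open and close column names are assumptions about the fetched data, and compounding the period returns is just one way to obtain a cumulative return. Note that the sign convention of the excess return does not affect the covariances computed later.

```python
import pandas as pd

# Made-up prices standing in for one entry of coin_history.
history = pd.DataFrame(
    {"open": [100.0, 110.0, 99.0], "close": [110.0, 99.0, 108.9]},
    index=pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03"]),
)

# Return of each period: price variation divided by the initial price.
history["return"] = (history["close"] - history["open"]) / history["open"]

# Average return of the asset over the whole history.
average_return = history["return"].mean()

# Excess return: how far each return sits from the average.
history["excess_return"] = history["return"] - average_return

# Cumulative return over the whole history (compounded period returns).
cumulative_return = (1 + history["return"]).prod() - 1
```

Each assignment works on the whole column at once, so no explicit loop over rows is needed.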

Now, average_returns and cumulative_returns contain the following values:

Sweet. Now it’s time to combine the excess returns into the n × k Excess Return Matrix X, where n is the number of observations (days, weeks, months, …) and k is the number of assets in our portfolio.

Excess Return Matrix

Arranging the excess return columns into a matrix can be achieved in just a few lines of Python:

The first line creates an n × k matrix and the loops assign the corresponding values to each coin.

However, if we print the values of the matrix we will get the raw content of the numpy array, which is not very meaningful.

Let’s wrap the matrix into a Pandas DataFrame, set an appropriate index (dates) and column titles (coin names):
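One possible way to write both steps, sketched with made-up excess returns for two coins. The names excess_returns, excess_matrix and pretty_matrix are hypothetical; in the article the values would come from the excess_return column of each coin_history entry.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03"])

# Made-up excess returns per coin, one value per observation.
excess_returns = {
    "BTC": [0.02, -0.01, -0.01],
    "ETH": [0.03, 0.00, -0.03],
}
coins = list(excess_returns)

# n x k matrix: one row per observation, one column per asset.
excess_matrix = np.zeros((len(dates), len(coins)))
for column, coin in enumerate(coins):
    excess_matrix[:, column] = excess_returns[coin]

# Labelled view of the same numbers, easier to read than the raw array.
pretty_matrix = pd.DataFrame(excess_matrix, index=dates, columns=coins)
```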

Easier to assess, isn’t it?

Risk Modeling

There are many approaches that can be used to optimize a portfolio. In the present article we will analyze the variance and covariance of individual assets in order to minimize the global risk.

To this end, we will use our Excess Return Matrix to compute the Variance-covariance Matrix Σ from it.

Variance-covariance Matrix

Computing it is as simple as can be:
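A minimal NumPy sketch, assuming the matrix form Σ = XᵀX / n with made-up data. Whether the original notebook divides by n or by n − 1 is not shown here, so treat the divisor as an assumption.

```python
import numpy as np

# Made-up n x k Excess Return Matrix (each column already has zero mean).
X = np.array([
    [0.02, 0.03],
    [-0.01, 0.00],
    [-0.01, -0.03],
])
n = X.shape[0]

# Variance-covariance matrix: X transposed times X, divided by n.
var_covar_matrix = X.T @ X / n

# Cross-check against NumPy's population covariance (bias=True divides by n).
reference = np.cov(X, rowvar=False, bias=True)
```

Because the columns of X are already centered, the matrix product and np.cov agree.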

Properly displayed, it looks like:

In this matrix:

  • cov x, y is the covariance between asset X and asset Y
  • When x = y the value is the variance of the asset

Before we can jump into the actual portfolio optimization, our next target is the Correlation Matrix, where every item is defined like:

The correlation between assets X and Y is their covariance divided by the product of their standard deviations.

We already have cov(X, Y) stored in var_covar_matrix so we need a k × k matrix with the products of each standard deviation.

Let’s compute the individual standard deviations:

With NumPy, no sooner said than done. Here are the values:

To generate the matrix with the standard deviation products, we multiply the above by its transpose.

And here it is:

So now, we can finally compute the Correlation Matrix, as we defined before.
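The three steps above can be sketched with NumPy as follows. The data is made up, and std_products is a hypothetical name for the matrix of standard deviation products.

```python
import numpy as np

# Made-up Excess Return Matrix and its variance-covariance matrix.
X = np.array([
    [0.02, 0.03],
    [-0.01, 0.00],
    [-0.01, -0.03],
])
var_covar_matrix = X.T @ X / X.shape[0]

# Standard deviations: square roots of the diagonal, kept as a k x 1 column.
std_deviations = np.sqrt(np.diag(var_covar_matrix)).reshape(-1, 1)

# k x k matrix whose (x, y) entry is std(x) * std(y).
std_products = std_deviations @ std_deviations.T

# Element-wise division: covariance over the product of standard deviations.
correlation_matrix = var_covar_matrix / std_products
```

The result should match np.corrcoef(X, rowvar=False), which is a handy sanity check.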

What does it look like?

In the Correlation Matrix:

  • The correlation of an asset’s returns with itself is always 1
  • Correlation values range from –1 to 1
  • Values tending to 1 mean that two random variables tend to have a positive linear relationship
  • Correlation values tending to –1 (anticorrelation) mean that two assets tend to have opposite behaviors
  • Correlation values of 0 mean that two random variables have no linear relationship (they are uncorrelated, which does not necessarily mean independent)
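As a small caveat on the last bullet, a correlation of 0 only rules out a linear relationship. In this made-up example, y is fully determined by x and the correlation is still zero:

```python
import numpy as np

# Symmetric values around zero.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# y depends entirely on x, so the two series are clearly not independent.
y = x ** 2

# Off-diagonal entry of the 2 x 2 correlation matrix.
correlation = np.corrcoef(x, y)[0, 1]
```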

Portfolio optimization

Given the average return and the variance of our assets, now it’s time to decide how much money is allocated in each one.

At this point, we would like to find a combination of investments that minimizes the global variance of the portfolio.

The weights array is the output we aim to get from our portfolio optimizer. The weight of every asset can range from 0 to 1, and the overall sum must be 1.

Given the weights array, we can define the weighted standard deviation as:

So the global variance of our portfolio can now be defined as:

Where W is a 1 × k matrix with the weighted standard deviations, C is the Correlation Matrix described above and the result is a 1 × 1 matrix with the global portfolio variance.
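A quick numerical sketch with two made-up assets. Here the weighted standard deviations are stored as a k × 1 column, as the TensorFlow code later in the article does, so the product is written as transpose(W) · C · W. All values are illustrative.

```python
import numpy as np

# Made-up standard deviations, correlations and weights for two assets.
std_deviations = np.array([[0.2], [0.3]])
correlation_matrix = np.array([[1.0, 0.5], [0.5, 1.0]])
weights = np.array([[0.5], [0.5]])

# Weighted standard deviation of each asset (k x 1).
weighted_std_devs = weights * std_deviations

# Global portfolio variance (1 x 1) and volatility.
portfolio_variance = weighted_std_devs.T @ correlation_matrix @ weighted_std_devs
portfolio_volatility = np.sqrt(portfolio_variance)

# Same quantity from the variance-covariance matrix and the plain weights.
var_covar_matrix = correlation_matrix * (std_deviations @ std_deviations.T)
variance_check = weights.T @ var_covar_matrix @ weights
```

Both routes give the same number because each covariance is the correlation multiplied by the two standard deviations.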

This is the value that we want to minimize, but how can we do it? 
We could define a function that computes the global variance for a given weights array, evaluate every possible candidate and rank them.

However, brute force does not scale. If we tried every combination of weights on a fixed grid, the amount of calculations would grow exponentially with k. Waiting 10³⁰ centuries to get an answer doesn’t look like an appealing scenario, does it?

So the best alternative in our hands is to let a Machine Learning optimizer do the search for us: starting from an initial guess, it repeatedly adjusts the weights in the direction that lowers the target value, so it only needs to visit a tiny fraction of the search space.

Tensorflow

This is where Tensorflow comes into play. TensorFlow is an open-source Machine Learning framework originally developed by Google.

The main difference with most libraries is that, instead of performing operations directly, it provides a set of methods to describe them, so that training and optimization can be performed on them. In TensorFlow 1.x, the version used in this article, operations like tf.multiply(x, y) do not return the actual result, but an operation that can later be run and evaluated in a session.

A session runs a graph of interrelated operations and variables, and the work can be dispatched to CPUs, GPUs and specialized TPUs. This allows us to describe operations on variables (instead of numeric values) and let the framework minimize the target value for us.

To get started with Tensorflow, take a good tutorial, play with it and come back when you are ready.

So, let’s go back to Colab and code a first approach to the problem:

Let’s drill the function down, step by step.

The first block defines the mathematical operations that produce the global volatility. coin_weights is a k × 1 variable array with an equal amount allocated by default and from it, we get the weighted standard deviations array:

coin_weights = tf.Variable(np.full((len(coins), 1), 1.0 / len(coins)))
weighted_std_devs = tf.multiply(coin_weights, std_deviations)

With it, we can describe the matrix multiplication, in three steps:

product_1 = tf.transpose(weighted_std_devs)
product_2 = tf.matmul(product_1, correlation_matrix)
portfolio_variance = tf.matmul(product_2, weighted_std_devs)
portfolio_volatility = tf.sqrt(portfolio_variance)

Remember that these instructions are not computing anything yet. They create a graph of operations that will be executed when we run them in a session.

Running portfolio_volatility would trigger the portfolio_variance tensor it depends on, which in turn needs weighted_std_devs and product_2, and so on.

Next, we initialize the variables and define what optimizer should be used to generate solutions in each training step:

init = tf.global_variables_initializer()
train_step = tf.train.GradientDescentOptimizer(learn_rate).minimize(portfolio_volatility)

Finally, we create a session, run the variable initializer operation and train the optimizer with the operation we just defined.

with tf.Session() as sess:
    sess.run(init)
    for i in range(steps):
        sess.run(train_step)

Note that train_step is the operation returned by .minimize(portfolio_volatility), which depends on the matrix multiplications we defined before. These operations will be evaluated at each step, right after the optimizer populates the coin_weights variable with new values.

So let’s check what we get:

No big surprise so far. We start with 1 / k (around 14%) of each asset allocated but the optimizer quickly learns that the way to minimize volatility is by… not allocating anything at all. Eventually the weights would converge to zero. Ta da!
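The collapse described above can be reproduced without TensorFlow. The following is a plain NumPy sketch of unconstrained gradient descent on the volatility, not the article's TensorFlow code. The variance-covariance matrix, the learning rate and the step count are made-up values.

```python
import numpy as np

# Made-up variance-covariance matrix for two assets.
var_covar_matrix = np.array([[0.04, 0.01], [0.01, 0.09]])


def volatility(weights):
    """Portfolio volatility sqrt(w' S w) for a flat weights vector."""
    return float(np.sqrt(weights @ var_covar_matrix @ weights))


# Start with an equal share allocated to each asset.
weights = np.array([0.5, 0.5])
initial_volatility = volatility(weights)
learn_rate = 0.01

for _ in range(100):
    # Gradient of sqrt(w' S w) with respect to w is (S w) / sqrt(w' S w).
    gradient = var_covar_matrix @ weights / volatility(weights)
    weights = weights - learn_rate * gradient

final_volatility = volatility(weights)
```

After the loop the volatility is lower, but only because the weights have shrunk and no longer add up to 1, which is exactly the problem the constraints have to solve.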

We obviously have to force values to range from 0 to 1, and make their sum be equal to 1.

Constraints

In Tensorflow we can’t assign values to variables like var = val. Values can be declared via tf.constant(...), as a tf.Variable(...) with a default value, or be assigned via my_tensor.assign(...), which creates an operation that needs to be run.

Let’s add the coin_weights constraints and run them after each training step:

Running zero_minimum_op triggers the assignment of 0 to the coin_weights elements that are negative, while unity_max_op does the same for numbers above one, replacing them with 1. unity_sum_op ensures that the sum of all values equals 1.

Finally, constraints_op groups all three operations into one, and sess.run(constraints_op) triggers the group of operations.
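The effect of the three constraints can be illustrated with NumPy. This is a sketch of the idea, not the TensorFlow operations themselves. apply_constraints is a hypothetical helper, and dividing by the sum is an assumption about how unity_sum_op normalizes the weights.

```python
import numpy as np


def apply_constraints(weights):
    """Clip weights into [0, 1], then rescale them so that they add up to 1."""
    clipped = np.clip(weights, 0.0, 1.0)
    # Assumes at least one weight is positive; otherwise the sum would be zero.
    return clipped / clipped.sum()


# Made-up weights that violate every constraint.
raw_weights = np.array([-0.2, 0.5, 1.3])
constrained_weights = apply_constraints(raw_weights)
```

Here the negative weight becomes 0, the weight above one is capped at 1, and the rescaling leaves one third and two thirds for the remaining two assets.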

Let’s see how well we do now.

Again, no surprise. USDT is a stablecoin, which is directly tied to the price of the US Dollar. It may have tiny fluctuations, but obviously the optimizer learns that the least volatility is achieved by putting all the eggs in the stablecoin basket.

Stablecoins allow investors to hold their money in a crypto wallet without the need to exchange into fiat currency. In scenarios of a bearish market, it obviously makes sense to sell into USDT and enter again when the trend reverses.

But if we look at cumulative_returns we can see that all the non-stablecoins have nice positive returns, and we don’t want to miss them. So, we just wrote our first portfolio optimizer, but for it to be useful we need to add returns into the equation, not just volatility.

Sharpe ratio

The Sharpe Ratio is one of the most used metrics in the Modern Portfolio Theory. It combines both magnitudes we want into a simple formula:

Where rp is the return of the portfolio, rf is the risk-free rate and sigma is the standard deviation (volatility) of the portfolio.

In crypto, the risk-free rate would correspond to the return of a stablecoin like USDT, but since its long-term average return is zero, our formula could be simplified into:

So let’s get back to Colab and describe the Sharpe Ratio so we can optimize it:
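Before moving to TensorFlow, the ratio can be checked numerically. This NumPy sketch assumes that the portfolio return is the weighted sum of the asset returns. The sharpe_ratio helper and every number in it are hypothetical.

```python
import numpy as np


def sharpe_ratio(weights, asset_returns, std_deviations, correlation_matrix,
                 risk_free_rate=0.0):
    """(portfolio return - risk-free rate) / portfolio volatility."""
    # Assumption: the portfolio return is the weighted sum of asset returns.
    portfolio_return = float(weights @ asset_returns)
    weighted_std_devs = weights * std_deviations
    volatility = float(
        np.sqrt(weighted_std_devs @ correlation_matrix @ weighted_std_devs)
    )
    return (portfolio_return - risk_free_rate) / volatility


# Made-up figures for two uncorrelated assets.
weights = np.array([0.5, 0.5])
asset_returns = np.array([0.10, 0.20])
std_deviations = np.array([0.2, 0.3])
correlation_matrix = np.eye(2)

# Simplified form with a risk-free rate of zero.
ratio = sharpe_ratio(weights, asset_returns, std_deviations, correlation_matrix)

# Same portfolio with a non-zero risk-free rate, for comparison.
ratio_with_rf = sharpe_ratio(
    weights, asset_returns, std_deviations, correlation_matrix, risk_free_rate=0.05
)
```

A higher ratio means more return per unit of volatility, which is why this is the target to maximize.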

sharpe_ratio is now the target that we want to maximize. However, the optimizer we use only supports minimization, so we will achieve the same by minimizing its negative value:

Optimizer(...).minimize(tf.negative(sharpe_ratio))

Let’s also initialize coin_weights with random values instead of an equal share of our capital:

coin_weights = tf.Variable(tf.random_uniform((len(coins), 1), dtype=tf.float64))

Ready, let’s print the progress within the loop and see:

if i % 2000 == 0:
    print("[round {:d}]".format(i))
    # print("Coin weights", sess.run(coin_weights)) # if needed
    print("Volatility: {:.2f} %".format(portfolio_volatility.eval() * 100))
    print("Return {:.2f} %".format(sess.run(portfolio_return)*100))
    print("Sharpe ratio", sess.run(sharpe_ratio))
    print("")
# ...

That’s it! So given the behavior since mid-2017, the optimizer is telling us to invest ~80% in Stellar Lumens and spread the rest across Ethereum, Litecoin, IOTA, Ripple and USDT, given a past return of 6x and a volatility of 0.1%.

Now that we have the bare bones of our portfolio optimizer working, let’s leave it here for today and continue in part #2. There are many improvements to explore, and in the next episode I would like to cover:

  • Different optimizer algorithms, learning rates or iteration counts.
  • Better handling of the constraints
  • Allowing short trades
  • Keeping track of many portfolios
  • Chart and plotting artifacts
  • Playing with train and test timeframes

Feel free to play with the Python Notebook:

Wrap up

We just wrote a Cryptocurrency Portfolio Optimizer using Python, Tensorflow and Financial Risk Modeling. In the past, writing an equivalent tool would have taken a lot of research and development time.

The tool that we have explored allows us to find optimal portfolios given the past performance of a set of assets. Machine Learning is not intended to find the absolute and exact answer to a problem. It is rather a very good tool to find close approximations in a very reasonable amount of time.

It is important to remember that a Portfolio Optimizer is just an assessment tool, not the “one true answer”. Every single result must be analyzed and interpreted by a human. We definitely need to check the price charts and market capitalization of each coin and do enough fundamental analysis (news, roadmap, team) before stepping into the ring.

Cryptocurrencies are wildly volatile compared to traditional securities. Do your own research and consider using this tool with more stable assets if your investment strategy is not aligned with such volatilities.

Investing in crypto involves substantial risk of loss. Past results are no indication of future performance.

Please, write the above disclaimer in a big banner and never forget to read it twice before you commit to an investment.

I know the article was deep and technical, but if you found it useful, please clap your hands 👏👏, comment below ✏️ and share your 💙💛 so I can make more people happy with new articles in the future 🙂🤞.