Pairs Trading for Algorithmic Trading: Breakdown

Bart Chrzaszcz
Aug 27, 2017 · 18 min read

Hey there!

This is the first part of a series documenting my experiments with implementing common algorithmic trading techniques over the next several months. I go over what I learned and how you can implement the algorithms yourself. In this part, I implemented a pairs trading algorithm and partially succeeded: it makes money, but it is extremely volatile and still needs a lot of work! I briefly cover what algorithmic trading is, what pairs trading is, and how you can implement it yourself on Quantopian and improve upon it. With my model and choice of pairs, you can get the following results over the 13-year period from 2004 to 2017 in Quantopian's simulation.

Be aware that this algorithm is not perfect and has several drawbacks, such as a dangerous drawdown and a fairly high beta. It is helpful as an introduction to pairs trading, but not something I would enter a Quantopian competition with or use real money on. A large drawdown means that over some span of time, you were down 66% when comparing your returns at their peak to your returns at their trough. With those kinds of losses, it's very hard to tell yourself "my simulation said it will bounce back so I won't pull the plug" before you lose it all. If you can get the drawdown to a more reasonable 15%, then it's easier to make that judgment call.

What is Algorithmic Trading?

Algorithmic trading, in its most basic sense, consists of traders, sometimes called quants, implementing various mathematical models to trade different types of securities. While these algorithms are sometimes used by high-frequency traders on extremely fast computers with direct connections to the exchange to lower latency, you can also create algorithms that work over longer horizons. Two common strategies in this area, statistical arbitrage and mean reversion, can be easily implemented by any individual.

Statistical Arbitrage

In this strategy, you look at two or more different securities and try to exploit pricing differences between them. As an analogy outside of trading: companies are able to decrease expenses by utilizing other companies' cheaper goods or labour, increasing profits through specialization.

Back to trading! One interesting arbitrage strategy is called triangular arbitrage. In FOREX markets, you would expect all currency prices to be consistent with one another. As in, for example, if 1 USD = 2 CAD and 2 CAD = 3 EUR, then 1 USD should get you exactly 3 EUR. But this is not always the case; sometimes the global FOREX market is not fast enough to update all the prices when the ratio between two of the three currencies changes. Here is a better example of how this occurs:

Triangular arbitrage example. Source.
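To make the mechanics concrete, here is a toy check with completely made-up exchange rates (in practice, real quotes, fees, and execution speed are what make this hard):

# Illustrative only: hypothetical quotes, not real market data.
usd_cad = 2.00   # 1 USD buys 2.00 CAD
cad_eur = 1.50   # 1 CAD buys 1.50 EUR
eur_usd = 0.35   # 1 EUR buys 0.35 USD

# Convert 1 USD around the triangle: USD -> CAD -> EUR -> USD.
final_usd = 1.0 * usd_cad * cad_eur * eur_usd

if final_usd > 1.0:
    print("Arbitrage: 1 USD round-trips to %.4f USD" % final_usd)
else:
    print("No arbitrage: round trip yields %.4f USD" % final_usd)

With these numbers the round trip yields 1.05 USD, a 5% riskless profit, which is exactly the inconsistency a triangular arbitrageur races to capture before prices update.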

Mean Reversion

Given the following stock price history of AAPL over a single day:

You can see how it goes up and down a lot, but looking back over time, you could fit a nice, smooth curve that acts like the average/mean price. What this technique exploits is that if the price gets too far above the mean at some point in time (i.e. hits the ceiling), it will be pressured by the market to revert back to the mean. The same applies when the price gets too low and hits the floor. Hence this technique mainly exploits the tendency of prices to revert back to a general curve, known as the mean, without straying too far away too quickly. If you're interested, I recommend reading about Bollinger Bands, which share some similar characteristics.
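As a quick illustration of the idea (a sketch, not part of the final algorithm; prices is a stand-in pandas Series of closing prices), you can flag points that stray too far from a rolling mean, which is essentially what Bollinger Bands do:

import pandas as pd

def mean_reversion_signal(prices, window=20, width=2.0):
    """Flag prices that stray too far from their rolling mean."""
    mean = prices.rolling(window).mean()
    std = prices.rolling(window).std()
    upper = mean + width * std   # the "ceiling"
    lower = mean - width * std   # the "floor"
    # +1: price hit the floor, expect a move back up toward the mean.
    # -1: price hit the ceiling, expect a move back down toward the mean.
    signal = pd.Series(0, index=prices.index)
    signal[prices < lower] = 1
    signal[prices > upper] = -1
    return signal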

Scalping

As an aside, this third technique is very hard to make a decent profit on today, but it is very interesting nonetheless! Traders who employ it take one particular security and try to profit from the spread between the bid and ask prices. They open and close positions within minutes or even seconds, trying to get in right as a price change starts and out right as it is about to stop. I won't focus much on this technique since today it's pretty hard to implement due to various market changes over the years, and since it requires utmost discipline and courage, but you can read more about it here:

Pairs Trading

This technique implements a lot of the concepts from both statistical arbitrage and mean reversion. As an overview, it takes two securities, determines whether they are cointegrated and correlated, and then makes trades when one of the securities doesn’t follow the movements of the other.

Cointegration & Correlation

It's pretty easy to confuse the two terms through their raw textbook definitions:

Correlated: when two securities move together in the same direction or opposite direction.

Cointegrated: when the distance between the pair doesn’t change drastically over time.

This can be easily seen in the graphical example below by Gekko Quant:

See how the two prices are moving together but the spread/gap between them keeps growing: correlated, but not cointegrated.
See how the spread between the two oscillates around a constant level but the prices aren't really moving together: cointegrated, but not correlated.

Mathematically, the two measures come together in the technique of finding a stationary time series formed from a linear combination of the pair of securities.

Source.

As you can see, a stationary time series in trading terms is one whose mean and variance don't change much over time. However, note that stationary != mean reverting: a flat horizontal line is a stationary time series, yet it never deviates from its mean, so there is nothing to revert.

As seen in the Gekko Quant graphical examples, we can use the spread between the two securities to determine whether they are cointegrated and correlated.


Algorithm Implementation in Quantopian

There are various specific tests/measures that need to be used to determine whether you can make a pairs trade, but instead of explaining them all up front, I will explain each one in depth once we get to its chunk of code.

Imports

You’ve likely seen Numpy and Pandas before, and the Statsmodels library is for the various tests we’ll use to determine whether a pair’s spread creates a stationary time series.
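For reference, the imports amount to something like this (aliases may differ slightly from my exact code):

# Quantopian provides its own API functions (schedule_function(),
# order_target(), etc.) as built-ins in its IDE, so only the
# numerical/statistical libraries need importing.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.tsa.stattools as ts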

Initialize

This function, which is part of Quantopian's API, is called once at the very beginning, before any trading starts. The context parameter is a Python object (used much like a dictionary) on which you define all the fields you'd like to pass throughout your algorithm. For each possible pair you would like to test, you create the following entry in the context.asset_pairs list:

[
    stock_1,
    stock_2,
    {
        'in_short': False,             # are we currently short the spread?
        'in_long': False,              # are we currently long the spread?
        'spread': np.array([]),        # running history of the pair's spread
        'hedge_history': np.array([])  # running history of hedge ratios
    }
]

where stock_1 and stock_2 are calls to the symbol() or sid() function. symbol() takes a company's ticker as a string, while sid() takes the unique security ID of the company, which never changes throughout the life of a public security. Since the same ticker can exist on multiple exchanges, it is best to use the SID.

  • context.z_back tells us how many trading days must pass before we can start referring to the Z-score of a pair. More specifically, the spread Numpy array in each pair entry must have at least 20 entries before we can start using the Z-score. I'll get to why this is needed and what the Z-score means later.
  • The same goes for context.hedge_lag, which requires the hedge_history Numpy array to be at least 2 entries long.
  • Finally, context.z_entry says how large (or small) the Z-score needs to be before we consider opening a position on the pair.

The last three lines consist of functions that are specific to the Quantopian API (a sketch of the whole initialize() follows this list):

  • schedule_function() lets you specify when you want one of your functions to execute. In this case, I want my_handle_data to execute every day 4 hours before the market closes for the day.
  • set_slippage() lets you determine how strongly your orders affect the market. For example, placing bids raises the price while placing asks lowers it. It also determines whether your order can execute: if you place an order for 100,000 shares but the company only trades 50,000 shares per day, the chances the entire order will be filled are low. Likewise, if you place a large order 30 minutes before closing, there is a high probability it won't be completely filled. In my implementation, I use Quantopian's default VolumeShareSlippage model, where "the price you get is a function of your order size relative to the security's actual traded volume". I set the volume limit to at most 2.5% of a minute's trade volume and the price impact constant to 0.1. The slippage is calculated by "multiplying the price impact constant by the square of the ratio of the order to the total volume".
  • set_commission() lets you specify how much your trades cost. By default, Quantopian sets the cost to $0.0075 per share with a minimum of $1 per trade, but I increased the price to $0.01 per share since Interactive Brokers charges $1 per 100 shares.
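Putting these pieces together, a rough sketch of initialize() with the settings described above looks like this (the XOM/CVX pair and the z_entry value are placeholders of mine; pick your own):

def initialize(context):
    # Each entry: [stock_1, stock_2, per-pair state dictionary].
    context.asset_pairs = [
        [symbol('XOM'), symbol('CVX'), {   # placeholder pair; prefer sid()
            'in_short': False,
            'in_long': False,
            'spread': np.array([]),
            'hedge_history': np.array([])
        }]
    ]
    context.z_back = 20      # spread entries needed before using Z-scores
    context.hedge_lag = 2    # hedge_history entries needed before trading
    context.z_entry = 1.5    # example Z-score entry threshold

    # Run my_handle_data every day, 4 hours before the market closes.
    schedule_function(my_handle_data, date_rules.every_day(),
                      time_rules.market_close(hours=4))
    # Fill at most 2.5% of a minute's volume; price impact constant 0.1.
    set_slippage(slippage.VolumeShareSlippage(volume_limit=0.025,
                                              price_impact=0.1))
    # $0.01 per share with a $1 minimum per trade.
    set_commission(commission.PerShare(cost=0.01, min_trade_cost=1.0))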

You can read more about Quantopian here in the help docs:

Getting our Time Series

The time series we will be using is the combination of the two securities known as the spread. As you saw above when we discussed correlation vs. cointegration, the spread is the difference between the two price series. However, we cannot simply compute first stock's history minus second stock's history. Consider trying to find a tradable pair between Berkshire Hathaway's class A shares, worth around $270,000 each, and some penny stock: the penny stock's movements would be completely drowned out in the spread. Thus we have to normalize the prices of the two using the hedge ratio.

First, hedging is the concept of using a different security to protect your investment. For example, if you exchanged your USD for CAD and invested half of the CAD in securities on the Toronto Stock Exchange (TSX), that would give you a hedge ratio of 0.5. Similarly, here we are hedging our first security with our second security. Using the hedge ratio, we can determine their relative prices.

In our implementation, we will use Ordinary Least Squares (OLS) regression to get the hedge ratio. The benefit of using OLS here is that it takes the past prices into account.

The formula for getting the spread will be spread = price_A - hedge_ratio * price_B. I will show it again later in the code implementation.
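As a sketch (assuming two aligned price histories as Numpy arrays or pandas Series), the hedge ratio and spread can be computed like this:

import numpy as np
import statsmodels.api as sm

def get_hedge_ratio(prices_a, prices_b):
    """Regress A's prices on B's; the slope is the hedge ratio."""
    model = sm.OLS(prices_a, prices_b).fit()
    return model.params[0]

def get_spread(prices_a, prices_b, hedge):
    """spread = price_A - hedge_ratio * price_B"""
    return prices_a - hedge * prices_b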

Augmented Dickey–Fuller (ADF) Unit Root Test

Now before we move on to the trading logic, we need to go over some of the tests and measures you will use to determine whether you can make a trade. The first of these tests is the ADF test.

We need to get some statistical jargon out of the way before we get into the ADF test itself. We already went over briefly what a stationary time series is, but a more precise definition is "a stochastic process whose joint probability distribution does not change when shifted in time." A stochastic process is basically a collection of random points indexed over time (like a time series!). The joint probability distribution part means, in this case, that over time the values of those random points stay within a well-defined band. More specifically, that band has a probability distribution that could look Gaussian, like this:

Graphical representation of a joint probability distribution (Gaussian). Source.

Another term we need to go over is the unit root, and how a unit root test can help us determine whether we can trade a pair of securities. A time series contains a unit root if it shows signs of stochastic trends: points randomly increasing or decreasing in value and never going back to the predictable trend of the series.

Red: general trend. Blue: mean reverting. Green: random walk. Source.

As you can see in the time series in red above, the series seems to be trending along the dotted black line. But at some point, the series starts to drift away randomly from that dotted line and has two options: go back to the trend by following the blue line, or keep going along the green one. If the series goes back along the blue, it is mean reverting, and the time series is said to not have a unit root. But if the series continues along the green, it is not mean reverting and contains a unit root.

The null hypothesis of the ADF test, i.e. the presumed default, is that there is a unit root present in the time series. The alternative hypothesis, i.e. what the test tries to establish against the null, is that there is no unit root and the time series is mean reverting.

Putting all these terms and hypotheses together, we can see how the ADF test determines whether our time series is stationary by looking for the absence of a unit root. Now I will go over its implementation for our algorithm and the different parameters. For more information about the implementation, you can read the blog post by Python for Finance:

To determine whether we can reject the null hypothesis, we need to look at the values the ADF test produces. More specifically, in use_P() we check whether our p-value is between 0 and 0.05. This stems from the concept of statistical significance, which helps us decide whether we can reject the null hypothesis. As written here, "The null hypothesis is rejected if the p-value is less than a predetermined level, α. α is called the significance level, and is the probability of rejecting the null hypothesis given that it is true (a type I error). It is usually set at or below 5%." In our case, the value of α will be 5%. In use_critical() we also test whether our test statistic is more negative than the 5% critical value of the time series (ADF test statistics are negative, and more negative means stronger evidence against the null). The mathematics behind this are complex, but you can read a brief analysis here under the "Testing procedure" section.
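In code, the two checks boil down to something like this (a sketch using statsmodels' adfuller(); my use_P() and use_critical() helpers are folded into one function here):

import statsmodels.tsa.stattools as ts

P_MAX = 0.05  # significance level α

def passes_adf(spread):
    """True if we can reject the null hypothesis of a unit root."""
    adf = ts.adfuller(spread)
    test_stat, p_value, critical_values = adf[0], adf[1], adf[4]
    use_p = 0 < p_value < P_MAX                        # p-value check
    use_critical = test_stat < critical_values['5%']   # more negative than the 5% critical value
    return use_p and use_critical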

Hurst Exponent

The Hurst exponent mainly helps us determine whether a time series is mean reverting or not. The value H output by the Hurst formula lies between 0 and 1.

  • If H = 0.5: the time series experiences geometric Brownian motion.
  • If 0 < H < 0.5: the time series is mean reverting. The closer to 0, the more "mean reverting" it is.
  • If 0.5 < H < 1: the time series experiences positive or negative correlation with itself (i.e. autocorrelation) over a long period of time.
Graphical representation of various H values. Source.

I mentioned that when H = 0.5, the series is experiencing geometric Brownian motion, but what exactly is that? If you want to skip ahead to the code, you can simply think of it as the random walk you likely intuitively pictured. But for those of you interested, a random walk is more precisely defined than just random movement up and down over time. As an aside, I'll go over the differences between various "random walks":

  • Simple random walk: like repeated flips of a coin (i.e. a Bernoulli random variable), it's a sequence that moves either up or down by 1 unit, each with 50% probability.
  • Gaussian random walk: similar to the former, except this time the step sizes are drawn from a normal distribution with a mean of 0.
  • Brownian motion: stemming from the discussion here on page 1, it starts out like a simple random walk, but by the central limit theorem the sum of many ±1 steps, scaled by sqrt(t), converges to a normal distribution, so in the limit it behaves like a Gaussian random walk.
  • Geometric Brownian motion: similar to Brownian motion, except this time it is the logarithm of the process that experiences Brownian motion, in addition to experiencing stochastic drift (i.e. a random change in the mean).

There are more rigorous definitions you can look at online, but this is an intuitive explanation of some of the differences between various "random walks". These differences are small enough that statisticians sometimes use one over another not because it is more accurate, but because it is simpler/easier to use! Also, for historical reasons, the names sometimes get mixed up, like Brownian motion and the Wiener process.
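If you want to see the differences for yourself, here is a toy numpy simulation of three of these processes (the drift and volatility numbers are illustrative only):

import numpy as np

np.random.seed(42)
t = 252                                            # one year of daily steps

# Simple random walk: +/-1 steps with equal probability.
simple = np.cumsum(np.random.choice([-1, 1], size=t))
# Gaussian random walk: steps drawn from a normal distribution.
gaussian = np.cumsum(np.random.normal(0.0, 1.0, size=t))
# Geometric Brownian motion: the *log* of the series follows a
# Brownian motion with drift, so the series itself stays positive.
drift, vol = 0.0002, 0.01                          # per-step drift and volatility
gbm = 100.0 * np.exp(np.cumsum(np.random.normal(drift, vol, size=t)))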

“This is a simulation of the Brownian motion of 5 particles (yellow) that collide with a large set of 800 particles. The yellow particles leave 5 blue trails of random motion and one of them has a red velocity vector.” Source.

Firstly, the predefined values:

  • h_min and h_max do what you'd expect: the H value must be between 0 and 0.4 for the series to be considered (reasonably) mean reverting.
  • look_back will be used later; it describes how many days must have passed before you can start using this test, and it determines how much history, in this case the last 126 days, is passed in about the pair.
  • lag_max says how large the time lags get when calculating the Hurst exponent. A lag can simply be thought of as how far we shift a copy of the time series behind the original (a sketch of the computation follows this list).
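Putting it together, here is a commonly used estimator of the Hurst exponent (a sketch, not my exact code), which fits the scaling of lagged differences on a log-log plot:

import numpy as np

def hurst(spread, lag_max=20):
    """Estimate H from how the spread's lagged differences scale with the lag."""
    spread = np.asarray(spread)
    lags = np.arange(2, lag_max)
    # Dispersion of the differences at each lag.
    tau = [np.sqrt(np.std(spread[lag:] - spread[:-lag])) for lag in lags]
    # H is twice the slope of log(tau) against log(lag).
    return 2.0 * np.polyfit(np.log(lags), np.log(tau), 1)[0]

# The test then simply checks h_min < hurst(spread) < h_max.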

For a more in depth analysis of the formula and implementation, you can refer to the following articles:

If you’re curious, you can read more about the Hurst exponent in the following links:

Half Life

Determining whether a series is mean reverting over some time frame is not enough. What we need to determine is whether the series will exhibit mean-reverting behaviour we can exploit within a reasonable time frame. For example, we don't want to open an order on the prediction that the pair will revert back to the mean a year from now.

This is where the half life helps. By regressing the changes in the original time series against a (time-)lagged version of itself, we get a beta value (i.e. the slope/coefficient of the regression). We can then pass that to the Ornstein-Uhlenbeck process.

This process incorporates Brownian motion, except this time it is mean reverting. Its stochastic differential equation (i.e. like a regular calculus differential equation, but with one or more terms driven by a stochastic/random process) has a coefficient α called the "speed of reversion": dX_t = α(μ - X_t) dt + σ dW_t.

The speed of reversion can then be used to calculate the average time it takes the series to get halfway back to the mean (i.e. the half life!): half_life = ln(2) / α.

Firstly, the predefined constants:

  • hl_min and hl_max set the minimum and maximum acceptable values of the half life.
  • look_back is the same as with Hurst.
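A sketch of the half-life computation (not my exact code): regress the day-to-day changes of the spread on the lagged spread, then convert the slope into a half life.

import numpy as np
import statsmodels.api as sm

def half_life(spread):
    """Estimate the mean-reversion half-life via an OU-style regression."""
    spread = np.asarray(spread)
    lagged = sm.add_constant(spread[:-1])          # spread at t-1 (plus intercept)
    delta = np.diff(spread)                        # change from t-1 to t
    beta = sm.OLS(delta, lagged).fit().params[1]   # negative if mean reverting
    return -np.log(2) / beta

# The test then simply checks hl_min < half_life(spread) < hl_max.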

Trading Logic Function

Now we are ready to take the various tests we've covered and apply them to decide whether we should open a position, close a position, or do nothing at all.

The function that we put in schedule_function above in the Initialize section was my_handle_data():

The first thing we check is whether there are any open orders. If there are none, we continue and start looking at pairs in process_pair(). Because this function is long, I'll break it down into smaller sections.

First, we need to extract all the necessary values and calculate our hedge ratio so we can build the time series of the pair's spread. After extracting the values, we compute the hedge ratio and store it for later, check whether context.hedge_lag days have passed, and then calculate the spread.

We then check whether enough days have passed to use all our tests, run the tests, and check whether they all passed. If all of them tell us this pair forms a mean-reverting stationary time series, we skip the if block on lines 20-31 and move on to possibly executing a trade. However, if one of the tests tells us it doesn't, we close our position if we have one open on the pair (since we can no longer count on it reverting back to the mean), or otherwise just skip this pair for the day and try again the next day.

Now we calculate the Z-score for the given spread. The Z-score tells us how many standard deviations the current spread is away from its mean over the look-back window (i.e. z_back). Then we record some values to appear in the graph (NOTE: these records are hard-coded for my specific pairs in this implementation).
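Concretely, the Z-score computation is just a few lines (a sketch, with spread as a Numpy array of spread history):

import numpy as np

def compute_zscore(spread, z_back=20):
    """Standard deviations between today's spread and the window mean."""
    window = np.asarray(spread)[-z_back:]
    return (window[-1] - window.mean()) / window.std()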

Now that we have everything, we can decide whether we should make a trade. In the first if statement, if in_short is True, it means we already opened a position when the Z-score was positive, betting that the spread was too high and would fall back to the mean. If the Z-score then turns negative, the spread has reverted back through the mean and it is time to close the order. The same logic applies, mirrored, in the next if statement. The order_target() function is another Quantopian API function, which places an order for the specified security.

The same Z-score and long/short logic applies when we want to open an order: if the Z-score is less (greater) than our threshold and we don't already have a long (short) position open, we create an order.
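Here is a stripped-down sketch of that exit/entry logic. It assumes Quantopian's built-in order_target() and order_target_percent(); pair_data is the pair's state dictionary from initialize(), and pct_a/pct_b are the softmax weights described next (names are illustrative, not my exact code):

def trade_logic(context, pair_data, stock_a, stock_b, zscore, pct_a, pct_b):
    # Exit: the spread has crossed back through the mean.
    if pair_data['in_short'] and zscore < 0.0:
        order_target(stock_a, 0)
        order_target(stock_b, 0)
        pair_data['in_short'] = False
    elif pair_data['in_long'] and zscore > 0.0:
        order_target(stock_a, 0)
        order_target(stock_b, 0)
        pair_data['in_long'] = False
    # Entry: the spread has deviated far enough from the mean.
    elif zscore < -context.z_entry and not pair_data['in_long']:
        order_target_percent(stock_a, pct_a)    # spread too low: long A...
        order_target_percent(stock_b, -pct_b)   # ...and short B
        pair_data['in_long'] = True
    elif zscore > context.z_entry and not pair_data['in_short']:
        order_target_percent(stock_a, -pct_a)   # spread too high: short A...
        order_target_percent(stock_b, pct_b)    # ...and long B
        pair_data['in_short'] = True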

We use the softmax function to calculate what percentage of each security in the pair we should order, by passing in the hedge ratio and the relative prices. In my research, I haven't seen anyone use the softmax function to calculate the relative percentage of each share one should buy. But after using it in a deep learning project I worked on a while back, I thought I'd give it a shot. In that project, I created a convolutional neural network (CNN) to detect your emotions through your facial expressions. For example, these are the results my model produces for the following image:

Screen cap of image from webcam fed to CNN.
  • Angry: 0.00982473%
  • Fear: 0.206567%
  • Happy: 97.8469%
  • Sad: 0.934201%
  • Surprise: 0.00566902%
  • Neutral: 0.0242221%

You can see it here:

But generally, softmax is very popular in the final layers of deep learning models, where you want the model to tell you what it thinks your input is. It does this by normalizing k values to lie between 0 and 1 such that they sum to 1. As you can see, this is helpful here because it can tell us the relative percentage of each security in the pair we should get, based on how much each would cost! However, there may be some downsides to using softmax for these percentages, since it does not normalize values linearly.
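The function itself is tiny (a sketch; the [1.0, 1.3] inputs are made-up stand-ins for the values my code actually passes in):

import numpy as np

def softmax(x):
    """Normalize k values into percentages between 0 and 1 that sum to 1."""
    e = np.exp(np.asarray(x, dtype=float) - np.max(x))  # shift for numerical stability
    return e / e.sum()

# Hypothetical sizing: weight the two legs of the pair.
pct_a, pct_b = softmax([1.0, 1.3])  # 1.3 = made-up hedge ratio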

Softmax function in neural networks example. Source.

Results/Next Steps

You can find the full implementation here:

If you were to follow my exact implementation above, you would get the same returns as in the graph at the beginning of the article. However, I recommend playing around with different pairs and also different parameters. Here are some parameters I recommend playing with:

Initialize()

  • context.z_back: it's good to change this together with the min/max half life in the half-life function. The larger it is, the more values are used to calculate your mean.
  • context.z_entry: increasing this value means your pair's spread has to deviate even more from the mean before you open an order.

Half_Life()

  • hl_min and hl_max: changing these helps control how long you expect your algorithm to keep orders open (think long term vs. short term).

Hurst()

  • lag_max: as described here, it can be tricky to figure out the value you want, and it depends on the look_back window size the Hurst function uses.

ADF()

  • p_max: lets you configure just how confident the ADF test must be before rejecting the null hypothesis.

Miscellaneous

  • Configure how many tests you're using. Maybe you don't want both adf.use_P() and adf.use_critical(), but only one of them.
  • Try out different possible pairs.
  • See how the algorithm does during different timespans.

Finally, as next steps, I need to figure out how to control how much leverage my algorithm takes on and bring the drawdown under control, because currently it's borrowing a lot of money! I suspect the problem lies in my softmax function and how I use order_target_percent(). Fixing both should bring beta and drawdown down naturally. Sure, the returns will decrease, but these are the more important values to optimize first.

This shows just how much my algorithm is borrowing at certain points. If only brokers let regular individuals borrow a couple million…

With a bit more work, I feel I may even be able to start making some trades with my own tiny investment of real money! I'm still not sure what my next algorithm will be, but I'm thinking of trying a bit of ML, since Quantopian supports Scikit-learn! :)
