Neural networks for algorithmic trading: enhancing classic strategies

Alexandr Honchar
Oct 19, 2017 · 6 min read

Some readers have noticed that I calculated the Sharpe ratio incorrectly, which is true. I’ll update the article and the code as soon as possible. Meanwhile, this doesn’t change the fact that the neural network enhances the basic strategy; just take the “scale” into account.

Hello everyone! In the last five tutorials we discussed financial forecasting with artificial neural networks: we compared different architectures for financial time series forecasting, learned how to do this forecasting adequately with correct data preprocessing and regularization, made forecasts based on multivariate time series, produced really nice results for volatility forecasting, and implemented custom loss functions. In the last one we set up an experiment using data from different sources, solved two tasks with a single neural network, and optimized hyperparameters for better forecasts.

Today I want to wrap up the financial time series topic with a practical forecasting use case: we will enhance a classic moving average strategy with a neural network, show that it really improves the final outcome, and review new forecasting objectives you will most probably want to play with.

Previous posts:

  1. Correct 1D time series forecasting + backtesting
  2. Multivariate time series forecasting
  3. Volatility forecasting and custom losses
  4. Multitask and multimodal learning
  5. Hyperparameters optimization
  6. Enhancing classical strategies with neural nets
  7. Probabilistic programming and Pyro forecasts

You can check the code for training the neural network on my GitHub.

Main idea

Example of two moving averages crossing

The moving average crossover strategy, which trades whenever two moving averages cross, has one main pitfall: in flat regions the averages still cross at points where nothing actually changes, so we make trades there and lose money:

Example of flat region where moving averages are crossing
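The crossing points can be located with a few lines of pandas. This is a minimal sketch, not the code from the repository: the function name `find_crossovers` and the window lengths `short_window` and `long_window` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

def find_crossovers(close, short_window=3, long_window=5):
    """Return +1 where the short SMA crosses above the long SMA,
    -1 where it crosses below, and 0 elsewhere."""
    close = pd.Series(close, dtype=float)
    short_ma = close.rolling(short_window).mean()
    long_ma = close.rolling(long_window).mean()
    # Sign of the spread: +1 if short is above long, -1 if below,
    # NaN while the longer window is still warming up
    position = np.sign(short_ma - long_ma)
    # A crossover is a change of sign between consecutive bars
    change = position.diff()
    signals = pd.Series(0, index=close.index)
    signals[change > 0] = 1
    signals[change < 0] = -1
    return signals

# Toy prices: a decline, a rally, then another decline
prices = [12, 11, 10, 9, 8, 7, 8, 10, 12, 14, 13, 11, 9, 7]
signals = find_crossovers(prices)
crossings = signals[signals != 0]
print(crossings)
```

On noisy, flat data the same function fires frequently even though the price barely moves, which is exactly the pitfall described above.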

How can we overcome this with machine learning?

Let’s check the following strategy hypothesis: at the moments where the moving averages cross, we forecast the change in some characteristic of the series, and if we really expect a jump, we trust the trading signal. Otherwise, we skip it, because we don’t want to lose money in flat regions.
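This hypothesis amounts to a simple filter over the crossover signals. The sketch below assumes we already have, for each bar, a crossover signal, the currently observed value of the characteristic, and the model's forecast of it. The function name `filter_signals`, the `threshold` value, and the rule "trade only if the forecast differs from the current value by more than the threshold" are illustrative assumptions, not necessarily the exact rule used for the results in this post.

```python
import numpy as np

def filter_signals(signals, current_value, forecast_value, threshold=0.5):
    """Keep a crossover signal only where the forecast change is large.

    signals: array of +1 / -1 / 0 crossover signals
    current_value: characteristic (e.g. skewness) observed at each bar
    forecast_value: value the model predicts for the future window
    threshold: hypothetical minimum absolute change required to trade
    """
    signals = np.asarray(signals)
    expected_change = np.abs(np.asarray(forecast_value) - np.asarray(current_value))
    # Where the expected change is small, suppress the signal (set it to 0)
    return np.where(expected_change > threshold, signals, 0)

signals = np.array([0, 1, 0, -1, 1])
current = np.array([0.1, 0.2, 0.0, -0.3, 0.1])
forecast = np.array([0.1, 1.0, 0.0, -0.4, -0.9])
filtered = filter_signals(signals, current, forecast)
print(filtered)
```

In this toy input the sell signal at the fourth bar is dropped because the forecast barely differs from the current value, while the two buy signals survive.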

As a forecast objective I want to try skewness, a measure of the asymmetry of a distribution. Let us assume that if we forecast a change in the distribution, it means that the current trend (not only a flat region) will change in the future.

Skewness of distribution
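For reference, the sample skewness of a window of n values x_1, ..., x_n with mean x-bar is commonly defined as the standardized third central moment. Note that the pandas `skew()` method applies an additional bias correction, so its values differ slightly from this formula for small windows.

```latex
g_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^3}
           {\left(\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2\right)^{3/2}}
```

Positive values indicate a longer right tail, negative values a longer left tail, and zero a symmetric distribution.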

Input data

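import numpy as np
import pandas as pd

# highp, lowp, closep (price lists) and ROLLING (window length) are defined earlier;
# moving_average_convergence, williams_percent_r and relative_strength_index
# are helper functions defined elsewhere in the code.
half_window = ROLLING // 2  # rolling windows must be integers

# Midpoint of the rolling high and the rolling low
nine_period_high = pd.DataFrame(highp).rolling(window=half_window).max()
nine_period_low = pd.DataFrame(lowp).rolling(window=half_window).min()
ichimoku = (nine_period_high + nine_period_low) / 2
ichimoku = ichimoku.replace([np.inf, -np.inf], np.nan)
ichimoku = ichimoku.fillna(0.).values.tolist()
macd_indie = moving_average_convergence(pd.DataFrame(closep))
wpr = williams_percent_r(closep)
rsi = relative_strength_index(closep, half_window)
# Ratio of rolling std to rolling variance (equivalent to 1 / std)
volatility1 = pd.DataFrame(closep).rolling(ROLLING).std().values
volatility2 = pd.DataFrame(closep).rolling(ROLLING).var().values
volatility = volatility1 / volatility2
volatility = [v[0] for v in volatility]
# Third and fourth rolling moments of the close price
rolling_skewness = pd.DataFrame(closep).rolling(ROLLING).skew().values
rolling_kurtosis = pd.DataFrame(closep).rolling(ROLLING).kurt().values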

I concatenate the resulting indicator features with the OHLCV tuples to form the final input vector.
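Assembling the final vector might look like the sketch below. It assumes one sample per time step and that every feature has already been aligned to the same length; the function name `build_feature_matrix` and the array names `ohlcv` and `indicators` are placeholders, not the names used in the repository.

```python
import numpy as np

def build_feature_matrix(ohlcv, indicators):
    """Stack OHLCV columns and indicator columns into one
    (n_samples, n_features) matrix.

    ohlcv: array of shape (n_samples, 5): open, high, low, close, volume
    indicators: list of 1-D sequences, each of length n_samples
    """
    ohlcv = np.asarray(ohlcv, dtype=float)
    columns = [np.asarray(ind, dtype=float).reshape(-1, 1) for ind in indicators]
    features = np.hstack([ohlcv] + columns)
    # Rolling indicators are NaN during their warm-up period; replace them
    # with 0, mirroring the fillna(0.) applied to the ichimoku feature
    return np.nan_to_num(features, nan=0.0)

# Toy data: 3 time steps, 5 OHLCV columns, 2 indicator columns
ohlcv = np.arange(15, dtype=float).reshape(3, 5)
rsi = [np.nan, 55.0, 60.0]
skew = [np.nan, np.nan, 0.3]
X = build_feature_matrix(ohlcv, [rsi, skew])
print(X.shape)
```

Filling the warm-up rows with zeros is only one option; dropping those rows entirely is an equally reasonable choice.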

Network architecture

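from keras.layers import Input, Dense, GaussianNoise
from keras.models import Model
from keras.optimizers import Adam

# X holds the input vectors; one input unit per feature
main_input = Input(shape=(len(X[0]), ), name='main_input')
x = GaussianNoise(0.05)(main_input)  # noise on the input
x = Dense(64, activation='relu')(x)
x = GaussianNoise(0.05)(x)  # noise on the hidden layer output
output = Dense(1, activation="linear", name="out")(x)  # regression output
final_model = Model(inputs=[main_input], outputs=[output])
opt = Adam(learning_rate=0.002)  # written as lr=0.002 in older Keras versions
final_model.compile(optimizer=opt, loss='mse')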

The “novel” point here is adding small Gaussian noise to the input and to the output of the single hidden layer of our neural network. This can act much like L2 regularization; you can find the mathematical explanation in this amazing book.
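The connection to L2 regularization is easiest to see for a linear model with squared error, which is a simplification of the network above. Assuming zero-mean Gaussian input noise with variance sigma squared, independent of the data, the expected loss on one sample (x, y) with weights w splits into the clean loss plus a weight penalty:

```latex
\mathbb{E}_{\varepsilon}\left[\left(y - w^{\top}(x + \varepsilon)\right)^2\right]
= \left(y - w^{\top}x\right)^2
- 2\left(y - w^{\top}x\right) w^{\top}\,\mathbb{E}[\varepsilon]
+ \mathbb{E}\left[\left(w^{\top}\varepsilon\right)^2\right]
= \left(y - w^{\top}x\right)^2 + \sigma^2 \lVert w \rVert_2^2,
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)
```

The cross term vanishes because the noise has zero mean, and the last term is exactly an L2 penalty with strength sigma squared. For a nonlinear network the equivalence holds only approximately, for small noise.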

Sample from

The neural network is trained in the usual way. Let’s check whether our forecasts of skewness can improve (or not) the moving averages strategy.

We train our network on AAPL prices from 2012 to 2016 and test it on 2016–2017, as we did in one of the previous tutorials.
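A chronological split like this can be sketched with a pandas date index. The DataFrame below is synthetic, and the cutoff date `2016-01-01` is only my reading of "2012 to 2016"; the exact boundaries are not stated in this post. The function name `chronological_split` is likewise an assumption.

```python
import numpy as np
import pandas as pd

def chronological_split(frame, cutoff):
    """Split a date-indexed frame into rows strictly before `cutoff`
    (train) and rows from `cutoff` onward (test), without shuffling."""
    cutoff = pd.Timestamp(cutoff)
    train = frame[frame.index < cutoff]
    test = frame[frame.index >= cutoff]
    return train, test

# Synthetic business-day prices standing in for the real AAPL data
dates = pd.date_range("2012-01-02", "2017-06-30", freq="B")
prices = pd.DataFrame({"close": np.linspace(50.0, 150.0, len(dates))}, index=dates)
train, test = chronological_split(prices, "2016-01-01")
print(len(train), len(test))
```

Keeping the split strictly chronological matters for time series: a random split would let the model see data from the future of its test points.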

After training the network, I plotted the close prices, the moving averages, and vertical lines at the crossing points: red and orange lines mark points where we would like to trade, and green ones mark points where we had better not. It doesn’t look perfect, but let’s run a backtest to judge it.

Which moving average intersections are useful?

Results without neural network

[('Total Return', '1.66%'),
 ('Sharpe Ratio', '16.27'),
 ('Max Drawdown', '2.28%'),
 ('Drawdown Duration', '204')]
Signals: 9
Orders: 9
Fills: 9

Results of backtesting of a rolling mean strategy

Results with neural network

[('Total Return', '3.07%'),
 ('Sharpe Ratio', '27.99'),
 ('Max Drawdown', '1.91%'),
 ('Drawdown Duration', '102')]
Signals: 7
Orders: 7
Fills: 7

Results of backtesting of a strategy with use of NN
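Since the Sharpe ratios reported above are acknowledged to be miscalculated, it may help to see one conventional definition. The sketch below computes an annualized Sharpe ratio and the maximum drawdown from a series of periodic returns. It assumes daily returns, 252 trading days per year, and a zero risk-free rate by default; it is not the backtester that produced the numbers above, and the function names are my own.

```python
import numpy as np

def annualized_sharpe(returns, periods_per_year=252, risk_free_rate=0.0):
    """Annualized Sharpe ratio of periodic simple returns."""
    returns = np.asarray(returns, dtype=float)
    # Convert the annual risk-free rate to a per-period rate
    excess = returns - risk_free_rate / periods_per_year
    std = excess.std(ddof=1)
    if np.isclose(std, 0.0):
        return 0.0  # undefined for constant returns; report 0 by convention here
    return np.sqrt(periods_per_year) * excess.mean() / std

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve,
    expressed as a positive fraction."""
    growth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    equity = np.concatenate(([1.0], growth))  # start from initial capital of 1
    running_peak = np.maximum.accumulate(equity)
    drawdowns = 1.0 - equity / running_peak
    return drawdowns.max()

daily_returns = [0.01, -0.02, 0.015, 0.005, 0.01]
sharpe = annualized_sharpe(daily_returns)
mdd = max_drawdown(daily_returns)
print(round(sharpe, 2), round(mdd, 4))
```

With only a handful of trades, as in the backtests above, any Sharpe estimate is very noisy, so the comparison between the two strategies should be read with that in mind.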

Possible improvements

  • Different indicator strategies: MACD, RSI
  • Pairs trading strategies can be optimized extremely well with the proposed approach
  • Try to forecast different time series characteristics: Hurst exponent, autocorrelation coefficient, maybe other statistical moments
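Alternative targets such as the autocorrelation coefficient can be computed as rolling statistics in the same way as the skewness feature. The sketch below computes a rolling lag-1 autocorrelation with pandas; the function name `rolling_autocorr` and the window length are illustrative assumptions, and a Hurst exponent estimator is left out because it needs considerably more code.

```python
import numpy as np
import pandas as pd

def rolling_autocorr(series, window=30, lag=1):
    """Autocorrelation at the given lag, computed over a sliding window.

    The first `window - 1` values are NaN because the window is incomplete.
    """
    series = pd.Series(series, dtype=float)
    return series.rolling(window).apply(
        lambda w: pd.Series(w).autocorr(lag=lag), raw=True
    )

# A perfectly alternating series has lag-1 autocorrelation of -1
alternating = np.tile([1.0, -1.0], 10)
acf = rolling_autocorr(alternating, window=10)
print(acf.iloc[-1])
```

The resulting series can be shifted forward and used as a regression target in place of skewness.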

With this post I would like to finish (at least for a while) the topic of financial time series forecasting with neural networks. Let’s be honest: neural networks are definitely not a Holy Grail, and we can’t use them directly to predict whether the price will go up or down and make a lot of money. We considered different data sources and objectives, dealt carefully with overfitting, and optimized hyperparameters. What conclusions can we draw?

  • Be careful about overfitting! You will overfit in 99% of cases. Don’t trust values like 80% accuracy or very nice-looking plots; they are most likely a mistake
  • Try to forecast something other than close prices or returns: volatility, skewness, maybe other characteristics
  • Use multimodal learning if you have different data sources
  • Don’t forget to find the right hyperparameters!
  • Create a strategy that mixes a classical approach with one based on machine learning, and backtest it!

I hope that this series of posts was useful to someone. I will come back rather soon with new topics… Stay tuned! :)

Follow me on Facebook too for AI articles that are too short for Medium, on Instagram for personal stuff, and on LinkedIn!
