Keras & TensorFlow to Predict Market Movements and Backtest using Backtrader

Sabir Jana, CFA
Published in Analytics Vidhya
6 min read · Sep 11, 2020

In my last article, "ML Classification Algorithms to Predict Market Movements and Backtesting", we used multiple scikit-learn classification algorithms to predict the direction of stock movement. In this article, let’s use Keras & TensorFlow along with a few additional features to see if we can get better or comparable results. You can find the Jupyter notebook used in this article on my GitHub page. Our approach will be as follows:

  1. Gathering Historical Pricing Data.
  2. Feature Engineering.
  3. Building and Applying Deep Learning Model.
  4. Backtesting of Strategy using Backtrader.

Gathering Historical Pricing Data

We will continue to use the Nifty-50 index for this analysis. We will download the daily closing price data with the help of the yfinance Python library, calculate daily log returns, and derive the market direction from them. We will visualize the closing prices and daily returns to quickly check our data. As the code is identical to my last article, let’s not go through it here again.
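
As a rough illustration of the return and direction calculation described above, the sketch below uses a small made-up price series in place of the downloaded Nifty-50 data. The column names close, returns and direction match the ones used later in this article; the prices, and the choice to treat a flat day as -1, are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# hypothetical closing prices standing in for the yfinance download
stock = pd.DataFrame(
    {'close': [100.0, 102.0, 101.0, 101.0, 105.0]},
    index=pd.date_range('2020-01-01', periods=5, freq='D'),
)

# daily log return: ln(close_t / close_{t-1})
stock['returns'] = np.log(stock['close'] / stock['close'].shift(1))

# market direction: +1 for an up day, -1 otherwise
# (counting a flat day as -1 is an assumption of this sketch)
stock['direction'] = np.where(stock['returns'] > 0, 1, -1)

# the first row has no previous close, so its return is NaN; drop it
stock = stock.dropna()
print(stock)
```

Because log returns are additive, summing the returns column and applying np.exp recovers the ratio of the last price to the first.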

[Figure: Daily Pricing and Log Returns]

Feature Engineering

In addition to the five days of lagged returns used previously, let’s also use a few technical indicators: RSI (Relative Strength Index), Bollinger Bands, and the Moving Average Convergence Divergence (MACD) indicator. I have used the Python library ta-lib to calculate the technical indicators. The Python code for this section is as follows:

import pandas as pd
from talib import RSI, BBANDS, MACD

# define the number of lags
lags = [1, 2, 3, 4, 5]
# compute lagged log returns
cols = []
for lag in lags:
    col = f'rtn_lag{lag}'
    stock[col] = stock['returns'].shift(lag)
    cols.append(col)
stock.head(2)
# RSI - Relative Strength Index
stock['rsi'] = RSI(stock.close)
# append to feature columns list
cols.append('rsi')
stock.tail(2)
# Compute Bollinger Bands (the middle band is not used as a feature)
high, mid, low = BBANDS(stock.close, timeperiod=20)
stock = stock.join(pd.DataFrame({'bb_high': high, 'bb_low': low}, index=stock.index))
# append to feature columns list
cols.append('bb_high')
cols.append('bb_low')
# Compute Moving Average Convergence/Divergence (first output is the MACD line)
stock['macd'] = MACD(stock.close)[0]
# append to feature columns list
cols.append('macd')

Code commentary:

  1. Loop over the lags to calculate five days of lagged returns, and collect the column names in the list variable cols. Every feature column name used by the model is appended to this list.
  2. Compute the RSI index as an additional column to our stock dataframe.
  3. Similarly, add columns for Bollinger Bands and MACD indicators.
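
To give a feel for what two of these indicators measure, here is a minimal pandas-only sketch of Bollinger Bands and the MACD line. It is not the ta-lib implementation: the 2-standard-deviation band width and the 12/26-day EMA periods are the commonly quoted defaults and are assumptions here, the price series is a made-up random walk, and ta-lib handles the warm-up period differently, so early values will not match its output exactly.

```python
import numpy as np
import pandas as pd

# hypothetical closing prices: a random walk, for illustration only
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 250).cumsum())

# Bollinger Bands: 20-day simple moving average +/- 2 standard deviations
mid = close.rolling(20).mean()
sd = close.rolling(20).std(ddof=0)
bb_high = mid + 2 * sd
bb_low = mid - 2 * sd

# MACD line: 12-day EMA minus 26-day EMA (assumed default periods)
ema_fast = close.ewm(span=12, adjust=False).mean()
ema_slow = close.ewm(span=26, adjust=False).mean()
macd = ema_fast - ema_slow

features = pd.DataFrame({'bb_high': bb_high, 'bb_low': bb_low, 'macd': macd})
print(features.dropna().tail(3))
```

The first 19 rows of the band columns are NaN because a 20-day window is not yet full, which is why the modelling code later calls dropna() before splitting the data.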

Building and Applying Deep Learning Model

We will build a deep neural network model using the Keras & TensorFlow API. Our approach is "API hunting", which means focusing on what to use in the API and how to use it rather than on the mathematical explanation. Please refer to {link} for more on Keras & TensorFlow. The Python code for this section is as follows:

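import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# split the dataset into training and test datasets
train, test = train_test_split(stock.dropna(), test_size=0.4, shuffle=False)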
# sort the data on date index
train = train.copy().sort_index()
test = test.copy().sort_index()
# define a function to create the deep neural network model
def create_model():
    np.random.seed(100)
    tf.random.set_seed(100)
    model = Sequential()
    model.add(Dense(64, activation='relu', input_dim=len(cols)))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model
# normalize the training dataset (subtract column means, divide by column std. deviations)
mu, std = train.mean(), train.std()
train_ = (train - mu) / std
# create the model
model = create_model()
# map market direction of (1,-1) to (1,0)
train['direction_'] = np.where(train['direction'] > 0, 1, 0)
# fit the model on the training dataset
# (in the notebook this cell starts with the %%time magic, which must be the first line of its cell)
r = model.fit(train_[cols], train['direction_'], epochs=50, verbose=False)
# normalize the test dataset
mu, std = test.mean(), test.std()
test_ = (test - mu) / std
# map market direction of (1,-1) to (1,0)
test['direction_'] = np.where(test['direction'] > 0, 1, 0)
# evaluate the model with test dataset
model.evaluate(test_[cols], test['direction_'])
# predict the direction and map it (1,0)
pred = np.where(model.predict(test_[cols]) > 0.5, 1, 0)
pred[:10].flatten()

Code commentary:

  1. Split the stock dataframe created in the previous section into training and test datasets. I have kept the parameters shuffle=False and test_size=0.4. This means we use the first 60% of the dataset for training and the remaining 40% for testing.
  2. Sort the training and test datasets on the DateTime index and define a function to build the model. The parameter input_dim=len(cols) is the number of feature columns. I have used activation='relu' for the two hidden layers; however, you can explore other options. The output layer uses activation='sigmoid' because this is a binary classification problem, and for the same reason the loss function is 'binary_crossentropy'. You can experiment with the optimizer.
  3. Next, we normalize the training dataset and create the model by calling the function created in the previous step.
  4. Map the market direction of (1,-1) to (1,0) for the training dataset and fit the model with x as normalized feature columns and y as market direction.
  5. Next, normalize the test dataset and map the market direction of (1,-1) to (1,0) for the test dataset and evaluate the model with the test dataset.
  6. Predict the market direction with feature columns of the normalized test dataset and map the predicted value to (1,0) based on whether it is greater or less than 0.5.
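
The data-handling steps above (normalizing, remapping the labels, and thresholding the predictions) can be sketched without training a network. In the example below the feature values and the probabilities are made up; the proba array is only a stand-in for what model.predict would return.

```python
import numpy as np
import pandas as pd

# hypothetical feature values and directions, for illustration only
train = pd.DataFrame({'rsi': [30.0, 50.0, 70.0, 50.0],
                      'macd': [-1.0, 0.0, 1.0, 0.0],
                      'direction': [1, -1, 1, -1]})
cols = ['rsi', 'macd']

# z-score normalization: subtract the column mean, divide by the column std
mu, std = train[cols].mean(), train[cols].std()
train_ = (train[cols] - mu) / std

# map direction labels from (1, -1) to (1, 0) for binary cross-entropy
y = np.where(train['direction'] > 0, 1, 0)

# hypothetical sigmoid outputs, standing in for model.predict(...)
proba = np.array([[0.62], [0.48], [0.50], [0.91]])
# strictly greater than 0.5 counts as "up"; exactly 0.5 maps to 0
pred = np.where(proba > 0.5, 1, 0).flatten()
print(train_.round(3), y, pred, sep='\n')
```

After normalization each feature column has mean 0 and standard deviation 1. A common alternative, which is not what the code in this article does, is to reuse the training mu and std when scaling the test set so that no test-period information enters the scaling.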

Now, based on our prediction we will calculate the portfolio position and return for the strategy, and visualize the cumulative returns for buy and hold vs. strategy returns.

# based on prediction calculate the position for strategy
test['position_strategy'] = np.where(pred > 0, 1, -1)
# calculate daily returns for the strategy
test['strategy_return'] = test['position_strategy'] * test['returns']
# calculate total return and std. deviation of each strategy
print('\nTotal Returns:')
print(test[['returns', 'strategy_return']].sum().apply(np.exp))
print('\nAnnual Volatility:')
print(test[['returns', 'strategy_return']].std() * 252 ** 0.5)
# number of trades over time for the strategy
print('Number of trades = ', (test['position_strategy'].diff()!=0).sum())
# plot cumulative returns
fig, ax = plt.subplots(1, 1, sharex=True, figsize = (14,6))
ax.plot(test.returns.cumsum().apply(np.exp), label = 'Nifty 50 Buy and Hold')
ax.plot(test.strategy_return.cumsum().apply(np.exp), label = 'Strategy')
ax.set(title = 'Nifty 50 Buy and Hold vs. Strategy', ylabel = 'Cumulative Returns')
ax.grid(True)
ax.legend()
plt.savefig('images/chart2');

Code commentary:

  1. Calculate portfolio position such that if the prediction is greater than zero we are long (1) else short (-1).
  2. Calculate the portfolio return by multiplying the position by the actual daily returns.
  3. Calculate the total return, the annual standard deviation, and the number of trades.
  4. Visualize the cumulative strategy vs. buy and hold return for the Nifty-50.
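
The arithmetic behind this vectorized backtest can be checked on a tiny example. The four returns and predictions below are made up; the column names follow the code above.

```python
import numpy as np
import pandas as pd

# hypothetical daily log returns and 0/1 predictions, for illustration only
test = pd.DataFrame({'returns': [0.01, -0.02, 0.015, -0.005]})
pred = np.array([1, 0, 1, 1])

# long (+1) when the model predicts up, short (-1) otherwise
test['position_strategy'] = np.where(pred > 0, 1, -1)

# a short position earns the negative of the market's log return
test['strategy_return'] = test['position_strategy'] * test['returns']

# log returns add up over time; exp() converts the sum to a growth factor
total = test[['returns', 'strategy_return']].sum().apply(np.exp)
print(total)
```

Here the market's log returns sum to zero, so buy and hold ends where it started, while the strategy gains on the second day by being short on a down day. Note that nothing in this calculation charges for trades or restricts short selling.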

[Figure: Nifty 50 Buy and Hold vs. Strategy]
[Figure: Total Return and Annual Standard Deviation]

Wow! The outcome looks quite impressive. The total return for our strategy is several times that of the ‘buy and hold’ approach. However, as I discussed in the last article, this is a paper return and doesn’t account for any constraints.

Backtesting of Strategy using Backtrader

Let’s add some constraints and reality to the backtest with the help of Backtrader. I will keep the backtesting strategy identical to the one used in the last article, "ML Classification Algorithms to Predict Market Movements and Backtesting", so I will not go through the approach and code again. To recap:

  1. We start with an initial capital of 100,000 and a trading commission of 0.1%.
  2. We buy when the predicted value is +1 and sell (only if stock is in possession) when the predicted value is -1.
  3. All-in strategy — when creating a buy order, buy as many shares as possible.
  4. Short selling is not allowed.
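
To see how these four rules change the arithmetic, here is a toy pure-Python simulation. It is not Backtrader and does not reproduce its order handling or execution timing; run_long_only is a hypothetical helper, and the prices and signals are made up.

```python
import math

def run_long_only(prices, signals, cash=100_000.0, commission=0.001):
    """Toy all-in, long-only simulation (a hypothetical helper, not Backtrader)."""
    shares = 0
    for price, signal in zip(prices, signals):
        if signal == 1 and shares == 0:
            # all-in: buy as many whole shares as cash allows after commission
            shares = math.floor(cash / (price * (1 + commission)))
            cash -= shares * price * (1 + commission)
        elif signal == -1 and shares > 0:
            # sell only what we hold; short selling is not allowed
            cash += shares * price * (1 - commission)
            shares = 0
    # value any open position at the last price
    return cash + shares * prices[-1]

prices = [100.0, 110.0, 105.0, 120.0]
signals = [1, -1, -1, 1]
print(round(run_long_only(prices, signals), 2))
```

A -1 signal while holding no shares does nothing, so the strategy cannot profit from falling prices the way the vectorized version does, and every round trip costs commission on both legs.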

[Figure: Strategy vs Benchmark (Buy and Hold)]
[Figure: Strategy Performance of Test Data]

Let’s analyze the performance of our strategy. The annual return is negative and the cumulative return is -33.26%, compared with the more than 16-fold return observed during vectorized backtesting. If we visualize a few other parameters, we can see that our strategy has not performed well with the added constraints of ‘no short selling’ and trading commissions. In conclusion, things may look great on paper with no constraints; however, the reality can be entirely different once we account for constraints and the feasibility of the strategy in the real market.

Happy investing and do leave your comments on the article!

Please Note: This analysis is only for educational purposes and the author is not liable for any of your investment decisions.

References:

  1. Python for Finance 2e: Mastering Data-Driven Finance by Yves Hilpisch
  2. Python for Finance Cookbook: Over 50 recipes for applying modern Python libraries to financial data analysis by Eryk Lewinson
  3. Machine Learning for Algorithmic Trading by Stefan Jansen
