Training RL models for financial trading using TensorTrade on Amazon SageMaker Studio

Martin Paradesi
4 min read · Jul 9, 2020


Introduction

Reinforcement Learning (RL) is a branch of Machine Learning in which an agent learns to achieve an objective by interacting with an environment. Although the concept of using RL for financial trading has been around for a while [1, 2], there has recently been a surge in articles published on this topic [3, 4, 5, 6, and a few other blog and Medium posts].

TensorTrade is an open-source Python framework for building, training, evaluating, and deploying robust trading algorithms using RL [7]. It was originally built by Adam King for trading cryptocurrencies. More recently, kodiakcrypto has contributed a few features in their GitHub fork, which includes a working example notebook.

Amazon SageMaker Studio is an IDE for Machine Learning that was announced at AWS re:Invent 2019 [8]. It offers several features, such as Notebooks, Experiments, Model Monitor, Autopilot, and Debugger. One feature I use frequently is the ability to switch the underlying instance type, which can reduce the time it takes to train an RL model. A screenshot of switching from an existing ml.g4dn.xlarge instance to an ml.t3.medium instance is shown below.

Workflow

An example notebook that trains an RL model on SPY historical data using TensorTrade is available in my GitHub fork. A video of running this notebook on Amazon SageMaker Studio is provided below.

Some of the key takeaways from this effort were:

TensorTrade for ETFs/stocks

A high-level overview of TensorTrade and an example workflow are available in this Towards Data Science article. The example workflow demonstrates trading strategies for Bitcoin (BTC) and Ethereum (ETH); more specifically, BTC and ETH are declared as instruments that can be traded. Recently, the following ETFs/stocks were added as instruments in kodiakcrypto's GitHub fork: AAPL, MSFT, TSLA, AMZN, and SPY. Additional ETFs/stocks can be added to the instruments.py file, as sketched below.
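A minimal sketch of what such an addition might look like follows. The exact import path depends on the TensorTrade version (older releases expose Instrument under tensortrade.instruments), and the QQQ entry is a hypothetical example that is not part of the fork.

```python
# Sketch of entries that could be appended to instruments.py to declare
# additional ETF/stock instruments. Recent TensorTrade releases expose the
# class under `tensortrade.oms.instruments`; older ones use `tensortrade.instruments`.
from tensortrade.oms.instruments import Instrument

# Instrument(symbol, precision, name): precision is the number of decimal
# places used when sizing quantities of the instrument.
SPY = Instrument("SPY", 2, "SPDR S&P 500 ETF Trust")
QQQ = Instrument("QQQ", 2, "Invesco QQQ Trust")  # hypothetical addition, not in the fork
```

Once defined, such an instrument can be used in wallets and exchange price streams in the same way as the built-in BTC and ETH instruments.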

Data acquisition from Amazon S3 or Yahoo Finance

By default, the example workflow fetches hourly cryptocurrency trade data from the CryptoDataDownload website. I've added a couple of data acquisition options in my GitHub fork to fetch data from either an Amazon S3 bucket or Yahoo Finance, as shown in the screenshot below. Yahoo Finance provides historical data at daily, weekly, or monthly frequencies, while users can store historical data of any frequency (hourly, daily, weekly, etc.) in an Amazon S3 bucket.
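The sketch below illustrates both acquisition paths under a few assumptions: it uses the yfinance package for Yahoo Finance and pandas with s3fs for S3, and the helper names, bucket, and key are placeholders rather than the exact code in the fork.

```python
# Illustrative data acquisition helpers; function names, bucket, and key are placeholders.
import pandas as pd
import yfinance as yf  # assumes the yfinance package is installed


def load_from_yahoo(symbol: str = "SPY") -> pd.DataFrame:
    """Fetch daily OHLCV history for a symbol from Yahoo Finance."""
    df = yf.download(symbol, start="2010-01-01", interval="1d", progress=False)
    # Normalize column names to lowercase (open, high, low, close, adj close, volume).
    return df.rename(columns=str.lower)


def load_from_s3(bucket: str = "my-trading-data", key: str = "SPY_daily.csv") -> pd.DataFrame:
    """Read previously stored historical data from an Amazon S3 bucket (requires s3fs)."""
    # Assumes the CSV has a 'date' column to use as the index.
    return pd.read_csv(f"s3://{bucket}/{key}", parse_dates=["date"], index_col="date")


data = load_from_yahoo("SPY")
```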

Comparison of Amazon SageMaker ML instance types

I trained the deep Q-network (DQN) agent [9] on two instance types in Amazon SageMaker Studio: ml.g4dn.xlarge (4 vCPUs, 1 GPU, and 16 GB memory) and ml.t3.medium (2 vCPUs and 4 GB memory). To have a fair comparison between the instance types, I ran the notebook with the Python 3 (TensorFlow 2 CPU Optimized) kernel on both. I varied the combinations of training steps (25/50) and episodes (25/50), and the results are shown below.

Comparison of SageMaker instance types

It appears that training an RL model on an ml.g4dn.xlarge instance takes about half the time it would take on an ml.t3.medium instance.
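For reference, the training call whose runtime was measured looks roughly like the sketch below. The environment construction is omitted (build_environment is a hypothetical placeholder for the data feed, portfolio, action scheme, and reward scheme set up in the example notebook), and the DQNAgent API shown follows TensorTrade's built-in agents at the time of writing, so it may differ in other versions.

```python
# Rough sketch of the RL training step that was timed above.
from tensortrade.agents import DQNAgent

# Placeholder: see the example notebook for the full trading environment setup.
env = build_environment()

agent = DQNAgent(env)

# The comparison varied these two knobs across 25/50 steps and 25/50 episodes.
agent.train(n_steps=50, n_episodes=50, save_path="agents/")
```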

Conclusion

This article described the process of training an RL model on SPY historical data using TensorTrade on Amazon SageMaker Studio. Items to investigate in future efforts include training multiple RL models with different reward schemes, using ParallelDQNAgent to speed up RL model training, and deploying the “most promising” RL model on Amazon SageMaker Hosting Services to obtain inferences on future data.

Bibliography

[1] Moody, J., Wu, L., Liao, Y. & Saffell, M. (1998), ‘Performance functions and reinforcement learning for trading systems and portfolios’, Journal of Forecasting 17, 441–470.

[2] Nevmyvaka, Y., Feng, Y. & Kearns, M. (2006), ‘Reinforcement learning for optimized trade execution’, Proceedings of the 23rd International Conference on Machine Learning (ICML 2006), 673–680.

[3] Necchi, P. (2016), ‘Policy gradient algorithms for the asset allocation problem’, Master’s thesis, Politecnico di Milano.

[4] Bacoyannis, V., Glukhov, V., Jin, T., Kochems, J. & Song, D. (2018), ‘Idiosyncrasies and challenges of data driven learning in electronic trading’, NIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy and Privacy.

[5] Huang, C. (2018), ‘Financial Trading as a Game: A Deep Reinforcement Learning Approach’, arXiv:1807.02787.

[6] Zhang, Z., Zohren, S. & Roberts, S. (2020), ‘Deep Reinforcement Learning for Trading’, The Journal of Financial Data Science, 2(2), 25–40.

[7] TensorTrade — GitHub, Medium, Discord and Gitter.

[8] Amazon SageMaker Studio announcement — https://aws.amazon.com/blogs/aws/amazon-sagemaker-studio-the-first-fully-integrated-development-environment-for-machine-learning/

[9] Mnih, V., Kavukcuoglu, K., Silver, D. et al. (2015), ‘Human-level control through deep reinforcement learning’, Nature 518, 529–533.
