LLMs and Algorithmic Trading
In the past year, the profound successes of Transformer Neural Networks, especially Generative Pre-trained Transformers like OpenAI’s GPT-4 and Meta’s Llama 2, have sparked the imagination of those in the financial industry. At recent traditional finance conferences I’ve attended, participants have been abuzz with speculation about how these state-of-the-art AI models could disrupt conventional investing and trading. This post aims to provide insights on how advanced machine learning models might integrate into the world of algorithmic trading, drawing from my experience in the fields of computer science, machine learning, and quantitative finance.
I received my bachelor’s and master’s degrees in computer science from Harvard in 2010, with focuses on artificial intelligence and electronic commerce, and I spent the following eleven years designing algorithmic trading systems at high-frequency trading firms. My thesis at Harvard dealt with applying machine learning to predict future moves in the game of Go, the ancient turn-based strategy board game (https://www.semanticscholar.org/paper/Move-Prediction-in-the-Game-of-Go-Harrison/8731cf873d36171e0d2ec895184aaf659e315abe). At the time I began my research, AI had notorious difficulties with Go. The landmark achievement of AlphaGo defeating a lower-ranking professional human player without any handicap was still seven years away.
Compared to the general problem of designing artificial intelligence to play the game of Go, the problem of move prediction is a more straightforward supervised learning task. Several prevailing machine learning algorithms existed, but all had a similar flavor:
- Feature Selection: Hand-pick a set of features based on knowledge of Go game rules and strategy (approximate territory counts, chain liberty counts, stones captured, etc.).
- Model Assumption: Select a probabilistic estimator that evaluates the strength of a board position given the computed feature vector.
- Training and Backtesting: Compute the empirical values of the estimator using a training set of actual professional Go game records, and evaluate how the model performs on an unseen portion of the data set.
At the time, the strongest-performing methods for the second step were a combination of Bradley-Terry scoring of feature vectors (a ranking algorithm similar to Elo) and Monte Carlo simulations of game outcomes from the given board position. These models, although achieving modest move prediction accuracies of about 35%, were computationally demanding and suffered from overfitting as the dimensionality of the feature vectors increased.
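For intuition, here is a minimal sketch of the Bradley-Terry flavor of move scoring, in the spirit of the generalized "team of features" formulations from that literature. The feature names and gamma values below are illustrative inventions, not trained parameters:

```python
from math import prod

# Learned per-feature strengths ("gammas"), analogous to Elo ratings.
# These values are illustrative placeholders, not trained parameters.
GAMMA = {
    "captures_stone": 8.0,
    "extends_liberties": 2.5,
    "matches_local_pattern": 4.0,
    "on_first_line": 0.3,
}

def move_strength(active_features):
    # A move's strength is the product of the gammas of its active features.
    return prod(GAMMA[f] for f in active_features)

def move_probabilities(candidates):
    # Each legal move "competes" against the rest: the Bradley-Terry model
    # assigns P(move i) = strength_i / sum_j strength_j.
    strengths = {move: move_strength(fs) for move, fs in candidates.items()}
    total = sum(strengths.values())
    return {move: s / total for move, s in strengths.items()}

print(move_probabilities({
    "D4": ["matches_local_pattern", "extends_liberties"],
    "Q16": ["captures_stone"],
    "A1": ["on_first_line"],
}))
```

Fitting the gammas is the expensive part: it typically requires iterating over every position in the training corpus many times.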
My approach to the problem was different. By treating the curated set of features as conditionally independent and using a sort of inverted Naive Bayes algorithm, I achieved similar accuracies but with significantly less computational strain.
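The thesis linked above has the actual formulation; the following is only a rough illustration of the Naive Bayes flavor, with invented feature probabilities. Each candidate move is scored by the log-odds that a move with its features would be the professional's choice, assuming the features are conditionally independent:

```python
import math

# P(feature fires | move was played) vs. P(feature fires | move was not played),
# as would be estimated by counting over professional game records.
# The values here are invented for illustration.
P_GIVEN_PLAYED = {"captures_stone": 0.30, "extends_liberties": 0.55}
P_GIVEN_NOT_PLAYED = {"captures_stone": 0.02, "extends_liberties": 0.20}

def log_odds(active_features):
    # Naive Bayes: with conditionally independent features, the log-odds
    # is just a sum of per-feature log-likelihood ratios.
    return sum(
        math.log(P_GIVEN_PLAYED[f] / P_GIVEN_NOT_PLAYED[f])
        for f in active_features
    )

def predict(candidates):
    # candidates: dict mapping move -> list of active feature names.
    return max(candidates, key=lambda move: log_odds(candidates[move]))
```

Training reduces to counting feature frequencies over the game records, which is why the computational cost is so much lower than iteratively fitting a Bradley-Terry model.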
A year later, when I began developing automated trading systems for Jane Street Capital, I saw parallels between the most effective models in quantitative finance and those that produced the best results in Go move prediction. Most algorithmic (and even non-algorithmic) trading approaches hinge on very similar core principles:
- Feature Selection: Traders choose a set of features they believe will predict the movement of a financial instrument. These could be based on expressions of fungible relationships, such as the relationship between an ADR, its underlying foreign stock, and the relevant FX rate (see the sketch after this list). Or they could be derived from statistical relationships between the time series of related instruments in a similar sector or index. Or they could even be based on instrument-agnostic microstructure features, such as the imbalance of resting sizes in the central limit order book.
- Model Assumption: A mathematical relationship is assumed between the chosen features and the expected market movement, either directly through understanding the fundamental composition of the instrument, or indirectly through choice of some kind of statistical or machine learning model.
- Training and Backtesting: Weights are assigned to the selected features, either manually or through techniques like regression or hill climbing. The model is then backtested on unseen data and, if deemed fit, deployed.
- Continuous Monitoring: If the model falters or ceases to perform, traders delve deep into the data to ascertain the reasons. This step is crucial to ensure adaptability and long-term performance.
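To make these steps concrete, here is a deliberately toy sketch of the full loop, using synthetic data and a hypothetical ADR whose fair value is pinned to its foreign ordinary by an FX rate. A real system would add fees, borrow costs, latency modeling, and far more careful statistics:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Feature selection: a fungibility-based feature. The ADR's fair value is
# foreign_price * fx_rate / adr_ratio; the feature is the signed gap
# between the ADR's market price and that fair value.
foreign_price = 100 + np.cumsum(rng.normal(0, 0.10, n))
fx_rate = 1.30 + np.cumsum(rng.normal(0, 0.0005, n))
adr_ratio = 2.0
fair_value = foreign_price * fx_rate / adr_ratio
adr_price = fair_value + rng.normal(0, 0.05, n)  # noisy tracking of fair value
gap = adr_price - fair_value                     # the mispricing feature

# Model assumption: the next ADR price move is linear in the current gap.
X = gap[:-1]
y = np.diff(adr_price)

# Training and backtesting: fit one weight on the first half of the data,
# then evaluate the signal on the unseen second half.
split = n // 2
weight = np.polyfit(X[:split], y[:split], 1)[0]
signal = weight * X[split:]
toy_pnl = np.sum(np.sign(signal) * y[split:])    # unit position per tick
print(f"fitted weight: {weight:.3f}, out-of-sample toy P&L: {toy_pnl:.2f}")
```

Continuous monitoring, the fourth step, is exactly what this toy omits: in production, the fitted weight would be re-examined whenever realized performance drifts from the backtest.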
Aside from the clear overlap between these principles and those that guide supervised machine learning, I observed another surprising connection between the two fields that is hard to appreciate from outside the secure walls of HFT: the most profitable trading models tended to be extremely simple. They relied on a small number of intuitive features, expressed intuitive linear relationships between instruments, and produced intuitive trading decisions, in a way that was analogous to my Naive Bayes move predictor. To the extent that larger numbers of features were explored in the model design process, many were cut by the time the models were deployed in production, or the dimensionality was reduced through techniques such as PCA.
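Dimensionality reduction of that kind needs very little machinery. A minimal PCA sketch, assuming only a plain NumPy feature matrix:

```python
import numpy as np

def pca_reduce(X, k):
    # Project the (samples x features) matrix X onto its top-k principal
    # components, a standard way to shrink a bloated feature set.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = directions
    return Xc @ Vt[:k].T

# e.g. compress 50 candidate features down to their 3 strongest directions
X = np.random.default_rng(1).normal(size=(1000, 50))
print(pca_reduce(X, 3).shape)  # (1000, 3)
```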
These two distinct but ultimately related sets of experiences have informed how I view the possibilities of Large Language Models (LLMs) like GPT either enhancing or replacing various processes in algorithmic trading.
Due to the fundamental mismatch between the natures of financial market data and linguistically derived data, the LLM will not be the machine learning model that achieves self-trained, fully autonomous trading capabilities. Instead, LLMs will primarily be used to amplify the abilities of humans to perform the core tasks needed for algorithmic trading: feature selection, model assumption, training and backtesting, and continuous monitoring.
Without a technical background in artificial neural networks and quantitative trading, it might be easy to envision that something akin to GPT-4 could function in a financial context as an omniscient oracle: a magical black box that combines all the above steps into one, with market data flowing in through the ingress and execution decisions and profits dispatched from the egress.
LLMs, however, are exactly what it says on the tin: useful at tokenizing corpora of data that have language-like properties. That encompasses a vast number of naturally occurring data domains, whether Yoruba or English, C++ or Nim, political treatises or poetry, or even descriptions of games like Chess or Go (for fun, go ask ChatGPT to play Go with you).
But the pseudo-random time series of financial instruments do not behave like language. They are dynamic, stochastic systems that rarely exhibit the kind of self-similarity a sequence model can exploit. Time-series prediction accuracy is often highly sensitive to small perturbations in model outputs, whereas LLMs typically operate under the assumption of a large space of acceptable answers. The opacity of LLMs also makes the continuous monitoring that trading requires difficult: if a billion-parameter model fails to achieve the desired market performance, how do you “fix” it? And at present, LLMs aren’t reliably capable of basic arithmetic, let alone the precise formulae involved in high-risk investment scenarios (cf. https://x.com/littmath/status/1708327886866862202).
With all of these limitations understood, LLMs can be immediately employed in many of the individual processes that make up successful algorithmic trading strategies. The core principles of trading model generation described above are almost all inherently language-based, and so LLMs are a natural fit to amplify existing workflows in these areas. Here are several examples:
- Feature Selection: LLMs can delve into the granular complexities of data, assisting in identifying and vetting potential predictive variables, often on a qualitative rather than a quantitative basis given the precision the latter requires.
- Data Scrubbing: LLMs can be adept at pinpointing irregularities and anomalies in vast quantities of data, suggesting refinements to ensure data purity and consistency.
- Sentiment Analysis: Capitalizing on their core strength, which is parsing and understanding language, LLMs can derive market sentiment from news articles, corporate filings, tweets, and other textual data sources (a minimal sketch follows this list).
- Code Generation: For actual practitioners of automated or semi-automated trading, the art and science of writing correct and efficient programs for computing live and historical trading decisions can be as difficult as model selection, or more so, and LLMs are already highly useful in this area.
- Production Reporting: LLMs can quickly summarize portfolio positions, execution histories, and even trading system error messages to produce human-readable reports, rather than relying on either manual work or time-consuming scripting tasks.
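As a concrete illustration of the sentiment-analysis item above, here is a minimal sketch using the OpenAI Python client. The prompt, model choice, and one-word label scheme are my own illustrative assumptions, not a production design:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def headline_sentiment(headline: str) -> str:
    # Ask the model to classify a headline's sentiment toward the named
    # company; constraining the reply to one word keeps parsing trivial.
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {"role": "system",
             "content": "You are a financial news analyst. Reply with exactly "
                        "one word: bullish, bearish, or neutral."},
            {"role": "user", "content": headline},
        ],
    )
    return response.choices[0].message.content.strip().lower()

print(headline_sentiment("Acme Corp beats earnings estimates, raises guidance"))
```

In practice one would batch headlines, validate the returned label, and treat the output as one noisy feature among many rather than as a trading signal in its own right.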
While the potential of Transformers in capital markets is significant, it’s essential to appreciate them not as a panacea but as sophisticated tools that, combined with human ingenuity, can elevate trading strategies and provide operational leverage to those employing them. As with move prediction in Go, or traditional HFT, the simple, intuitive uses of mathematics and computer science have more pragmatic value than any attempt to throw the newest technology available at a well-understood problem.