Are Stock Markets As Unpredictable As We Think?
While many challenges stand in the way of algo-trading, advanced AI tech can help you beat the markets.
The rise of machine learning and omnipresent AI poses quite a variety of questions for humanity as it enters an era when everything around us has the word “smart” in its name. How do we make sure that smart machines do not leave humans out of jobs and, thus, without the means to sustain themselves? How do we maintain people’s privacy at a time when data, the fuel of AI, is such a major business enabler? Will we ever get to the point where advanced mathematics can approximate human cognition, with all of its virtues and drawbacks, and how do we remain human in the age of singularity?
These are all good and valid questions to mull over, but there are also dozens of more pragmatic things to consider. One of them is the way AI technology is transforming the financial sector, and here one of the most difficult yet most alluring challenges for machine learning is this: can AIs predict the stock market with an efficiency that leaves conventional investment strategies behind? Can AIs beat the stock market?
No, they cannot, argues a recent article published by Bloomberg, or at least not in the near future. The market data fed to the AIs is nonstationary, evolving on the go. There is more noise than signal, much to the confusion of AIs trying to filter out the former and focus on the latter. And there are dozens of details and intricacies that a model has to account for before investors can make consistent profits.
Now, these are all good and solid points, definitely worth taking into consideration in any discussion of the future of AI in trading. However, here at I Know First, we believe that markets are, on the contrary, largely predictable. An accuracy of 100% may indeed be mythical at this point, but consistent returns on AI-driven trading are still very much possible, as suggested by the examples cited in the original article and our own estimates like this one. Furthermore, you do not have to be a Wall Street moneybag to gain access to advanced algorithmic stock predictions and make a profit off them. So what is it that makes the markets more predictable than we might think? Let us find out.
Data Keeps Changing, So Do The Models
As rightfully noted in the original article, financial market data is prone to change. This is true for interest rates, true for the earnings and expenses of various publicly traded companies, true for hundreds of other factors that a skilled financial advisor would look at. All of these can have an impact on the dynamics of various stocks, sectors and markets, and are thus relevant for a machine learning algorithm.
Now, in machine learning in all of its shapes and forms, one of the fundamental decisions developers face is the bias-variance tradeoff. In essence, it comes down to the following question: where is the point beyond which a model, trained on dataset A, becomes so tailored to this specific dataset that it can no longer adequately process new data? Where is the fine line between a model that is too general and one that is too specific?
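To make the tradeoff concrete, here is a minimal Python sketch (using NumPy and made-up data, not real market figures) of a flexible model fitting its training set more closely than a simple one, which is exactly the behavior that can backfire on fresh data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dataset A": a linear trend plus noise (made-up numbers, not market data).
x_train = np.linspace(0.0, 1.0, 30)
y_train = 2.0 * x_train + rng.normal(0.0, 0.1, size=30)

# A rigid, high-bias model (degree 1) versus a flexible, high-variance one (degree 9).
coef_simple = np.polyfit(x_train, y_train, deg=1)
coef_flexible = np.polyfit(x_train, y_train, deg=9)

def mse(coef, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coef, x) - y) ** 2))

# The flexible model always fits the training data at least as closely...
train_err_simple = mse(coef_simple, x_train, y_train)
train_err_flexible = mse(coef_flexible, x_train, y_train)

# ...but on fresh draws from the same process it typically does no better,
# and often worse: that gap is the overfitting the tradeoff guards against.
x_new = rng.uniform(0.0, 1.0, size=30)
y_new = 2.0 * x_new + rng.normal(0.0, 0.1, size=30)
new_err_simple = mse(coef_simple, x_new, y_new)
new_err_flexible = mse(coef_flexible, x_new, y_new)
```

Comparing `new_err_flexible` with `new_err_simple` on fresh draws typically shows where the extra flexibility stops paying off.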
When dealing with a nonstationary dataset, this issue is of special importance, because a future change of tides can be very confusing for AIs. For example, an AI trained on a historic dataset covering a long bullish period can lose all touch with reality when bears run amok. With stock markets, you often have to expect the unexpected and brace for less-than-likely scenarios, and statistical inference is not too good with that… Well, not unless you do some tinkering with the chaos theory, but we will keep this point for later.
For now, the issue is that AIs do not do well with data that tends to evolve on the go — and that is why they have to evolve with it.
Enter reinforcement learning, a technique mimicking the way we humans interact with our world as we explore it. The idea is as follows: an AI tries to maximize a reward (a numeric signal representing the outcome its designers want it to achieve) through interactions with its environment. After each interaction, it reconfigures its models based on the information obtained and tries again. The technique is popular in the AI sphere: for example, Google engineers used this approach to teach a model to walk and find its way through a digital obstacle course.
In a similar fashion, our stock-predicting AI can learn from its own successes and failures, reconfiguring its approximations of the inner workings of the market every day as it is fed new data. It is still important to take the historical perspective into account, of course, to avoid overfitting the AI to what is happening today, but this is largely a matter of fine-tuning the cost function. That is, at least as long as the AI in question has been trained on a set of historical market data rather than thrown into the fray right off the bat.
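As an illustration only, the interact-reward-reconfigure loop described above can be sketched as an epsilon-greedy bandit choosing between three hypothetical strategies. The reward numbers here are invented for the sketch; in a real system the reward would be realized profit and loss:

```python
import random

random.seed(42)

# Three toy "strategies" with invented average rewards (not market data).
true_means = [0.1, 0.5, 0.3]

estimates = [0.0, 0.0, 0.0]  # the AI's current approximation of each reward
counts = [0, 0, 0]
EPSILON = 0.1  # fraction of the time spent exploring rather than exploiting

def pull(arm):
    """Noisy reward for choosing a strategy: a stand-in for market feedback."""
    return random.gauss(true_means[arm], 0.1)

for _ in range(5000):
    if random.random() < EPSILON:
        arm = random.randrange(len(true_means))                        # explore
    else:
        arm = max(range(len(true_means)), key=lambda a: estimates[a])  # exploit
    reward = pull(arm)
    counts[arm] += 1
    # Incremental mean update: reconfigure the estimate after each interaction.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

best = max(range(len(true_means)), key=lambda a: estimates[a])
```

After enough interactions, the agent concentrates its choices on the strategy with the highest estimated reward, without ever being told which one that is in advance.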
Thus, we would argue that while an AI essentially has to include a reinforcement learning component to keep pace with market dynamics, such a component lets the model cope with nonstationary data well enough to remain commercially relevant.
Too Much Noise? Chaos Theory To The Rescue
As rightfully noted in the original article, on top of being prone to change, stock market data tends to be quite noisy. In other words, it is not always immediately clear why exactly stock B is behaving the way it is today, and this can indeed be confusing for AIs trying to find trends and patterns in the buzz. As one might point out, trading is not always a fully rational process: when tensions run high, investment decisions can be made on pure emotion, or on a rumor that later proves to be fake. Alternatively, an event along the lines of 9/11 could happen, sending shockwaves through the markets, or a couple of lines of computer code at the stock exchange could trigger a massive sell-off.
Sounds a bit chaotic, doesn’t it? Well, that’s because it is.
Here, it might be worth noting what a chaotic system is. A chaotic dynamic system is one that can be thrown off balance by a relatively small event, which may not even look that relevant at first sight. Remember the famous metaphor, a butterfly in Brazil producing a tornado in Texas through nothing more than peacefully flapping its wings? This is exactly what we are talking about here. Stock markets, where thousands of actors shape the price dynamics with their at times less-than-rational decisions, are quite easy to imagine as such complex dynamic systems.
Going back to the sudden nosedives and surges, we can say that, statistically speaking, such events are of very low probability, belonging to the far tails of a normal distribution. In a chaotic system, however, these events are not that improbable, as everyone who works with the stock markets has probably noticed. This means that our predictive model cannot assume they come from a normal distribution; what we are dealing with is a fat-tailed one, where extreme scenarios are more likely. To account for this, we rely on fractal time series analysis rather than classical methods that assume normality. All of this helps us filter the noise out.
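One illustrative tool from the fractal toolbox is the Hurst exponent: values near 0.5 suggest uncorrelated noise, while values closer to 1 suggest persistent, trending behavior. Below is a rough rescaled-range (R/S) sketch on synthetic data, offered as an assumption-laden illustration rather than a description of our production method:

```python
import numpy as np

def hurst_rs(series):
    """Rough Hurst exponent estimate via rescaled-range (R/S) analysis."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    log_sizes, log_rs = [], []
    for w in (n // 2, n // 4, n // 8, n // 16, n // 32):
        rs_values = []
        for start in range(0, n - w + 1, w):
            chunk = series[start:start + w]
            deviations = np.cumsum(chunk - chunk.mean())
            spread = deviations.max() - deviations.min()   # the "R"
            scale = chunk.std()                            # the "S"
            if scale > 0:
                rs_values.append(spread / scale)
        log_sizes.append(np.log(w))
        log_rs.append(np.log(np.mean(rs_values)))
    # The Hurst exponent is the slope of log(R/S) against log(window size).
    slope, _ = np.polyfit(log_sizes, log_rs, 1)
    return float(slope)

rng = np.random.default_rng(1)
white_noise = rng.standard_normal(4096)             # H should land near 0.5
random_walk = np.cumsum(rng.standard_normal(4096))  # persistent: H well above 0.5

h_noise = hurst_rs(white_noise)
h_walk = hurst_rs(random_walk)
```

The point of the exercise: a fractal statistic like this distinguishes memoryless noise from a series with persistent structure, which is precisely the kind of signal a normality-assuming model would miss.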
Now, this section of the original article also speaks of a troubling lack of historical data to train our model. Since trading records can be recovered back to roughly 1900, this leaves us 118 years’ worth of data to train our model, which does not sound too optimistic. Especially if we are looking at a one-year time horizon, that is, trying to predict the dynamics year-on-year.
This is a very fair point, especially if we take into account the fact that models like neural networks are at their best specifically when they are given gigantic datasets to munch on. However, we would argue that it’s not all doom and gloom on this front.
While acknowledging the legitimacy of these concerns, we must also point out that when training a time series prediction algorithm, the main question about the data is not whether it is big enough to satisfy the hungriest of neural networks. The main question is whether there is enough data for the algorithm to pick up the seasonal patterns within it. This has to be assessed separately for every dataset.
However, for the argument’s sake, we can point to the conventional wisdom that a dataset covering 2 to 6 iterations of the season our prediction time horizon belongs to will do as the bare minimum, depending on the exact model used. In other words, if we want to predict something week-by-week, we would require a dataset covering 2 to 6 months. For yearly predictions, it would thus make sense to cover 2 to 6 decades. Furthermore, if we switch to a smaller time unit, the number of observations in our 118 years of data multiplies dramatically, making for plenty of data for the algorithm to munch on.
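The back-of-the-envelope arithmetic is simple. Assuming the usual calendar conventions of 12 months and roughly 252 trading days per year (our assumption, not a figure from the original article), the same 118 years translate into far more observations at finer granularities:

```python
# Rough observation counts for 118 years of market history at different
# sampling granularities, assuming 12 months and ~252 trading days per year.
YEARS = 118
MONTHS_PER_YEAR = 12
TRADING_DAYS_PER_YEAR = 252

yearly_obs = YEARS                          # 118 data points
monthly_obs = YEARS * MONTHS_PER_YEAR       # 1,416 data points
daily_obs = YEARS * TRADING_DAYS_PER_YEAR   # 29,736 data points
```

Nearly thirty thousand daily observations is a very different training regime from 118 yearly ones, even before intraday data enters the picture.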
State Of Singularity, Wall Street Edition
Wrapping our argument up, and commending the points made in the original article on deep learning and stock price slippage, we can go for an interesting thought experiment and question the very foundation of the issue at hand. That is, do we really want an algorithm, no matter how advanced, to do our finances, or is there a case to be made for human decision-making bolstered by AI capabilities?
This question takes aim at the very foundation of human cognition versus that of an AI. We the humans, as the discipline of knowledge management argues, start with data to derive information from it. Data is basically the rawest of the raw in this scheme, while information is about the patterns and correlations in this data. From there, we go to knowledge, which is, strictly speaking, our understanding of the information, reinforced over multiple iterations and significantly more thorough and nuanced. This is pretty much where we step beyond correlation and establish causality. From there, we go to wisdom — our ability to leverage this knowledge to deliver the outcomes we want.
Now, it is also possible to reverse the order and argue that our knowledge of the world determines the information we want to extract from it, which, in its turn, guides us in what data we seek to collect. This, again, makes for a nice thought experiment but drives us away from the actual point: that for computers, it all pretty much stops at the information. In their current shape, AIs process input, however intricate and humongous, establish statistically meaningful patterns in it, and generate output based on those. They do not really establish causality; the best they can do on this front is approximate meaning and logic through some very complicated maths. But these approximations still fall short of what human cognition is capable of.
There is also the other side of the coin: no human cognition can crunch numbers at the speed of an AI. Consider Google DeepMind’s experiment with the ancient game of Go, which saw the deep learning-based AlphaGo come up with moves that surprised a legendary human player. And that is in one of the best-studied board games in history! The lesson here is that self-learning AIs can sometimes notice things that would have gone unnoticed by humans.
This, ultimately, makes us believe that the future lies not in AIs doing all the trading and investing, but in tapping into the best of both worlds: human portfolio managers making their decisions with the assistance of advanced AI-based tools.
And thus, to truly beat the market, AIs will have to work hand in hand with humans.