Numerai: Another Nail in the Efficient Market Hypothesis Coffin

Shiller First to Say “Emperor Has No Clothes”

In his 1984 paper, “Stock Prices and Social Dynamics,” Professor Robert Shiller pointed out a startlingly obvious error in the Efficient Market Hypothesis. Simplifying greatly, the EMH says “stock prices always reflect their actual intrinsic worth,” but all the studies that purported to test the EMH instead tested the question “are stock prices hard to predict?” Professor Shiller earned the Nobel Prize in Economics largely for saying “hold on, those are two fundamentally different things.” Professor Shiller didn’t pull any punches in his attack on the EMH:

…claims that because real returns are nearly unforecastable, the real price of stocks is close to the intrinsic value... This argument for the efficient markets hypothesis represents one of the most remarkable errors in the history of economic thought. It is remarkable in the immediacy of its logical error and in the sweep and implications of its conclusion.

Just because stock prices are hard for humans to predict doesn’t mean they are correct!

Recurring Mispricings

Once Professor Shiller pointed out this obvious error at the core of the EMH dogma, even the “hard to predict” part came under attack. Many studies have gone on to show there are recurring instances of stock price movements being predictable on a statistically significant basis. Of course, stock markets are adaptive systems, with significant incentives in place to drive out the largest and most reliably predictable price movements. So, once a recurring predictability is uncovered, it tends to go away.

It seems to me there are a few big observations/trends about these predictable price movements:

  1. With some work, it is possible to find statistically significant mispricings that recur when some set of circumstances prevails. The problem is that a model that makes a good prediction, say, 60% of the time, but can be used in only one month out of ten, is very difficult to allocate capital to, unless one has access to a large collection of independent models, at least some of which work during different time periods.
  2. The advances in AI/machine learning data analysis tools are making it dramatically easier to uncover many classes of statistically significant mispricings, if you can get access to the relevant data quickly and cheaply.
  3. If you uncover a recurring pricing error, you don’t want to let anyone know what it is! You don’t want to upload your model to any third party site. You may not even want anyone to know who you are.
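
To put rough numbers on the first observation, here is a minimal sketch of the arithmetic. It assumes each model is usable in any given month with probability 0.1 (the "one month out of ten" figure above) and that the models' usable months are fully independent of one another, which is an idealization. The function name `coverage` and its parameters are hypothetical, chosen only for illustration.

```python
def coverage(n_models, p_active=0.1):
    """Probability that at least one of n_models is usable in a given month.

    Assumption (illustrative only): each model is usable with probability
    p_active, independently of every other model.
    """
    # P(at least one usable) = 1 - P(none usable)
    return 1.0 - (1.0 - p_active) ** n_models


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(f"{n:>3} models: usable in about {coverage(n):.0%} of months")
```

Under these assumptions a single model leaves capital idle about nine months out of ten, while a large collection of independent models has something to trade in most months. Real models are rarely fully independent, so actual coverage would be lower.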

Enter Numerai

I had those opinions in my mind the first time I met Richard Craib, the CEO of Numerai. What I saw in the incredibly powerful Numerai business model was a way to anonymously empower a large number of data scientists to create a large number of independently identified stock mispricings that capitalized on the increasing power and ease of use of AI/machine learning tools. This large number of independently generated predictions could be tested for predictive power, combined in such a way as to reduce variance, and have capital allocated to them accordingly.
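
The variance-reduction idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: each model's prediction error is modeled as independent Gaussian noise with the same standard deviation, and predictions are combined by a plain equal-weight average. It is not a description of how Numerai actually combines submissions, and the function name `ensemble_error_variance` is hypothetical.

```python
import random
import statistics


def ensemble_error_variance(n_models, n_trials=5000, noise_sd=1.0, seed=0):
    """Variance of the error of an equal-weight average of n_models predictions.

    Assumption (illustrative only): every model's error is independent
    Gaussian noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    averaged_errors = []
    for _ in range(n_trials):
        model_errors = [rng.gauss(0.0, noise_sd) for _ in range(n_models)]
        averaged_errors.append(sum(model_errors) / n_models)
    return statistics.pvariance(averaged_errors)


single = ensemble_error_variance(1)
combined = ensemble_error_variance(10)

if __name__ == "__main__":
    print(f"error variance, 1 model:   {single:.3f}")
    print(f"error variance, 10 models: {combined:.3f}")
```

With independent errors, averaging N predictions cuts the error variance to roughly 1/N of a single model's. Correlated errors shrink that benefit, which is one reason independently built models are valuable.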

I participated in a Numerai stock prediction tournament in February of 2016. The whole process took me only 90 minutes: downloading the training data, building a very simple model, and uploading the prediction data set. I didn’t have to share any personally identifying information, nor did I have to share any details of how my model made its predictions. I actually uploaded two separate prediction data sets:

  1. A random set of predictions. I literally used a random number generator for this prediction set. I was happy to see that 92% of the existing submissions beat my random submission. This told me that most of the predictions had at least some predictive power.
  2. A prediction set based on my very simple model. This model was beaten by 87% of the existing tournament submissions. This told me the overwhelming majority of the submissions were the result of thoughtful predictive models. I had to fight the urge to implement the long list of potential model improvements that would have allowed me to climb up the tournament ranking list.

Based on this demonstrated fit with a huge latent need, the strength of the founding team, and the natural attractiveness of the Numerai business model, I asked Richard if I could participate as an investor in Numerai itself, which he thankfully agreed to despite it being an oversubscribed round.

As of today, thousands of data scientists have made over 20 billion stock market predictions through their participation in Numerai tournaments. Some of the best performers have indeed chosen to remain anonymous and take their payments in Bitcoin. Also, Numerai has shifted tournament scoring to measure the degree to which each submission uniquely improves the Numerai meta model, in order to reward the creation of additional predictive power at the aggregate level.