An inside look at the backtests at Numerai, and a conversation with Marcos López de Prado, Numerai’s new scientific advisor. Jump to the video presentation on YouTube.
Numerai is a hedge fund built by a network of data scientists around the world. We give away hedge fund quality data and allow anyone to model it using machine learning, and we combine the best models to create the Meta Model which manages the money in our hedge fund. But why is this a good idea and is it working?
The first assumption behind Numerai’s approach is that machine learning is better than standard statistical approaches for modeling financial data. It wouldn’t make sense to sign up thousands of data scientists around the world if building a hedge fund was just a matter of collecting data and running linear regressions.
It may seem obvious that modern machine learning models would be better than classical models, but even though machine learning algorithms dominate on almost every data modeling problem, their performance on financial data is still disputed by many academics and industry professionals.
Of course, it’s impossible to prove that machine learning models perform better on financial data in every specific instance but it is possible (and easy) to show that machine learning works a lot better than linear regression on the financial data that Numerai gives away.*
But showing that machine learning is better than linear regression still doesn’t prove that Numerai is a good idea; it just says that it might be. It’s possible that the data scientists signing up to Numerai aren’t good enough to outperform basic machine learning models. After all, we could easily average a collection of Python scikit-learn models ourselves, without the help of our community of data scientists.
We need to show that machine learning beats linear regression on financial data (showing that there exist modern methods that improve on the status quo), and that the Meta Model beats machine learning (showing that Numerai’s approach to collecting models from data scientists around the world is better than other hedge funds’ siloed approach).
In other words, we need the Meta Model Supremacy Inequality to hold:
meta model > machine learning model > linear model
Backtesting The Inequality
With the data on Numerai, we ran a linear regression model and an XGBoost model. XGBoost is a machine learning model that builds a series of decision trees and ensembles them. Because it uses decision trees, it is able to learn non-linear structures in the data unlike the linear regression and tends to have strong performance on a variety of problems.
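To make the non-linearity point concrete, here is a minimal sketch on synthetic data. It is not the backtest described in this post: the data is randomly generated, the V-shaped target is an assumption chosen so that a straight line cannot fit it, and scikit-learn's `GradientBoostingRegressor` stands in for XGBoost so that the example needs only NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

def make_data(n_rows):
    # Synthetic features in [0, 1]; these are NOT Numerai features.
    X = rng.uniform(0.0, 1.0, size=(n_rows, 5))
    # Assumed target: V-shaped in the first feature, so it rises on both
    # sides of 0.5 and has no overall linear trend, plus heavy noise.
    signal = 4.0 * np.abs(X[:, 0] - 0.5)
    y = signal + rng.normal(0.0, 1.0, size=n_rows)
    return X, y

X_train, y_train = make_data(4000)
X_test, y_test = make_data(4000)   # held out, never seen during fitting

def out_of_sample_corr(model):
    # Fit on the training rows, then score by correlation on held-out rows.
    model.fit(X_train, y_train)
    return float(np.corrcoef(model.predict(X_test), y_test)[0, 1])

linear_corr = out_of_sample_corr(LinearRegression())
boosted_corr = out_of_sample_corr(
    GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
)
print(f"linear: {linear_corr:.3f}  boosted trees: {boosted_corr:.3f}")
```

On a target like this the linear model's held-out correlation should hover near zero while the tree ensemble picks up most of the available signal. Real financial data is far noisier, so the gap there will be smaller and harder to measure.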
To build the Numerai Meta Model, we first select the 200 user-built models that are staked with our cryptocurrency, NMR, because data scientists who choose to risk money on their models by staking tend to have better performance (skin in the game helps). We then filter those further, keeping the top 100 that work well together (for example, we drop models that are very correlated with each other, because having lots of uncorrelated models produces a better ensemble).
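The idea of dropping highly correlated models before averaging can be sketched as follows. This is only an illustration of the general technique, not the actual Meta Model procedure: the greedy rule, the `max_corr` threshold, the function name `select_diverse_models`, and the synthetic "user models" are all assumptions made for the example.

```python
import numpy as np

def select_diverse_models(predictions, scores, max_models, max_corr):
    """Greedy filter: visit models from best to worst score and keep one
    only if it is not too correlated with any model already kept."""
    corr = np.corrcoef(predictions)      # each row is one model's predictions
    order = np.argsort(scores)[::-1]     # best score first
    kept = []
    for i in order:
        if all(abs(corr[i, j]) <= max_corr for j in kept):
            kept.append(int(i))
        if len(kept) == max_models:
            break
    return kept

rng = np.random.default_rng(1)
n_rows = 2000
truth = rng.normal(size=n_rows)

# Six hypothetical user models: each sees the truth through its own noise.
base = [0.3 * truth + rng.normal(size=n_rows) for _ in range(6)]
# Two near-duplicates of model 0 (indices 6 and 7), which add little diversity.
clones = [base[0] + 0.05 * rng.normal(size=n_rows) for _ in range(2)]
predictions = np.vstack(base + clones)

# Stand-in for a validation score; here simply correlation with the truth.
scores = np.array([np.corrcoef(p, truth)[0, 1] for p in predictions])

kept = select_diverse_models(predictions, scores, max_models=5, max_corr=0.9)
ensemble = predictions[kept].mean(axis=0)
ensemble_score = float(np.corrcoef(ensemble, truth)[0, 1])
best_single = float(scores.max())
print(f"kept models: {kept}")
print(f"best single model: {best_single:.3f}  ensemble: {ensemble_score:.3f}")
```

Because the kept models make mostly independent errors, averaging them cancels noise and the ensemble scores better than any single member. Averaging near-duplicates would not give that benefit, which is why they are filtered out first.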
We then backtested** all the models on totally blind out of sample data and here’s what that looks like.
Looks good. Looks like the lines are in the right place and the inequality holds. The linear model gets a Sharpe of 0.13 (I guess the market is pretty efficient to this technique…), the machine learning model gets a Sharpe of 0.91, and the meta model gets a Sharpe of 1.55. Based on this test, the Numerai Meta Model reigns supreme.
It looks that way. But it’s not magic. If you look closely, from February 2014 to October 2016 the Meta Model and basic machine learning model have made almost the same amount of money. That’s a long period for the Numerai users to not be beating a basic model in terms of return. If you take a look at the end of the backtest, you can see all the models have a drawdown of similar magnitude. This was a particularly bad period for most quant funds; actually the worst period in 8 years. Many major quant funds lost big. And so would have Numerai’s Meta Model according to this backtest. And that’s not good because investors are interested in uncorrelated returns and when they talk to Numerai they’re expecting something special to come out of all the AI & crowdsourcing & blockchain we’re throwing at them.
So it was a huge milestone to reach Meta Model supremacy with these very big Sharpe differences but it gave us a hedge fund that was basically a decent above average hedge fund that was quite correlated with other hedge funds. Nobody who works at Numerai has ever been ‘decent’ or ‘above average’ or ‘similar to peers’ in anything they’ve ever done so this wasn’t exactly a celebration moment, and we didn’t want to just sit on our hands and wait for another 2014 ‘good time for quant hedge funds’ kind of year.
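For readers less familiar with the Sharpe ratios quoted in this section, here is a minimal sketch of one common way to compute an annualized Sharpe ratio from periodic returns. The post does not say exactly how its figures were calculated, so the formula choice (sample standard deviation, square-root-of-time annualization, zero risk-free rate by default) is an assumption, and the monthly returns below are made up.

```python
import math
import statistics

def annualized_sharpe(period_returns, periods_per_year, risk_free_per_period=0.0):
    """Mean excess return divided by its sample standard deviation,
    scaled to a yearly figure by sqrt(periods_per_year)."""
    excess = [r - risk_free_per_period for r in period_returns]
    mean = statistics.fmean(excess)
    stdev = statistics.stdev(excess)
    return mean / stdev * math.sqrt(periods_per_year)

# Made-up monthly returns, purely for illustration.
monthly_returns = [0.020, -0.010, 0.015, 0.005, -0.005, 0.010, 0.000, 0.012]

full = annualized_sharpe(monthly_returns, periods_per_year=12)
# Halving leverage scales every return by 0.5 (ignoring costs and financing).
half_leverage = annualized_sharpe([0.5 * r for r in monthly_returns], periods_per_year=12)
print(f"Sharpe: {full:.2f}, at half leverage: {half_leverage:.2f}")
```

With a zero risk-free rate and no costs, scaling leverage changes return and volatility by the same factor, so the two Sharpe values match. That is one reason Sharpe is a more convenient yardstick than raw return when comparing the three models.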
Extending The Edge With The Master of Robots
I’ve been following Marcos López de Prado since he released his slides, 7 Reasons Most Machine Learning Funds Fail, and everyone I know emailed them to me. It turns out, Marcos had been following Numerai and was a big fan of our approach.
Marcos has two PhDs. He founded and led Guggenheim Partners’ Quantitative Investment Strategies fund where he managed $13 billion. He was the first head of machine learning at AQR, the largest quant hedge fund in the world. He wrote the book on financial machine learning. And this year, he received the ‘Quant of the Year Award’ from The Journal of Portfolio Management. Bloomberg calls him The Master Of Robots.
He came to visit us at Numerai in San Francisco a few months ago and spent a week looking at what we were doing, and telling us what we were doing wrong. Marcos really liked our approach and he ended up writing a whole paper on it with Frank Fabozzi. But the details of the data science problem being posed on Numerai needed improvements.
We always understood that more data would help the users build better models; that’s like data science 101. But just giving more data wasn’t enough. We needed to frame the problem we present in a better way to make the contributions from each user more ‘monetizable’. For example, if a Numerai data scientist puts effort into building a model that is exposed to momentum features but we then remove all momentum exposure in our portfolio optimization process, then that data scientist’s modeling efforts are wasted (i.e. not monetizable). Essentially, we had this data science problem where we weren’t asking the right question. And since Numerai is in charge of the question being asked, our data scientists couldn’t fix that themselves; we had to. (Remember, all Numerai data is obfuscated, just a huge grid of numbers between 0 and 1, so only Numerai has control of the setup of the problem: the inputs and targets.)
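The momentum example can be illustrated with a small sketch. The post does not describe how exposure is removed during portfolio optimization, so this uses plain linear neutralization (regress the predictions on the factor and keep the residual) as an assumed stand-in. The factor values, the two hypothetical models, and the function name `neutralize` are all invented for the illustration.

```python
import numpy as np

def neutralize(signal, factor):
    """Remove the part of `signal` that a linear fit on `factor` explains."""
    design = np.column_stack([np.ones_like(factor), factor])
    coef, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return signal - design @ coef

rng = np.random.default_rng(2)
n_stocks = 5000
momentum = rng.normal(size=n_stocks)      # hypothetical momentum factor values
other_info = rng.normal(size=n_stocks)    # information unrelated to momentum

# One model that is almost entirely a momentum bet, and one that is not.
models = {
    "momentum-heavy": 0.9 * momentum + 0.1 * rng.normal(size=n_stocks),
    "diverse": 0.2 * momentum + 0.8 * other_info,
}

survival = {}
for name, preds in models.items():
    residual = neutralize(preds, momentum)
    # Share of the model's prediction variance left after neutralization.
    survival[name] = float(residual.var() / preds.var())
    print(f"{name}: {survival[name]:.1%} of prediction variance survives")
```

Almost nothing of the momentum-heavy model survives, while the diverse model keeps most of its signal. If the target rewards the first kind of model anyway, the effort spent building it never reaches the portfolio.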
There is a number we use to measure “monetization” and we basically managed to double that number by creating a much better target. And then we did the data science 101 thing and increased the number of features from about 40 to about 300. And we got to this place a couple months ago where we had way better data and way better targets so we released it to our data science community and deleted all the old stuff.
And then the Numerai Meta Model backtest jumped to a Sharpe of 2.09.
It’s way better, and we like it now. It now does quite well in the 2018 drawdown, actually earning some money. But it’s still not magic. It sometimes has larger drawdowns than our previous Meta Model. It sometimes spends months being flat. But it’s now a much better hedge fund and when people analyze the backtest, it looks uncorrelated to other funds and unique and special. And way above average.
Getting all the technology and community in place to achieve Meta Model supremacy took years but adding the new data took months, and hundreds of brilliant data scientists modeling the new data took days. And that’s what’s good about Meta Model supremacy; it’s like once you have it the edge stays with you when you add new data and it’s something other hedge funds don’t have that you do. Meta Model supremacy puts us in a place where all we have to do is pour more data into the Numerai system and pose good problems and we will always be able to produce the best possible model on any given dataset.
I’m sure all hedge funds will upgrade to machine learning eventually but they can’t upgrade to Numerai’s Meta Model. So that’s a nice defensible edge I hope we can keep and it’s a defensible edge in an industry without many defensible edges tbh.
Watch The Video
I presented some of these ideas and much more at Numerai’s first ever conference a few weeks ago. Here’s the video, including my fireside chat with Marcos López de Prado and an update on Numerai Compute from Anson Chu, our VP of Engineering. (Also learn about a Numerai data scientist who’s also a NASA rocket scientist working on taking us to Jupiter’s moon.)
*You can prove to yourself in minutes that linear models don’t do well on Numerai’s data by downloading Numerai’s data for free and running the example Python script.
**These backtests are for illustrative purposes only. They are all simulations and are not representative of Numerai’s live trading or even the real backtests we share with our investors. You can trust that these backtests are out of sample and that the same optimization parameters are being applied to each model, but don’t trust the numbers, because there are hundreds of parameters involved in any backtest (cost assumptions, trading assumptions, risk levels, leverage assumptions, backtest period, stock universe choices) that are not worth going into here. Note that small alterations to these parameters can have very large effects: e.g. halving leverage could halve return, changing cost assumptions could double Sharpe, etc.