Noah Harasz
Published in Numerai
6 min read · Apr 1, 2021

If you’ve heard of Numerai and read any of our documentation, you’ve likely heard of Numerai Compute and its associated tool, numerai-cli (if you haven’t heard of us, check us out). This was great if you wanted to create some resources in AWS, deploy your models to the cloud, and allow Numerai to trigger your models automatically, leaving you to sip your beverage of choice, put your feet up on your desk, and watch the daily scores roll in.

Currently, it’s supported anywhere Python 3 and Docker are available, including MacOS/OSX, Windows 10 (and 8 with some extra work), and Ubuntu 18+. We recently released the 0.2.* version series, which provided some setup scripts to increase accessibility, allowing you to run just 5 commands to start automatically submitting from the cloud. Creating autonomous Numerai models has never been easier.

What’s new?

Despite this, our CLI (and the Compute Cluster you set up with it) has been feature-poor: you could only deploy 1 code base to 1 cloud provider, and the example Python code was mom’s spaghetti [code]. If a user wanted to parallelize their models, they would have to figure it out. If a user wanted to debug their deployed prediction node, they would have to figure it out. If a user wanted to modify the example Python code, they would have to figure it out. These issues made Compute inaccessible to the majority of current and new users.

So what are we doing about it? In Numerai CLI 0.3.0, the example Python code has been streamlined from 3 examples totaling over 500 lines of code, to 1 example under 100 lines. Our Cloud Provisioning code has been modularized to a plug-and-play design so we can continue adding new cloud providers like GCP. Finally, our CLI has been restructured to be centered on Prediction Nodes (more on those later); this means you can run your models however you want, deploy them wherever you want, and submit to any tournament you want on a per-model basis. Compute customization has never been easier.

The One True Python Example

The new example is streamlined, easier to read, and no longer tied to complex, deprecated packages. We’ve taken steps toward easing the learning curve by putting every basic concept you need in one file. This includes introducing the idea of multiple models in the first few lines, as well as simple, atomic functions for retrieving data, fitting, predicting, and submitting, all in fewer than 100 lines of code. We’ve also included a new example model for Numerai Signals, which introduces the concepts of pulling in your own financial data and generating your own features. It’s certainly more complicated than the tournament example, clocking in at just under 400 lines in total, but each component is still logically separated into its own function so that a novice data scientist can read it and understand how to analyze market data.
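To make that shape concrete, here is a minimal sketch of a script organized the same way: models declared in the first few lines, then one small function each for retrieving data, fitting, predicting, and submitting. This is not the example that ships with the CLI. Every name here (`MODEL_NAMES`, `retrieve_data`, `fit`, `predict`, `submit`) is hypothetical, the data is synthetic, the model is a one-feature least-squares line, and `submit` only records predictions in a dictionary rather than uploading anything.

```python
"""Illustrative layout only; all names are hypothetical, not the CLI example."""
import random
from typing import Dict, List, Tuple

Row = Tuple[List[float], float]  # (features, target)

# Models are declared up front so that adding one is a one-line change.
MODEL_NAMES = ["model_a", "model_b"]


def retrieve_data(n_rows: int = 200, n_features: int = 3, seed: int = 0) -> List[Row]:
    """Stand-in for downloading tournament data: builds synthetic rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        features = [rng.random() for _ in range(n_features)]
        rows.append((features, sum(features) / n_features))
    return rows


def fit(rows: List[Row], feature_index: int) -> Tuple[float, float]:
    """Fit a least-squares line on one feature; returns (slope, intercept)."""
    xs = [features[feature_index] for features, _ in rows]
    ys = [target for _, target in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x if var_x else 0.0
    return slope, mean_y - slope * mean_x


def predict(params: Tuple[float, float], rows: List[Row], feature_index: int) -> List[float]:
    """Apply the fitted line and clip predictions to the range [0, 1]."""
    slope, intercept = params
    return [
        min(1.0, max(0.0, slope * features[feature_index] + intercept))
        for features, _ in rows
    ]


def submit(model_name: str, predictions: List[float], outbox: Dict[str, List[float]]) -> None:
    """Stand-in for an upload: stores predictions instead of calling an API."""
    outbox[model_name] = predictions


def main() -> Dict[str, List[float]]:
    outbox: Dict[str, List[float]] = {}
    train = retrieve_data(seed=0)
    live = retrieve_data(n_rows=50, seed=1)
    # Each model gets its own fit/predict/submit pass (here: one feature each).
    for index, name in enumerate(MODEL_NAMES):
        params = fit(train, feature_index=index)
        submit(name, predict(params, live, feature_index=index), outbox)
    return outbox


if __name__ == "__main__":
    for name, preds in main().items():
        print(name, len(preds))
```

In a layout like this, improving a model means changing `fit` and `predict` while the retrieval and submission steps stay untouched.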

In the future, these examples will serve as the stepping stones for both veteran data scientists and newbies alike by encouraging everyone to simply change a few lines to improve their models and continue making simple changes until they’ve become a Numerai Master.

A useful and simple first example is vital to our Master Plan. We can’t monopolize intelligence, data, or the market unless we diversify our users, bring in new ideas, and make our strategies better than the current hedge fund overlords. Being a beginner has never been easier.

Prediction Nodes Everywhere

We imagine a future in which Numerai can ping any AI on any infrastructure for any prediction at any time. This isn’t feasible unless we architect the tools to build this future; Rome wasn’t built in a day and it wasn’t built with a stone hammer. Numerai CLI 0.3.0 aims to give users a better way to build and deploy any model by re-architecting around the principle of plug-and-play modules.

Figure: Prediction Node interactions in the Numerai network

With this design, we introduce the idea of a Prediction Node: a generic building block for submitting predictions. Prediction Nodes are agnostic to cloud providers, tournaments, data sources, and languages. At its base, a Prediction Node is a Docker image deployed to the cloud, with an auto-scaling cluster of computers that will run a container when triggered by Numerai. You want a Python XGBoost model submitting to Numerai Signals from AWS? No problem; a few small changes to our example code and you’re up and running. You want to use a neural network created in C++? Of course, just add a Dockerfile and automate it using any of the supported cloud providers with just a few commands.

With a Prediction Node configured, Numerai can invoke the predictive power of your model whenever it needs foresight into the market, paying or burning you for your performance. This concept is crucial to the Master Plan, because reliable submissions are vital to reliable performance. The CLI is the tool our data scientists use to build a reliable network of models, thereby fortifying the emergent meta model supremacy that will control the market. Forging the path to this future has never been easier.

Helping You Help Yourself

Along with updated examples and the introduction of Prediction Nodes, the 0.3.0 update provides basic tools for users to test, diagnose, and debug their local and deployed environments. The first of these tools is a full end-to-end testing system for deployed Prediction Nodes, which checks that a node can be triggered by Numerai, that it runs successfully, and that it submits correctly. Another tool is the new “numerai doctor”, which verifies that certain parts of the local environment are set up correctly; in the future, this tool will be the go-to self-help command that will diagnose, fix, or bug report issues with your environment. Finally, we completed an alpha test of this version before launch, which helped us update documentation, error handling, and overall UX so that errors and bugs are handled more gracefully and reported correctly. Detecting the reliability of your model has never been easier.
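The three stages of that end-to-end test (trigger, run, submit) depend on one another, so a natural way to picture it is a chain of checks that stops at the first failure. The sketch below only illustrates that idea and is not how the CLI implements its test. The check functions are hypothetical stubs that return fixed values, where a real tool would query the cloud provider and Numerai.

```python
from typing import Callable, List, Tuple

Check = Tuple[str, Callable[[], bool]]


def run_checks(checks: List[Check]) -> List[Tuple[str, bool]]:
    """Run checks in order and stop at the first failure.

    Later stages depend on earlier ones (a node that was never triggered
    cannot have run), so continuing past a failure would add only noise.
    """
    results = []
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            # A crashing check is reported as a failure, not a crash.
            ok = False
        results.append((name, ok))
        if not ok:
            break
    return results


# Hypothetical stubs with fixed results, for illustration only.
def can_be_triggered() -> bool:
    return True


def container_ran() -> bool:
    return True


def submission_received() -> bool:
    return False


NODE_CHECKS: List[Check] = [
    ("trigger", can_be_triggered),
    ("run", container_ran),
    ("submit", submission_received),
]

if __name__ == "__main__":
    for name, ok in run_checks(NODE_CHECKS):
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

Reporting the first failing stage by name tells you where to start debugging: the trigger, the container, or the submission step.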

What’s next?

This update breathes new life into Compute and marks the next era in the Numerai Compute experience. We want to make setup more streamlined and reliable, increase configurability, fortify the reliability of your Prediction Nodes, improve submission transparency, and enable collaboration. This sounds like a lot, and it is, but it’s necessary to ensure the development of our Master Plan.

We want to make it easier for you to deploy your models how you want so you can receive feedback, learn, and tweak your models without worrying about submission deadlines. Automating your submissions means you can reliably upload predictions; spend more time tweaking your models and less time making sure your model is stable.

We want you to have more options with how you configure your Prediction Node and where you deploy it. We don’t want to force you to be a cloud engineer, so we want to give you tools that will allow you to deploy your models into other compute providers like GCP or Golem. Furthermore, we want you to be able to harness vital resources for your machine learning and analytics like cloud GPUs.

We want you to collaborate and learn with our amazing community of data scientists. Adding tools to enable this collaboration will allow you to make better models. We need more amazing models; specifically, we need more models with incredible predictive power submitting reliably, a lot more.

Whether you’re a Numerai Veteran or you want to jump in for the first time, automate your submissions today with Numerai Compute 0.3.0 and provide your feedback on RocketChat because Numerai Compute makes things easy.

For a more technical breakdown of the CLI 0.3.0 update, visit the Numerai Forum and see our post.
