What is the deal with Microprediction?

Published in

Databutton

6 min read · Nov 25, 2022

Have you ever heard of microprediction.org? On the surface, it is an online crowdsourcing platform for time-series predictions where you can either publish live data and get predictions, or compete with your own models. Kind of like good old Kaggle, but with live data and instant usefulness. However, beneath the surface, microprediction.org is an eccentric and mysterious place that sparks questions and reflection bordering on the philosophical. Over the past few days, I have been digging into microprediction, reading the material, and trying to apply it to a case I care about (I will write more on the case in an upcoming post).

Is this a service or a philosophy?

It was the great Thomas Thoresen who made me aware of microprediction.org. He is currently using Databutton to feed Norwegian electricity prices into the microprediction service (a very cool case btw, as the current energy crunch has resulted in insane volatility; taking a shower at the right time can save you like $5). Since I have definitely spent my share of time working with time-series, the mere existence of something original and new in that world immediately sparks massive curiosity.

When opening microprediction.org (or the guide at microprediction.com) there is no shortage of content. Here you can find plenty of guides, opinion pieces, explanations and so on, but the more I read, the more puzzling the service becomes. After a while, the basic questions, such as how to use the service, get answered while new, deeper questions keep piling up. When faced with a mystery, some people get frustrated and lose energy, whereas others gain an obsessive energy where they just have to get to the bottom of it, and I am definitely in the latter camp.

Now, after days of pondering, I am starting to think that the mystery is caused not so much by the actual service provided by microprediction.org, but by the sheer ambitions, philosophy, and vision of the creator himself, Dr. Peter Cotton.

The market is always right

I could be wrong, but to me it seems that microprediction.org is really an attempt at creating a marketplace, like NASDAQ for stocks or Betfair for sports betting. A place where publishers list “streams” while investors “trade” in predictions. The underlying belief, of course, is that across many investors the market will balance towards the right predictions.

That is the exact same thing Databutton’s avid sports bettor and CTO Martin is telling me: “The odds on Betfair are as close to the truth as you will ever get; if you back-test past odds you will see that they are overwhelmingly accurate”.

I find the idea of creating a market for predicting time-series super-interesting. However, if you want people to spend considerable time creating models that get the right predictions, they need an incentive. Thus, in practice, if microprediction is to be a viable option for your time-series problems, money will need to enter in some shape or form. Today, microprediction.org does have some streams where they reward the daily best-performing algorithm (https://www.microprediction.com/competitions). But I haven’t been able to find a way to sponsor my own streams.

A short introduction to the practicalities

I am going to assume that you have never heard about microprediction and explain how it works. First, to do anything, you need the microprediction pip package and a “write key”.

It takes a long time for the key to be generated (like really long), so be prepared to wait. The key concept is a story of its own and you can read more about it here: https://muid.readthedocs.io/en/latest/. I guess the keys exist to keep people from spamming the service with streams.
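To get a feel for why generating a key could take so long, here is a toy sketch. It assumes that a valid key is found by brute-force search: random candidates are hashed until the hash satisfies some rule. The rule used here (the SHA-256 hex digest must start with a given prefix) and the function name `mine_key` are purely illustrative assumptions; this is not the actual MUID scheme.

```python
import hashlib
import secrets


def mine_key(prefix: str, max_attempts: int = 1_000_000):
    """Draw random candidate keys until one hashes to a digest starting with `prefix`.

    Toy illustration only: the hash rule is an assumption, not the MUID rule.
    Returns (key, attempts), or (None, max_attempts) if nothing was found.
    """
    for attempt in range(1, max_attempts + 1):
        key = secrets.token_hex(16)  # random 32-character hex candidate
        digest = hashlib.sha256(key.encode()).hexdigest()
        if digest.startswith(prefix):
            return key, attempt
    return None, max_attempts


if __name__ == "__main__":
    # Every extra hex character in the prefix multiplies the expected work by 16.
    for prefix in ("0", "00", "000"):
        key, attempts = mine_key(prefix)
        print(f"prefix {prefix!r}: found after {attempts} attempts")
```

Under this kind of scheme the waiting time grows exponentially with the required difficulty, which would also make it expensive to mass-produce keys.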

Creating a stream

It is super-easy to create a new stream and get predictions. For instance, to write to a stream every second, you can do something along these lines:
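A minimal sketch of such a loop, with the microprediction-specific call kept out of the way as an injected `publish` function. The helper name `publish_values` and the stream name `my_stream.json` are hypothetical, and the commented-out wiring to the package's writer is an assumption rather than verified API documentation.

```python
import random
import time
from typing import Callable, List


def publish_values(
    publish: Callable[[str, float], None],
    name: str,
    n_values: int,
    interval_seconds: float = 1.0,
    source: Callable[[], float] = random.random,
) -> List[float]:
    """Send `n_values` readings to the stream `name`, one every `interval_seconds`.

    Note that no timestamp is passed along: each value is implicitly "now".
    """
    sent = []
    for _ in range(n_values):
        value = float(source())
        publish(name, value)
        sent.append(value)
        time.sleep(interval_seconds)
    return sent


# Assumption: with the real package, `publish` would wrap the writer, e.g.
#   from microprediction import MicroWriter
#   mw = MicroWriter(write_key="YOUR_WRITE_KEY")
#   publish = lambda name, value: mw.set(name=name, value=value)
# Dry run that only prints:
#   publish_values(lambda n, v: print(n, v), "my_stream.json", n_values=3)
```

Injecting `publish` makes it possible to dry-run the loop with a plain `print` before spending a hard-earned write key on it.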

Then, you can head over to microprediction.org, type in your write_key and find your stream listed along with all the other streams. Note that you don’t supply any timestamp, so submitting historical values is impossible.

Getting predictions

When you start posting to a stream, you can retrieve predictions immediately:

This will return a distribution of predicted values from the currently best model (I think) for the given delay, i.e. the time horizon (I think). It is also possible to get the predictions from several models and rank them yourself.
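Since what comes back is a distribution rather than a single number, you typically want to boil it down to a point estimate and an interval. The sketch below assumes the distribution arrives as a plain list of sampled values; that shape, and the helper name `summarize_predictions`, are assumptions for illustration and not the documented return format.

```python
import statistics
from typing import Dict, Sequence


def summarize_predictions(samples: Sequence[float]) -> Dict[str, float]:
    """Reduce a sample-based predictive distribution to a median and a 5%-95% band."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate quantiles")
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(samples, n=20)
    return {
        "median": statistics.median(samples),
        "p05": cuts[0],
        "p95": cuts[-1],
    }


if __name__ == "__main__":
    fake_samples = [float(x) for x in range(1, 101)]  # stand-in for real predictions
    print(summarize_predictions(fake_samples))
```

The width of the 5%-95% band is a quick indication of how confident the crowd is about the next value.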

Contribute with your own predictions

I haven’t tried contributing to the community with my own predictions (which I probably should have). But, to give you an idea of how that is done, the tutorial code on microprediction.org reads:

Here, the “model” is the scenarios list. Clearly, for this to work, you will also need historical values of the stream, which you can get by
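Once the historical values are in hand, one very simple way to turn them into a scenarios list is to resample past changes and add them to the latest value. The sketch below is a naive bootstrap baseline, not the tutorial's model. The function name `bootstrap_scenarios` is hypothetical, the history is assumed to be in chronological order (oldest first), and the default of 225 scenarios is what I believe the platform expects per submission, so treat that number as an assumption too.

```python
import random
from typing import List, Optional, Sequence


def bootstrap_scenarios(
    history: Sequence[float],
    num_scenarios: int = 225,  # assumed submission size
    seed: Optional[int] = None,
) -> List[float]:
    """Build scenarios for the next value by resampling past one-step changes.

    `history` must be in chronological order, oldest value first.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    rng = random.Random(seed)
    # One-step changes observed in the past.
    changes = [later - earlier for earlier, later in zip(history, history[1:])]
    latest = history[-1]
    # Each scenario is the latest value plus one randomly drawn past change.
    return sorted(latest + rng.choice(changes) for _ in range(num_scenarios))


if __name__ == "__main__":
    scenarios = bootstrap_scenarios([1.0, 2.0, 4.0, 7.0], seed=0)
    print(len(scenarios), scenarios[0], scenarios[-1])
```

A baseline like this ignores longer horizons and any structure in the series, but it shows the shape of the task: the submission is a cloud of plausible next values rather than a single forecast.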

Is crowdsourcing the future of time-series prediction?

While models for tabular, image, and text data have had an insane development over the past 10–15 years, the same just isn’t true for time-series. I have certainly tried my share of LSTMs and friends with extremely variable results and have quite frankly grown tired of the whole thing. Of course, the typical time-series cases are also extremely difficult, with noise, non-linearity, non-locality, and limited predictability. But as Peter Cotton also writes, the lack of real data to use for development is probably a major factor. Thus, which innovation will move the field forward in practice is, imo, wide open. The models and papers that do get published seem extremely limited in terms of real-life applicability (which mirrors my experience with them too). It would be great to see them in action on microprediction.org!

However, the reason why exchanges such as Betfair actually reflect reality is that people and models place odds based on many sources of information. For example, if Messi is injured, that will affect how you see Argentina’s chances in an upcoming match. One major weakness of microprediction.org is that it is nearly impossible to supply any information about what the algorithm should predict: is this the Apple stock price or the temperature at Times Square? The only available clue as to what you are predicting is the name of the stream, and there is a limit to how much information a name can convey. I have also been unable to find a way to supply additional information in the form of time-series. However, these things are of course all easily fixable and, though they limit the service’s usefulness now, they do not take anything away from the underlying philosophy and idea.

So, the question remains: will creating a marketplace for competing algorithms be a great alternative to all the effort currently being put into autoML development? I certainly don’t know. It seems clear that a marketplace needs to come with monetary incentives and will certainly not be free of costs, but getting autoML to work will also involve considerable compute costs.

All in all, I think it is an elegant idea that deserves a lot more attention than it gets, and kudos to Dr. Peter Cotton for a very interesting initiative.