How to choose a machine learning API to build predictive apps

A few things to consider to make it easier to integrate Machine Learning in your apps…

Two years ago, Mike Gualtieri of Forrester Research coined the term “predictive applications” and pitched it as the “next big thing in app development”. Today, some people estimate that more than 50% of the apps on a typical smartphone have predictive features. Predictive apps were defined by Gualtieri as “apps that provide the right functionality and content at the right time, for the right person, by continuously learning about them and predicting what they’ll need.” For that, they use Machine Learning (ML) techniques and data.

ML APIs such as the ones provided by Amazon Machine Learning, BigML, Google Prediction API and PredicSis all promise to make it easy for developers to apply ML to data and thus to add predictive features to their apps. While previous libraries and tools were designed for scientists and PhDs, these APIs provide a much needed abstraction layer for developers to integrate ML in real-world apps. Because they don’t have to worry about using algorithms and scaling them on their infrastructure, developers are able to focus on two things:

  • tracking events on the app and collecting usage data
  • querying predictions and integrating them in the app.

Let’s take an example of a predictive feature that a developer may want to add to an app: priority filtering. The data would consist of example messages (the inputs of the ML problem) along with their corresponding classes (the outputs: “important” or “normal”). The example data can be analyzed to create a model of the relationship between inputs and outputs, so that when a new input (a new message) is given, the model can be used to predict its output (its importance). With ML APIs, this would be implemented with two methods whose signatures look like this:

  • model = create_model(dataset)
  • predicted_output = create_prediction(model, new_input)
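
To make these two methods concrete, here is a minimal sketch in Python. It is a toy word-count classifier, not what any of the APIs above runs behind the scenes; the example messages and the scoring rule are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

def create_model(dataset):
    """Toy training step: count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    for message, label in dataset:
        for word in message.lower().split():
            word_counts[label][word] += 1
    return word_counts

def create_prediction(model, new_input):
    """Return the label whose training messages share the most words with new_input."""
    words = new_input.lower().split()
    scores = {label: sum(counts[w] for w in words) for label, counts in model.items()}
    return max(scores, key=scores.get)

# Made-up example data: (message, class) pairs
dataset = [
    ("board meeting moved to monday", "important"),
    ("invoice due monday", "important"),
    ("weekly newsletter and discount offers", "normal"),
    ("discount on shoes this week", "normal"),
]

model = create_model(dataset)
predicted_output = create_prediction(model, "meeting about the invoice")
print(predicted_output)
```

A real predictive API would hide a much more capable learning algorithm behind the same two calls, which is exactly the point of the abstraction.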

A word about terminology: because the focus is not so much on the ML techniques as on making predictions and creating predictive apps, I think it’s actually better to refer to these APIs as “predictive APIs”. Besides, it is common to say that we train a model, or that the machine learns a model, instead of saying that we create it; we also say that we run the model to make predictions.

The exact names and signatures of the two methods above vary from one API to the other, but besides that, it’s not obvious how these APIs differ from one another and how to choose the right one based on your app’s needs. You’ll find below a list of a few things to consider…

Choosing the right type of model

Predictive APIs make it easier to use Machine Learning, but as a consequence they also give limited control over the predictive models that you can use, so it’s important to check this aspect of their offerings.

Descriptive models

For certain applications, you just want to get predictions and you don’t need to know how they were generated — you just want them to be accurate. This might be the case for user-item recommendations, for dynamic pricing (as seen on Amazon), or for spam detection. But for other applications, it’s essential to explain why the prediction was made, and for that you need what’s called a “descriptive model”.

In Priority Inbox for instance, Google thought this was an important feature: you can see an explanation of the prediction when hovering over the importance indicator (something like “this message marked important because…”). When predicting churn, it’s best if your support team knows not only which customers are at risk of cancelling their subscription but also why, so that they have more information for taking action and convincing each customer to stay. Another interesting application where explaining predictions is a key requirement is job recommendations by employment agencies: people need to know why a job was recommended to them.

Model export: make predictions anywhere

From the questions I’ve been getting since I started covering the space of predictive APIs, model export seems to be a feature in high demand. It does feel safer to know that you own your models completely. When that’s the case, you can also run them anywhere: in your app, server-side, or client-side (even offline!).
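
As a rough sketch of what running an exported model “anywhere” could look like, the Python code below serializes a tiny decision tree to JSON and evaluates it locally without any API call. The tree format (the keys feature, threshold, left, right and label) and the num_replies input are hypothetical; each provider that supports export has its own format.

```python
import json

def export_model(model):
    """Serialize the model so that it can be shipped to a server or a client."""
    return json.dumps(model)

def load_model(serialized):
    """Rebuild the model from its serialized form."""
    return json.loads(serialized)

def predict_locally(model, features):
    """Walk the tree from the root until a leaf (a node with a label) is reached."""
    node = model
    while "label" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

# Hypothetical exported tree: messages in long threads are flagged as important
tree = {
    "feature": "num_replies",
    "threshold": 2,
    "left": {"label": "normal"},
    "right": {"label": "important"},
}

local_model = load_model(export_model(tree))
print(predict_locally(local_model, {"num_replies": 5}))
```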

Accuracy vs speed

The first performance criterion that machine learners think of is accuracy: you want the model to give predictions that are close to what the truth would be. But in the API community, it’s response time that would come to mind first (for both API methods). Depending on your application, one of these will matter more than the other.

Some APIs allow you to play with a few parameters to trade accuracy vs speed. With BigML you can create “ensembles” of models on a given dataset: training and predictions take longer than with a single model, but the more models there are in your ensemble, the more accuracy you’re likely to get. With Amazon ML, you can tweak settings such as the target size of the model and the number of passes to be made over the data: the bigger these are, the longer model training and predictions will take, but you can expect to gain some accuracy.
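
To illustrate why an ensemble costs more time, here is a minimal majority-vote sketch in Python. It is not BigML’s implementation; the three hand-written rules stand in for trained models and are assumptions for illustration. Each prediction has to query every model in the ensemble, so prediction time grows with the number of models.

```python
from collections import Counter

def ensemble_predict(models, new_input):
    """Ask every model for a prediction and return the most common answer."""
    votes = Counter(model(new_input) for model in models)
    return votes.most_common(1)[0][0]

# Three hypothetical "models", each a simple rule on the message text
models = [
    lambda msg: "important" if "urgent" in msg else "normal",
    lambda msg: "important" if "boss" in msg else "normal",
    lambda msg: "important" if len(msg) < 40 else "normal",
]

print(ensemble_predict(models, "urgent: call your boss"))
```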


Do you need real-time?

There are actually two questions here: do you need real-time model training and do you need real-time predictions? To answer these questions, you first need to figure out:

  • when you’ll be making predictions, how many you’ll have to make, and how long you’ll have for that
  • how often you’ll be training new models and how long you’ll have.

Model training time is usually not critical because in most applications you don’t need to train or update models frequently (e.g. several times per hour). Although it’s nice to have predictions returned to you very quickly, you should consider whether you can query them before the user actually needs to see them. For instance, predictions could be made and stored while the user is not active, so that they are already in place when the user returns to the app (think about product recommendations, or priority detection in email clients, or the feature on your smartphone that predicts your next destination). This would give you more time, and because of the accuracy vs speed trade-off, it’s a good idea to use up all the time available so you can improve accuracy.
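
Querying predictions in advance can be sketched as follows in Python. The slow_create_prediction function is a placeholder for a call to a predictive API, and the in-memory dictionary stands in for whatever database or cache your app uses; both are assumptions for illustration.

```python
import time

prediction_store = {}  # stands in for a database or cache

def slow_create_prediction(user_id):
    """Placeholder for a (slow) call to a predictive API."""
    time.sleep(0.01)  # simulated network and computation delay
    return f"recommendations for {user_id}"

def precompute_predictions(user_ids):
    """Run while users are inactive, e.g. from a scheduled job."""
    for user_id in user_ids:
        prediction_store[user_id] = slow_create_prediction(user_id)

def get_prediction(user_id):
    """Serve the stored prediction; fall back to a live call only if needed."""
    if user_id not in prediction_store:
        prediction_store[user_id] = slow_create_prediction(user_id)
    return prediction_store[user_id]

precompute_predictions(["alice", "bob"])
print(get_prediction("alice"))  # served from the store, no API call
```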

One example where you would need real-time predictions is detecting fraudulent transactions: anything slower than real-time will make transactions slower and will impact user experience. Also, apps where predictions are made in response to user interactions will need real-time predictions. Imagine for instance an app for home sellers that would let you see dynamically how changing the price of your property would affect the (predicted) time it will take to sell it.

Batch mode

Whenever possible, it’s a good idea to make predictions in batch, so that you will only perform one API request instead of several, and you will have less network overhead. Applications in marketing are a good example: in churn prediction or lead scoring you would typically make predictions for all your customers in one go (overnight). So if you plan to do predictions in batch, make sure that the API has a “batch mode”!
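
Here is a small Python sketch of the difference in request counts. FakePredictionAPI and its create_batch_prediction method are hypothetical stand-ins that only count requests; real batch endpoints have their own names and size limits.

```python
def chunked(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

class FakePredictionAPI:
    """Hypothetical API client that counts how many requests it receives."""
    def __init__(self):
        self.requests = 0

    def create_batch_prediction(self, model, inputs):
        self.requests += 1
        return [model(x) for x in inputs]

def predict_in_batches(api, model, inputs, batch_size):
    """Send inputs in batches instead of one request per input."""
    predictions = []
    for batch in chunked(inputs, batch_size):
        predictions.extend(api.create_batch_prediction(model, batch))
    return predictions

# Toy churn "model": customers inactive for more than 30 days are at risk
churn_model = lambda days_inactive: days_inactive > 30
customers = list(range(1000))  # days of inactivity, one value per customer

api = FakePredictionAPI()
scores = predict_in_batches(api, churn_model, customers, batch_size=250)
print(api.requests, "requests for", len(scores), "predictions")
```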

Reducing lag between app and predictive platform

When you call an API served by another machine, your app’s hosting platform should factor into your choice of predictive API provider, as you would gain a few milliseconds by staying on the same network. So if your app is hosted on Google Cloud, it would make sense to consider Google Prediction API first, and if your app is hosted on Amazon you’d want to try Amazon ML or BigML public (hosted on AWS).

The create_model method will be called by your application server, using the application data that the server has access to. But as we said above, it’s unlikely that you’ll be looking at shaving milliseconds off the time it takes to return a model. What matters more is prediction latency, so any API that allows you to export your model is a good candidate: you could run that model somewhere else to get faster predictions. The most natural place to run the model or to call the create_prediction method would be the app server, but the end-user’s client might also make sense in your application.

Performance comparison

In an attempt to compare predictive API providers, I made a first comparison of their accuracy and of the time they take to create a model and to make predictions. This is far from an actual benchmark but it gives a first insight into how each tends to perform. The following tweet summarizes my results:

Some disclaimers:

  • As someone commented, training time and prediction time might vary depending on the load on the different services, so it would be better to run tests at different times and to average times.
  • This comparison was made on just one dataset, so we’d need to do this on other datasets that have different characteristics: small, big, unbalanced, etc. We can expect to get different results — which happened to me when using a very small dataset (150 rows) where Google Prediction API turned out to be more accurate than BigML. What you need to do is to compare these APIs on your own datasets!

You’ll find some code on GitHub to automate these comparisons, with only BigML and Google Prediction API at the moment, but hopefully more will follow soon!
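
For readers who want to roll their own comparison, the general shape of such a test can be sketched in Python as below. This is not the code from the GitHub repository: the benchmark function and the majority-class baseline plugged into it are assumptions for illustration, and you would replace the two baseline functions with wrappers around each provider’s client.

```python
import time

def benchmark(create_model, create_prediction, train_set, test_set):
    """Measure training time, prediction time and accuracy for one provider."""
    start = time.perf_counter()
    model = create_model(train_set)
    training_time = time.perf_counter() - start

    start = time.perf_counter()
    predictions = [create_prediction(model, x) for x, _ in test_set]
    prediction_time = time.perf_counter() - start

    correct = sum(p == y for p, (_, y) in zip(predictions, test_set))
    return {
        "accuracy": correct / len(test_set),
        "training_time": training_time,
        "prediction_time": prediction_time,
    }

# Baseline "provider": always predicts the most frequent class seen in training
def majority_create_model(dataset):
    labels = [label for _, label in dataset]
    return max(set(labels), key=labels.count)

def majority_create_prediction(model, new_input):
    return model

train_set = [("a", "normal"), ("b", "normal"), ("c", "important")]
test_set = [("d", "normal"), ("e", "important")]
results = benchmark(majority_create_model, majority_create_prediction, train_set, test_set)
print(results)
```

As noted in the disclaimers, timings from a single run are noisy, so it is worth repeating the call at different times and averaging.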

Other remarks


Once you’ve estimated how many predictions and models you need to create, and decided where predictions will be made and whether they’ll be made in batch, you’ll have enough information to estimate the cost of using each API and to take it into account in your choice!
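
The estimate itself is simple arithmetic, sketched below in Python. The pricing structure (a fee per model plus a fee per thousand predictions) and every number used are made up for illustration; check each provider’s actual pricing page, as some charge by compute time or by subscription instead.

```python
def monthly_cost(n_models, n_predictions, price_per_model, price_per_1000_predictions):
    """Estimated monthly bill under a hypothetical per-model, per-prediction pricing."""
    training_cost = n_models * price_per_model
    prediction_cost = n_predictions / 1000 * price_per_1000_predictions
    return training_cost + prediction_cost

# Hypothetical usage: one model retrained daily, 200,000 predictions per month
estimate = monthly_cost(
    n_models=30,
    n_predictions=200_000,
    price_per_model=0.50,             # made-up price
    price_per_1000_predictions=0.10,  # made-up price
)
print(round(estimate, 2))
```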

Using several APIs in the same app

Imagine that you create one model per user in your app, which you would do for Priority Inbox for instance, since importance would not be the same for all (e.g. an email from your significant other won’t be important for me). The size of the dataset you’ll train each model with will vary depending on how long the user has been on your app. If an API performs better on smaller datasets and another one shines on bigger datasets, then you should use both!
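
Routing each user to one API or the other can be as simple as the Python sketch below. The API names and the threshold of 1,000 examples are placeholders; you would set the threshold from your own comparisons on your own data.

```python
def pick_api(n_examples, threshold=1000):
    """Choose which (hypothetical) API trains a user's model, based on dataset size."""
    if n_examples < threshold:
        return "api_for_small_datasets"
    return "api_for_big_datasets"

# Number of labeled messages per user (made-up values)
users = {"new_user": 150, "long_time_user": 25_000}
routing = {user: pick_api(size) for user, size in users.items()}
print(routing)
```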

Experimenting vs going to production

If an API has a nice GUI or a Studio with a nice interface associated with it, it can help with experimenting until you get the data preparation right and you start getting good results. When you do, you should try other APIs to see how their performance compares as you get ready to deploy into production.

Client-side learning

I think that in the future we will see more and more apps train models client-side in order to ensure privacy (they won’t need to transfer your data off of your phone). It will be useful for apps that need geolocation data as this is something that users are sensitive to. Besides, in order to make better predictions of what functionality and content a user will need at any given time, you need to take into account as much contextual information as possible and to merge different sources of data: location, calendar, contacts, messages… Even though I am personally ok with sharing a bit of each with different apps, I am wary of giving it all to a single app. One way to get users’ trust as an app developer is to ensure that the data never leaves the phone.

Obviously, you won’t be able to use cloud-based predictive APIs for that, but there are promising open source projects such as PredictionIO and Seldon that let you train and run models locally, as easily as with those APIs. However, I’m not aware of any Machine Learning SDK for mobile devices with high-level APIs such as create_model and create_prediction… but I predict that one will become available within the next year!

All the APIs in the same place

You can learn more about all the APIs and projects mentioned above at PAPIs ‘15, the 2nd International Conference on Predictive APIs and Apps taking place on 6 & 7 August in Sydney. One of the highlights this year is that this will be the first time ever that leaders at Amazon ML, BigML and Google Prediction will all meet on the same stage! See you there!
