

Why your models might not work after Covid-19

An introduction to concept and data drift.

Why would our models start failing? What do we mean by “drift”? How do we know when we have to retrain? I’ll try to answer these questions in this blog post, and explain how we can help.

Concept Drift

Concept drift is more likely to occur in domains of high complexity, especially when they involve people’s behavior. If you use ML for image recognition, chances are you won’t have to deal with concept drift very much (unless there are changes to the sensor or new kinds of objects pop up). On the other hand, if you use it for, say, demand forecasting, it would be a good idea to retrain your model with training data that takes the Covid-19 outbreak into account (e.g. by throwing out data from before the outbreak).

Business decisions made using models suffering from concept drift are likely to be flawed and impact the bottom line. You may waste money on inappropriate marketing, overstock or understock items, or offer the wrong prices.

Toilet Paper Shortages: an example of concept drift.

Below is a (fictional) illustration, based on a moving average forecast. You can see how the old data leads to much greater error than usual.

The spike in demand for toilet paper is a great example of concept drift. A naive forecasting model will assume that this rise in demand is permanent.
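The fictional illustration can be reproduced in a few lines. This is only a sketch: the demand numbers are invented, and the function name and the window of 4 weeks are my own choices, not taken from any real forecast.

```python
def moving_average_forecast(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

# Invented weekly demand; the jump at index 8 mimics panic buying.
demand = [100, 102, 98, 101, 99, 100, 103, 97, 300, 310, 120, 100]
window = 4  # assumed window length

errors = []
for t in range(window, len(demand)):
    forecast = moving_average_forecast(demand[:t], window)
    errors.append(abs(demand[t] - forecast))

# Errors stay small before the spike, jump when it arrives, and remain
# large afterwards because the spike is still inside the averaging window.
for week, error in zip(range(window, len(demand)), errors):
    print(f"week {week}: absolute error {error:.2f}")
```

Note that the error stays high even after demand returns to normal, which is the "assumes the rise is permanent" problem in miniature.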

Data Drift

To recognise data drift, log and analyse incoming data. For example, if column averages change drastically, chances are that something is going on with the way data are collected.
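As a rough sketch of the column-average check, you could compare the mean of an incoming batch with the mean of a reference sample, measured in reference standard deviations. The function name and the sample values here are hypothetical, and this is a simple heuristic rather than a formal statistical test.

```python
from statistics import mean, stdev

def mean_shift_in_sigmas(reference, incoming):
    """Distance of the incoming mean from the reference mean,
    expressed in reference standard deviations."""
    ref_std = stdev(reference)
    if ref_std == 0:
        raise ValueError("reference column has zero variance")
    return (mean(incoming) - mean(reference)) / ref_std

# Invented values for one numeric column.
reference = [10, 12, 11, 9, 10, 11, 12, 9]
shift_ok = mean_shift_in_sigmas(reference, [10, 11, 10, 12])
shift_drift = mean_shift_in_sigmas(reference, [20, 22, 21, 19])

print(f"normal batch: {shift_ok:.2f} sigmas")
print(f"drifted batch: {shift_drift:.2f} sigmas")
```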

Another thing that can happen is that a model is set up to pull some of its data from external sources, but one of the sources is no longer operational or has changed in nature. This would result in a much larger fraction of missing data than the model is set up to cope with. Again this would show up in logs.
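A similar check works for missing data. The sketch below assumes each logged record is a dictionary and that a missing value shows up either as an absent key or as `None`; both the record layout and the column name are made up for illustration.

```python
def missing_fraction(rows, column):
    """Fraction of rows in which `column` is absent or None."""
    if not rows:
        raise ValueError("no rows to inspect")
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)

# Invented records; the last one lacks the column entirely.
rows = [{"temp": 21.0}, {"temp": None}, {"temp": 19.5}, {}]
fraction = missing_fraction(rows, "temp")
print(f"{fraction:.0%} of rows are missing 'temp'")
```

Comparing this fraction with the one seen in the training data tells you whether an external source has quietly stopped delivering.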

To remedy concept drift and data drift once they are detected, you will have to retrain your model on data generated by the new process.

Fighting Drift

  1. Log inputs. This is to detect data drift. A model may be trained on data of one kind, and then start behaving weirdly when it encounters data that was completely out of scope during training.
  2. Log model predictions. If your models start delivering very different outputs than was historically the case, chances are that the domain has changed and you may be subject to concept drift.
  3. Log model accuracy. For some applications of machine learning (e.g. forecasting), you may be able to examine predictive accuracy directly. If predictive accuracy starts decreasing substantially, chances are you have concept drift.
  4. Log downstream behaviour. If your model matters, it will have an effect on the real world. For example, a recommendation system will affect consumer behavior. If this behavior suddenly changes, then your model may no longer be appropriate.
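A minimal way to start is to write one JSON line per prediction, holding the inputs, the output and, once it is known, the actual outcome. The record layout and the `log_prediction` helper below are assumptions for the sake of the example; a production system would more likely write to a database or a logging service.

```python
import io
import json
import time

def log_prediction(stream, features, prediction, actual=None):
    """Append one JSON line describing a single prediction to `stream`."""
    record = {
        "timestamp": time.time(),
        "features": features,      # model inputs, for spotting data drift
        "prediction": prediction,  # model output, for spotting shifted outputs
        "actual": actual,          # ground truth if known, for accuracy
    }
    stream.write(json.dumps(record) + "\n")
    return record

buffer = io.StringIO()  # stand-in for a real log file
log_prediction(buffer, {"price": 2.5, "store": "A"}, prediction=120.0)
logged = json.loads(buffer.getvalue().splitlines()[0])
print(logged["features"], logged["prediction"])
```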

When do I retrain?

  1. Continual retraining, or
  2. Retraining after an alert is triggered.

To continually retrain a model, you need a machine learning pipeline. This will automate the collection and preprocessing of training data, model training, cross validation and deployment.

If you decide to retrain only when an alert is triggered, you need to find appropriate thresholds. The idea is to view the problem as one of quality control: concept drift is analogous to a malfunctioning machine part, which would produce greater than usual variance.

A product manager or domain expert can help decide an acceptable model accuracy threshold. If, on the other hand, you alert based on changes in a metric, a starting threshold could be 4 standard deviations away from the historical average. If metrics are collected daily and are roughly normally distributed, you would expect a change of 4 standard deviations in either direction to happen by chance only about once every 43 years.
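The 43-year figure can be checked with a short calculation, assuming the daily metric is normally distributed and counting deviations in either direction. The `is_alert` helper is a hypothetical sketch of the threshold rule, not a complete monitoring setup.

```python
import math
from statistics import mean, stdev

def two_sided_tail_probability(z):
    """P(|Z| > z) for a standard normal variable Z."""
    return math.erfc(z / math.sqrt(2))

def is_alert(history, new_value, n_sigmas=4.0):
    """True if `new_value` is more than `n_sigmas` sample standard
    deviations away from the historical mean."""
    return abs(new_value - mean(history)) > n_sigmas * stdev(history)

p = two_sided_tail_probability(4.0)
years_between_alerts = (1 / p) / 365.25
print(f"chance per day: {p:.2e}, about one false alert every "
      f"{years_between_alerts:.0f} years")

# Invented metric history with mean 100 and standard deviation 2.
history = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_alert(history, 105), is_alert(history, 300))
```

Real metrics often have heavier tails than a normal distribution, so in practice false alerts may be more frequent than this estimate suggests.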

Which data should I retrain on?

There are three different approaches for handling this:

  1. Excluding all training data from before the concept drift. This leaves much less training data available and is only viable in the “dramatic event” case.
  2. Weighting your training data by recency, so that the most recent data is weighted most heavily.
  3. Creating a new model that explicitly corrects for the source of concept drift. An example of this would be forecasting models that allow for different behavior during holidays.
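To make the second approach concrete, here is one possible weighting scheme: exponential decay with a half-life. The scheme, the function names and the half-life of one period are all assumptions chosen for illustration, and the weighted mean stands in for a real model.

```python
def recency_weights(n_observations, half_life):
    """Weights for observations ordered oldest to newest.
    The newest gets weight 1; the weight halves every `half_life` steps back."""
    return [0.5 ** ((n_observations - 1 - i) / half_life)
            for i in range(n_observations)]

def weighted_mean(values, weights):
    """Mean of `values` with each value scaled by its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Invented demand series that shifts from 100 to 200 near the end.
demand = [100, 100, 100, 100, 200, 200]
weights = recency_weights(len(demand), half_life=1)

plain = sum(demand) / len(demand)
weighted = weighted_mean(demand, weights)
print(f"unweighted mean: {plain:.1f}, recency-weighted mean: {weighted:.1f}")
```

The weighted estimate sits much closer to the recent level of 200. Many training libraries accept per-sample weights, so the same list of weights could be passed to a real model instead of a mean.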

Getting help



AI & Machine Learning in Melbourne, Australia
