Monitor and Update your models

Deploying your model is only the beginning…

Malo Le Goff
CodeX

--

After a long development stage where you had to deal with feature engineering, feature selection, regularization, and so on, you’ve finally deployed your awesome machine learning model in your production environment.

But after a few weeks, you’ll realize that your model is out of control: its predictions make no sense whatsoever. What’s happening? How can you solve it? We’re going to answer these questions in this article.


I. Model Deprecation

All models need to be updated eventually to account for changes in the external world

Indeed, the performance of your model might decrease because of two factors:

  • Data Drift: A change in the distribution of the input data, for instance in its standard deviation, mean, or median. Statistical tests like the Kolmogorov-Smirnov test can be used to detect data drift (see the sketch after this list)
  • Concept Drift: The relationship between the input variables and the output changes. Patterns in data evolve over time, and a model built on the initial data gradually becomes obsolete as this input-output relationship shifts
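
As a quick illustration, here is a minimal sketch of data-drift detection with SciPy’s two-sample Kolmogorov-Smirnov test; the feature samples and the significance level below are placeholders, not a prescription.

```python
# Minimal data-drift check with a two-sample Kolmogorov-Smirnov test.
# The significance level and the sample data are illustrative.
import numpy as np
from scipy import stats


def has_drifted(reference: np.ndarray, production: np.ndarray, alpha: float = 0.05) -> bool:
    """Return True if the production sample differs significantly from the reference sample."""
    result = stats.ks_2samp(reference, production)
    return result.pvalue < alpha


# Example: compare a feature sampled at training time with fresh production data
reference_sample = np.random.normal(loc=0.0, scale=1.0, size=1_000)
production_sample = np.random.normal(loc=0.5, scale=1.2, size=1_000)  # shifted on purpose
print(has_drifted(reference_sample, production_sample))  # most likely True
```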

So once deployed, your model will become more and more deprecated. This phenomenon is known as model drift, and that’s why you must retrain your model regularly so it stays sharp. A good way of doing this is by implementing a Continuous Training pipeline.

Continuous Training (CT): It’s basically a pipeline that automatically retrains your deployed model with data coming from your production environment. Once trained, the new model is served in place of the older one, without bringing the service down or interrupting its operations.

To implement a CT pipeline, you first need to monitor your model. Afterward, you need to decide when your model gets updated and under what conditions.

II. Model Monitoring

The first step in implementing the CT pipeline is the monitoring of your model.

It’s basically setting up the code needed so you can easily see the effects of data and concept drift and how they evolve over time.

So, what do we have to monitor?

  • The Data: the data distribution (mean, standard deviation, quantiles, and so on), the data schema, …
  • The Model: the model’s metrics (accuracy, precision, recall, …), latency, training time, …

You can monitor them by running tests while your model is serving predictions and logging the results. You could then put your logs in a fileset (a set of computer files linked by a defining property or common characteristic) or in Elasticsearch, where you can analyze them later on, as in the sketch below.
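
As an example, a minimal monitoring sketch could log summary statistics for each batch of predictions as JSON lines; the field names are only illustrative, and the records could later be shipped to Elasticsearch.

```python
# Minimal sketch of structured monitoring logs (field names are illustrative).
# Each JSON line could later be indexed in Elasticsearch for analysis.
import json
import logging
import time

import numpy as np

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("model_monitoring")


def log_batch_stats(features: np.ndarray, predictions: np.ndarray, latency_ms: float) -> None:
    """Log summary statistics of one prediction batch as a single JSON line."""
    record = {
        "timestamp": time.time(),
        "feature_mean": float(features.mean()),
        "feature_std": float(features.std()),
        "prediction_mean": float(predictions.mean()),
        "latency_ms": latency_ms,
    }
    logger.info(json.dumps(record))
```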

Now that these signals are monitored, you must find a way to alert your team when one of the two monitoring axes (data or model) goes red. It could be via email, for instance, as sketched below.
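
Here is a minimal sketch of such an alert; the accuracy threshold, the addresses, and the local SMTP relay are assumptions for illustration only.

```python
# Minimal alerting sketch (threshold, addresses and SMTP host are illustrative).
import smtplib
from email.message import EmailMessage

ACCURACY_THRESHOLD = 0.85  # illustrative floor


def alert_if_degraded(current_accuracy: float) -> None:
    """Email the team when accuracy drops below the accepted threshold."""
    if current_accuracy >= ACCURACY_THRESHOLD:
        return
    msg = EmailMessage()
    msg["Subject"] = f"Model accuracy dropped to {current_accuracy:.2f}"
    msg["From"] = "monitoring@example.com"  # hypothetical sender
    msg["To"] = "ml-team@example.com"       # hypothetical recipient
    msg.set_content("Model performance has degraded; consider re-training.")
    with smtplib.SMTP("localhost") as server:  # assumes a local SMTP relay
        server.send_message(msg)
```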

At that point, you have a clear system to monitor your model and alert your team if it goes out of control. So now, you must set up the update mechanism.

III. Model Update

By “setting up the update mechanism”, I mean deciding when and under what conditions the model gets updated.

The first step is to decide when the model gets updated. In other words, you must decide which trigger to use:

  • On-demand: Ad-hoc manual execution of the re-training
  • On a schedule: If new data is systematically available for the ML system on a regular basis, you can retrain on the same basis.
  • On model performance degradation: Now that your model is monitored, you can decide on the lowest precision (or whatever metric you use) you accept before re-training the model
  • On significant changes in the data distribution: Now that your model is monitored, you can decide on the biggest change in the data you accept before re-training the model (a sketch of both monitoring-based triggers follows this list)
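
For the two monitoring-based triggers, the decision can be as simple as comparing the monitored values against thresholds; the thresholds below are illustrative.

```python
# Minimal sketch of the two monitoring-based triggers (thresholds are illustrative).
PRECISION_FLOOR = 0.80       # lowest precision you accept before re-training
DRIFT_P_VALUE_FLOOR = 0.05   # significance level of the drift test


def should_retrain(current_precision: float, drift_p_value: float) -> bool:
    """Trigger re-training on performance degradation or on data drift."""
    performance_degraded = current_precision < PRECISION_FLOOR
    data_drifted = drift_p_value < DRIFT_P_VALUE_FLOOR
    return performance_degraded or data_drifted
```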

NB: Monitoring your model is only necessary for the last two triggers. Even though it isn’t required for the first two, monitoring is still convenient, if only to have an observable system.

The re-training frequency also depends on:

  • How expensive it is to re-train your model: The more expensive it is, the less often you’ll re-train it
  • How quickly your model becomes deprecated: The more unstable your model is, the more often you’ll re-train it. For instance, if the input variables are very dynamic (they change a lot over time), a small change in those variables can affect the model to a great extent.

Now that you have decided when to re-train your model, you must decide whether to replace the old model with the new one. For that, the new system/model must pass some conditions:

  • Data Validation: Test whether the data schema has changed and whether the data is still representative, … If the data is not validated, the model is not even re-trained
  • Model Validation: Compare the new model with the old one in terms of accuracy (or whatever metric you use to evaluate your model). If the comparison is acceptable according to your requirements, replace the old model with the new one (see the sketch after this list)
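
As a sketch of the model-validation step, the new model could be promoted only if it matches or beats the old one on a hold-out set; the metric, the margin, and the model interface are assumptions.

```python
# Minimal model-validation sketch (metric, margin and model interface are illustrative).
from sklearn.metrics import accuracy_score


def validate_new_model(old_model, new_model, X_holdout, y_holdout, margin: float = 0.0) -> bool:
    """Promote the new model only if it beats the old one on a hold-out set."""
    old_acc = accuracy_score(y_holdout, old_model.predict(X_holdout))
    new_acc = accuracy_score(y_holdout, new_model.predict(X_holdout))
    return new_acc >= old_acc + margin
```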

If both of these conditions are met, your old model can be replaced by the updated one. If not, it may be a good idea to have data scientists or business stakeholders review the new model’s explanations before it’s promoted to production. You might have to introduce new feature engineering steps or something like that, which means going back to the development stage.

NB: You could also release your new model so that it serves only part of your users at first, the rest being served by your old model. The new model then takes over more and more users until it completely replaces the previous model.
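
A minimal sketch of such a gradual (canary) rollout is to route a growing fraction of requests to the new model; the traffic share and the model interface are illustrative.

```python
# Minimal canary-rollout sketch (traffic share and model interface are illustrative).
import random

NEW_MODEL_TRAFFIC_SHARE = 0.10  # start with 10% of requests, then increase over time


def predict(request_features, old_model, new_model):
    """Route a fraction of requests to the new model, the rest to the old one."""
    model = new_model if random.random() < NEW_MODEL_TRAFFIC_SHARE else old_model
    return model.predict(request_features)
```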

Conclusion

Now you have a reliable and maintainable ML system in production thanks to the CT pipeline! To put a buzzword on what I just described, it’s one of the practices of MLOps.

MLOps: A set of practices used to make an ML model maintainable and reliable in production. Monitoring and updating your model via a CT (Continuous Training) pipeline belongs to these MLOps practices.

I hope this article was useful for you. Thanks for reading!

