MLOps — How To Monitor Data Drift in Machine Learning Models

Puneet Saha · Published in AllThingData · Sep 12, 2023

Deploying machine learning models comes with its unique set of challenges. In this article, we will delve into one of the challenges from a software engineering perspective, focusing on the critical aspect of monitoring. It’s important to note that successful model deployment requires a collaborative effort between engineers and data scientists. Engineers need a high-level understanding of the problems and potential solutions involved in deploying models effectively. Before exploring the methods of detecting data drift, it’s crucial to understand the importance of monitoring these drifts, their impact on Prediction Bias, and the nature of Prediction Bias itself.

Prediction Bias

Once a model is deployed, one of the challenges is monitoring it. Beyond the standard signals of the hosts where models run (compute, memory, network, storage, latency, and so on), there are several model-specific signals to watch. One of the key aspects to monitor is prediction bias.

Prediction bias is a systematic error in the predictions made by a model: the predictions consistently deviate from the actual values in a particular direction. This deviation can be caused by various factors, including the choice of features, the model's architecture, or the data used for training. The input data itself can also change over time due to external factors, a phenomenon known as data drift.

Let’s take a look at an example of prediction bias in housing price prediction.

Imagine you are building a machine learning model to predict house prices based on features such as square footage, number of bedrooms, and location. You collect a dataset of houses and their actual selling prices. In this scenario, prediction bias can occur if your model consistently underestimates the actual selling prices of houses. For instance, if your model predicts that houses will sell for $200,000 on average but, in reality, they sell for $250,000 on average, you have a prediction bias towards underestimation.
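
To make the example concrete, here is a minimal sketch of how that bias could be measured; the price arrays are hypothetical stand-ins for a real evaluation set.

```python
import numpy as np

# Hypothetical predicted vs. actual selling prices (USD)
predicted = np.array([195_000, 210_000, 188_000, 207_000])  # mean = $200,000
actual = np.array([245_000, 260_000, 238_000, 257_000])     # mean = $250,000

# Prediction bias: the average signed error between predictions and actuals.
# A consistently negative value means the model systematically underestimates.
bias = (predicted - actual).mean()
print(f"Mean prediction bias: ${bias:,.0f}")  # -> Mean prediction bias: $-50,000
```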

Data Drift

Data drift refers to the phenomenon where the statistical properties and distribution of a model's input data change over time relative to the data it was trained on. When the model encounters new, unseen data whose distribution differs from what it saw during training, its performance and predictions can degrade.
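
One common way to quantify such a shift for a single numeric feature (a general technique, not something prescribed by this article) is a two-sample statistical test between a sample of the training data and a window of live data. The feature values below are synthetic.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical square-footage values: training-time sample vs. a recent live
# window whose distribution has shifted toward larger homes.
training_sqft = rng.normal(loc=1800, scale=300, size=5_000)
live_sqft = rng.normal(loc=2100, scale=400, size=1_000)

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests the live
# distribution no longer matches the training distribution, i.e. data drift.
result = ks_2samp(training_sqft, live_sqft)
print(f"KS statistic = {result.statistic:.3f}, p-value = {result.pvalue:.3g}")
```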

[Figure: monitoring pipeline. Black arrows = preprocessing, green arrows = data drift monitoring, blue arrows = model quality monitoring.]

Statistical metrics such as the mean, median, max, and min can be used to measure data drift, and we can use the training data to generate baseline statistics and constraints. Constraints are the rules and thresholds applied to these statistics and other input fields; a sketch of generating them follows the list below. Typical constraints for data drift include:

  • Ensuring that numeric values don’t deviate beyond specified thresholds.
  • Verifying that data isn't missing and doesn't violate expected data types.
  • Applying any other custom metrics or constraints based on domain expertise.
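
As a rough sketch (assuming the training data is in a pandas DataFrame; the column names and tolerance are illustrative, not from the article), baseline statistics and constraints might be generated like this:

```python
import json
import pandas as pd

def build_baseline(train_df: pd.DataFrame, tolerance: float = 0.25) -> dict:
    """Compute per-column baseline statistics and derive simple constraints."""
    baseline = {}
    for col in train_df.select_dtypes(include="number").columns:
        stats = {
            "mean": float(train_df[col].mean()),
            "median": float(train_df[col].median()),
            "min": float(train_df[col].min()),
            "max": float(train_df[col].max()),
        }
        # Constraint: live values should stay within the observed range,
        # widened by a tolerance, and the column should not be missing.
        spread = stats["max"] - stats["min"]
        baseline[col] = {
            "stats": stats,
            "constraints": {
                "lower_bound": stats["min"] - tolerance * spread,
                "upper_bound": stats["max"] + tolerance * spread,
                "dtype": str(train_df[col].dtype),
                "max_missing_fraction": 0.0,
            },
        }
    return baseline

# Hypothetical training set for a housing-price model
train_df = pd.DataFrame({"sqft": [1400, 1800, 2200, 2600], "bedrooms": [2, 3, 3, 4]})
with open("baseline.json", "w") as f:
    json.dump(build_baseline(train_df), f, indent=2)
```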

After the model is deployed, we capture and store the inputs and the inferences generated by the model. A scheduled job, aka the data drift monitoring job, then calculates metrics on the inputs captured from live inference requests, compares them against the baseline constraints, and triggers alerts whenever a constraint is violated.
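
Building on the baseline sketch above, that scheduled job could look roughly like the following; it assumes a DataFrame of captured live inputs and simply prints violations where a production system would emit metrics or page an on-call engineer.

```python
import json
import pandas as pd

def check_drift(live_df: pd.DataFrame, baseline: dict) -> list[str]:
    """Compare statistics of live inputs against the baseline constraints."""
    violations = []
    for col, spec in baseline.items():
        cons = spec["constraints"]
        if col not in live_df.columns:
            violations.append(f"{col}: column missing from live data")
            continue
        series = live_df[col]
        missing_fraction = series.isna().mean()
        if missing_fraction > cons["max_missing_fraction"]:
            violations.append(f"{col}: {missing_fraction:.1%} missing values")
        if series.min() < cons["lower_bound"] or series.max() > cons["upper_bound"]:
            violations.append(
                f"{col}: values outside [{cons['lower_bound']:.1f}, {cons['upper_bound']:.1f}]"
            )
    return violations

# Run on a batch of captured inference inputs (hypothetical data).
with open("baseline.json") as f:
    baseline = json.load(f)
live_df = pd.DataFrame({"sqft": [3900, 4100], "bedrooms": [5, 6]})
for violation in check_drift(live_df, baseline):
    print("ALERT:", violation)  # in practice, emit a metric or page on-call
```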

Other Model Quality Metrics

Besides data drift, there are other model quality metrics that are crucial for ensuring accurate predictions, e.g., accuracy, precision, recall, and RMSE (Root Mean Square Error). We can monitor these metrics with the same approach as the data drift monitoring job described earlier.
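
For example, assuming the captured inferences have been joined with ground-truth labels once those become available, standard scikit-learn metrics can be computed by the same kind of scheduled job (the values below are made up):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, mean_squared_error

# Classification example: captured predictions joined with ground-truth labels
y_true_cls = [1, 0, 1, 1, 0, 1]
y_pred_cls = [1, 0, 0, 1, 0, 1]
print("accuracy:", accuracy_score(y_true_cls, y_pred_cls))
print("precision:", precision_score(y_true_cls, y_pred_cls))
print("recall:", recall_score(y_true_cls, y_pred_cls))

# Regression example: RMSE for the housing-price model
y_true_reg = [250_000, 310_000, 190_000]
y_pred_reg = [240_000, 300_000, 200_000]
print("RMSE:", np.sqrt(mean_squared_error(y_true_reg, y_pred_reg)))
```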

Conclusion

Monitoring plays a critical role in ensuring the reliability and accuracy of deployed models. In this article, we covered high-level considerations and steps for monitoring prediction bias and data drift in deployed machine learning models. Various open-source libraries and managed services, such as MLflow, AWS SageMaker, TensorFlow Extended (TFX), and Uber Michelangelo, can assist in implementing these monitoring practices.

I hope you found this helpful! As a new writer, I welcome your honest feedback to help improve and make my articles more digestible and useful. If you liked what you read, your claps are more than just appreciation — they’re a motivating force that inspires me to share more insightful content.
