MLOps — Advocating Better Engineering and Operations in Machine Learning

What is MLOps? Understand how it helps you build an end-to-end machine learning pipeline

Samhita Alla
Feature Stores for ML
6 min read · Dec 2, 2020



Machine Learning (ML) has forayed into almost every sphere of our lives, be it healthcare, finance, or education; it’s practically everywhere! There are numerous machine learning engineers and data scientists out there who are well versed in modelling a machine learning algorithm. Then comes the challenge of deploying that model in production. Coding a machine learning algorithm is only the tip of the iceberg: for a model to be deployable, configuration, automation, server infrastructure, testing, and process management all have to be taken care of. In conventional software engineering, DevOps handles the engineering and operations, bridging development and operations seamlessly. Likewise, MLOps can be applied in machine learning to support rapid, reliable deployments in production.

About MLOps

MLOps helps ensure that machine learning algorithms and systems operate well in production. Beyond the machine learning algorithm itself, we require DevOps and Data Engineering to push the machine learning system to production and keep it operating continuously without breakages.

DevOps is a set of practices for building, integrating, testing, and deploying software reliably. It enables continuous integration and delivery with minimal deployment overhead during development. The same principles can be applied in machine learning to build a robust ML system.

MLOps also requires Data Engineering. Data is the bedrock upon which ML is built; data procurement, verification and feature engineering are crucial components whilst developing an ML algorithm.

Altogether, model management, data schema management, and model deployment constitute MLOps.


  • The fundamental challenge in machine learning is that a system is an amalgamation of ‘code’ and ‘data’. The data has to be analyzed thoroughly for the machine learning system to run in production (online).
  • Data scientists typically aren’t into building production-level systems; they primarily focus on data analysis and model development through experimentation, which demands additional expertise in engineering and operations.
  • Conventional ways of designing and developing software don’t suit machine learning. Ideally, it involves experimenting with algorithms, features and hyperparameters, and filtering out reliable results through systematic comparison is essential.
  • A machine learning model has to be tested effectively to ensure its robustness against realistic data values.
  • Deploying an online machine learning model requires training on live data, so effective strategies for validating results have to be in place.
  • Because the model depends on the ever-changing data being pumped into it, constructive monitoring of its performance numbers has to be put into practice.
  • Reproducibility is often difficult to achieve in machine learning owing to diverse software environments, versions (release iterations) and data. Yet it is a vital factor that helps the continuous integration and delivery cycle proceed smoothly.

Overcoming the above-mentioned challenges is significant in constructing an efficient machine learning system.

The Machine Learning Pipeline

From data procurement to monitoring a model, the steps involved in delivering a machine learning model in production are as follows:

  • Procuring the Data: Selecting relevant data sources to extract the necessary data points to be used by the ML model.
  • Analyzing the Data: Examining the data deeply to understand the diverse attributes and parameters. Understanding the data schema and devising data preparation and transformation strategies.
  • Data Preparation: Feature engineering constitutes data preparation. It involves data cleaning, removing data skewness, data transformation and data reduction.
  • Designing a Model: Selecting a set of suitable models that could be utilized in defining the algorithm.
  • Training a Model: Tweaking hyperparameters to achieve high performance.
  • Evaluating a Model: Testing the model’s accuracy on a test data set.
  • Validating a Model: Finalizing the baseline model to be used in production.
  • Serving a Model: Deploying the machine learning system in a suitable environment by setting up the infrastructure.
  • Monitoring a Model: Logging the model’s statistics (performance, latency, traffic, errors) and deciding what action to take if performance degrades at any point.
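The steps above can be sketched, in a highly simplified form, as plain Python functions chained together. Everything here is illustrative (the function names, the toy least-squares model, and the data are invented); a real pipeline would use a proper framework:

```python
# A toy end-to-end pipeline: prepare -> train -> evaluate -> serve.

def prepare(raw):
    """Data preparation: drop records with missing values, scale x to [0, 1]."""
    clean = [(x, y) for x, y in raw if x is not None and y is not None]
    xs = [x for x, _ in clean]
    lo, hi = min(xs), max(xs)
    return [((x - lo) / (hi - lo), y) for x, y in clean], (lo, hi)

def train(data):
    """Fit y = a*x + b by ordinary least squares on a single feature."""
    n = len(data)
    sx = sum(x for x, _ in data); sy = sum(y for _, y in data)
    sxx = sum(x * x for x, _ in data); sxy = sum(x * y for x, y in data)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def evaluate(model, data):
    """Mean squared error on a data set."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in data) / len(data)

def serve(model, scaler, x):
    """Apply the SAME scaling used during training, then predict."""
    a, b = model
    lo, hi = scaler
    return a * ((x - lo) / (hi - lo)) + b

raw = [(0, 1), (1, 3), (2, 5), (None, 7), (3, 7)]   # one record is dropped
data, scaler = prepare(raw)
model = train(data)
mse = evaluate(model, data)
```

Note that `serve` reuses the scaler fitted in `prepare`; keeping the two paths consistent is precisely what later sections call training-serving skew when it goes wrong.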

Each of the above steps can be manual or automated. However, manual MLOps has its disadvantages.

Manual MLOps

In manual MLOps, every step has to be manually initiated and executed, which slows down the process considerably. Moreover, the data scientists who build the model might not be in sync with the engineers who serve it, which can lead to training-serving skew: a difference between the performance attained during training and the performance observed while serving the model. For instance, data scientists may hand over the model’s algorithm and parameters to an engineer who deploys it on the target system, and subtle differences between the two environments then degrade the model’s performance.
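As a toy illustration (all values and function names here are hypothetical), training-serving skew can arise when the serving path re-implements preprocessing instead of reusing the training-time version:

```python
# Illustrating training-serving skew: the serving path re-implements the
# preprocessing step instead of reusing the training version, and drifts.

train_xs = [10.0, 20.0, 30.0, 40.0]
mean = sum(train_xs) / len(train_xs)          # 25.0, fitted at training time

def preprocess_training(x):
    return x - mean                           # center using TRAINING statistics

def preprocess_serving(x, recent_batch):
    # A subtly different re-implementation: centers on the CURRENT batch,
    # so the model sees differently-shifted inputs at serving time.
    batch_mean = sum(recent_batch) / len(recent_batch)
    return x - batch_mean

weight = 2.0                                  # toy model: y = 2 * centered(x)
x = 30.0
y_train_path = weight * preprocess_training(x)              # uses mean 25.0
y_serve_path = weight * preprocess_serving(x, [30.0, 50.0])  # uses mean 40.0
skew = y_train_path - y_serve_path            # same input, different outputs
```

The same input produces different predictions on the two paths, which is exactly the discrepancy automated pipelines try to eliminate by sharing preprocessing code between training and serving.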

Also, model versioning doesn’t happen in an automated fashion, which creates discrepancies in the model’s reproducibility.

No Continuous Integration/Continuous Deployment (CI/CD) is in place, as the model is assumed to be static and no frequent changes are introduced.

The model’s performance isn’t monitored continuously, so its degradation can easily be overlooked.
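A minimal sketch of what such a monitoring check could look like, assuming a baseline accuracy recorded at validation time (the threshold and window size are arbitrary choices, not recommendations):

```python
# Alert when the model's rolling accuracy drops more than a tolerance
# below the accuracy recorded when the model was validated.

BASELINE_ACCURACY = 0.92     # recorded at validation time
TOLERANCE = 0.05             # allowed degradation before alerting
WINDOW = 4                   # number of recent batches to average

def should_alert(recent_batch_accuracies):
    """Compare the rolling mean of recent batches against the baseline."""
    window = recent_batch_accuracies[-WINDOW:]
    rolling = sum(window) / len(window)
    return rolling < BASELINE_ACCURACY - TOLERANCE

healthy = [0.93, 0.91, 0.92, 0.90]    # rolling mean 0.915 -> no alert
degraded = [0.91, 0.85, 0.82, 0.80]   # rolling mean 0.845 -> alert
```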

A Step Ahead — Automated MLOps

To overcome the above-mentioned challenges, we need to automate the tasks constituting the ML pipeline. To do so, data validation, model validation and continuous monitoring have to be embedded into the pipeline.

  • Transitioning from one phase to the next has to happen automatically.
  • Data has to be validated properly to identify any skew or anomalies in it. If the data is in accordance with the expected schema, the next phase can be triggered; otherwise, manual intervention is required.
  • The model should be trainable online on fresh batches of data. The pipeline should compare the current metrics with the previous (baseline) model’s metrics and move the better model to production.
  • The development environment has to be available in the production stage to retrain and test the model.
  • Modularity has to be built into the code for better sustainability of the machine learning system.
  • The machine learning system has to be monitored at both the micro and macro levels.
  • Additional components include a feature store and metadata management. A feature store holds feature data collected throughout users’ interactions, which can be reused for both training and serving. Metadata management stores metadata about versions, time consumed, hyperparameters and statistics, making anomaly detection easier.
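A minimal sketch of what automated schema validation might look like (the schema, field names, and ranges are invented for illustration):

```python
# Incoming records are checked against an expected schema (field names,
# types, value ranges); only a conforming batch triggers the next phase.

EXPECTED_SCHEMA = {
    "age":    {"type": (int, float), "min": 0,   "max": 120},
    "income": {"type": (int, float), "min": 0.0, "max": 1e7},
}

def validate_batch(records):
    """Return a list of violations; an empty list means the batch passes."""
    violations = []
    for i, rec in enumerate(records):
        for field, spec in EXPECTED_SCHEMA.items():
            if field not in rec:
                violations.append((i, field, "missing"))
            elif not isinstance(rec[field], spec["type"]):
                violations.append((i, field, "wrong type"))
            elif not (spec["min"] <= rec[field] <= spec["max"]):
                violations.append((i, field, "out of range"))
    return violations

good = [{"age": 34, "income": 52000.0}]
bad  = [{"age": -3, "income": "n/a"}]   # out of range + wrong type
```

In a real pipeline, a non-empty violation list would halt the run and flag it for manual intervention rather than silently continuing.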

On the whole, in the automated version, rather than integrating a trained model, the whole machine learning pipeline (training + serving + monitoring) has to be pushed to production.

CI/CD Orchestration

Continuous integration and delivery enables reliable updates of the machine learning pipeline in production and ties the components together. Continuous integration of the source code (alongside A/B tests), and continuous delivery to the production environment after verifying the system’s compatibility with the production infrastructure, result in dependable feature engineering, model construction and validation.
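Inside such a pipeline, the promotion decision can be as simple as a metric-comparison gate: deploy the freshly trained candidate only if it beats the current production model by some margin. The threshold and metric names below are illustrative assumptions:

```python
# A promotion gate for a CI/CD ML pipeline: the candidate model is deployed
# only if it improves on the production baseline by a minimum margin.

MIN_IMPROVEMENT = 0.01   # require at least +1 point of accuracy to promote

def promotion_decision(baseline_metrics, candidate_metrics):
    """Decide whether to deploy the candidate or keep the baseline."""
    gain = candidate_metrics["accuracy"] - baseline_metrics["accuracy"]
    if gain >= MIN_IMPROVEMENT:
        return "deploy-candidate"
    return "keep-baseline"

decision = promotion_decision({"accuracy": 0.90}, {"accuracy": 0.93})
```

Requiring a minimum margin (rather than any improvement at all) guards against promoting models whose gains are within evaluation noise.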

Use Cases

  • Manual MLOps (MLOps Level 0) suits companies with static ML models whose modifications happen rarely.
  • Automated MLOps (MLOps Level 1) is useful when continuous training has to happen in the production environment: modifications arrive regularly and training has to happen online.
  • CI/CD MLOps (MLOps Level 2) is useful when updates arrive frequently and a large infrastructure underlies the machine learning system.

MLOps Tools

If you want to put MLOps into practice, tools such as TFX, Kubeflow Pipelines and MLflow are worth exploring.


A full-fledged machine learning pipeline isn’t just about training and evaluating a model; it involves deploying and retraining the model online as well. The deployment strategy you choose should depend on the costs you’re willing to incur, the expected lifespan of the product, the size of your firm, headcount, expertise, and the productivity you want to achieve. Only the MLOps strategy that best fits your company’s needs will generate fruitful results.

Thanks for reading!

References: Google Cloud