MLOps: The 4-Pipeline Batch Architecture

mic · Published in MLOps Republic · Jan 18, 2024

Hi there!

Every week I share some content about MLOps. I hope you enjoy it!

This is a very short introduction to the batch inference ML system architecture. To keep things simple, I’ll assume the data sources are also batch (in a future post I will evolve this architecture to handle real-time data sources).

The diagram below shows the batch architecture.

Simple diagram of the ML batch architecture.

In this architecture I highlight two things:

  • 4 pipelines and
  • 2 key elements

Pipelines

There are 4 main pipelines that make up this architecture, namely the feature engineering pipeline, the training/retraining pipeline, the inference pipeline, and the ML monitoring pipeline (a minimal code sketch of all four follows the list below).

  • The feature engineering pipeline ingests raw data from one or multiple data sources and transforms it into valuable features for the model.
  • The training/retraining pipeline’s objective is to take the features from the feature engineering pipeline and train a machine learning model. The output of this pipeline is usually a model artifact and its configuration (hyperparameters, etc.). The training and retraining pipelines are very similar (sometimes they are even referred to as the same pipeline). The difference lies in how they are used: the training pipeline is usually run manually during development, to experiment with different model architectures and configurations, while the retraining pipeline runs automatically, taking the model produced by the training pipeline and retraining it on newer, fresher data.
  • The inference pipeline takes the features and the model artifact as input, produces the predictions (in batch mode), and saves them into storage from which different users and/or apps can consume them.
  • The ML monitoring pipeline continuously oversees the performance and/or drift of the model. Its outputs are usually performance and drift metrics, which can be displayed in a dashboard, fire alerts, and also trigger the retraining pipeline if the model needs to be updated.
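To make this concrete, here is a minimal sketch of the four pipelines in plain Python with pandas and scikit-learn. All paths, column names, and the toy drift check are my own assumptions for illustration; a real system would use proper feature store and registry tooling.

```python
# Illustrative sketch of the 4 batch pipelines. Paths and columns are made up.
import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

FEATURES = ["amount_log", "is_weekend"]

def feature_engineering_pipeline(raw: pd.DataFrame) -> None:
    # Ingest raw data and transform it into model-ready features.
    feats = pd.DataFrame({
        "amount_log": np.log1p(raw["amount"]),
        "is_weekend": (pd.to_datetime(raw["ts"]).dt.dayofweek >= 5).astype(int),
        "label": raw["label"],
    })
    feats.to_parquet("feature_store.parquet")  # stands in for a feature store

def training_pipeline() -> None:
    # Train (or retrain) on the latest features and persist the artifact.
    feats = pd.read_parquet("feature_store.parquet")
    model = LogisticRegression(max_iter=1000).fit(feats[FEATURES], feats["label"])
    joblib.dump(model, "model.joblib")  # stands in for a model registry

def inference_pipeline() -> None:
    # Score the latest features in batch and save predictions for consumers.
    feats = pd.read_parquet("feature_store.parquet")
    model = joblib.load("model.joblib")
    pd.DataFrame({"prediction": model.predict(feats[FEATURES])}).to_parquet(
        "predictions.parquet"
    )

def monitoring_pipeline(training_mean: float, threshold: float = 0.2) -> None:
    # Toy drift check: if a feature's mean shifted too much, trigger retraining.
    feats = pd.read_parquet("feature_store.parquet")
    if abs(feats["amount_log"].mean() - training_mean) > threshold:
        training_pipeline()
```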

Now, the question is: How do these 4 pipelines work together?

Well, the answer is the 2 key elements, namely the feature store and the ML platform.

2 key elements

The 2 elements that glue together the 4 pipelines are the feature store and the ML platform.

  • The feature store is the central place where the model’s features are stored and consumed, and it decouples the 4 pipelines in terms of data. That is, the feature engineering pipeline takes data from the data sources and stores features in the feature store; this can happen every hour, for instance. Then, the inference pipeline takes features from the feature store (those left by the feature engineering pipeline) and produces the predictions; this can happen once a day, for example. Similarly, the rest of the pipelines make use of the features stored in the feature store, which decouples them from the data perspective (see the first sketch after this list).
  • The ML platform is where (simplifying) both the model artifacts and their associated metadata (configurations, charts, etc.) are stored and versioned. It is composed mainly of a model registry (where the model artifacts are stored) and an experiment tracking engine (where the different experiment results are stored so that we can compare and track them easily). The ML platform also decouples the pipelines. For instance, the inference pipeline (which may run once a day) takes the latest model tagged as “production” from the model registry and generates the predictions. Similarly, the retraining pipeline (which could be executed only once a week, for instance) takes the current model, retrains it, and stores the new retrained artifact (and associated metadata) in the model registry (and experiment tracking). Eventually, the retraining pipeline can also promote the newly retrained model to “production”, so that the next time the inference pipeline runs, it picks up the retrained model. This is how the pipelines are decoupled from the model perspective (see the second sketch after this list).
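Here is a rough sketch of the data-level decoupling, treating the feature store as nothing more than a timestamped parquet location. The path layout and the hourly/daily cadences are assumptions; a real feature store adds much more (point-in-time joins, online serving, etc.).

```python
import glob
from datetime import datetime, timezone

import pandas as pd

FEATURE_STORE = "feature_store"  # hypothetical location

def write_features(feats: pd.DataFrame) -> None:
    # Feature engineering pipeline, running e.g. hourly: append a new partition.
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H")
    feats.to_parquet(f"{FEATURE_STORE}/ts={ts}.parquet")

def read_latest_features() -> pd.DataFrame:
    # Inference pipeline, running e.g. daily: read the freshest partition.
    # Neither pipeline knows the other's schedule or internals.
    paths = sorted(glob.glob(f"{FEATURE_STORE}/ts=*.parquet"))
    return pd.read_parquet(paths[-1])
```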
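And here is a sketch of the model-level decoupling, using MLflow as an example ML platform. The model name, the logged metric, and the stage-based promotion are my assumptions; the architecture itself doesn’t prescribe a specific tool.

```python
import mlflow
import mlflow.sklearn
from mlflow import MlflowClient
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "demand-forecaster"  # hypothetical registered model name

def retraining_pipeline(X, y) -> None:
    # Retrain, log the artifact + metadata, and promote the new version.
    with mlflow.start_run():
        model = LogisticRegression(max_iter=1000).fit(X, y)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        mlflow.sklearn.log_model(model, "model", registered_model_name=MODEL_NAME)
    client = MlflowClient()
    version = client.get_latest_versions(MODEL_NAME, stages=["None"])[0].version
    client.transition_model_version_stage(MODEL_NAME, version, stage="Production")

def inference_pipeline(X):
    # Always load whatever model is currently tagged "Production".
    model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/Production")
    return model.predict(X)
```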

I’m preparing a more thorough, extended version of this post so you can dig deeper into the batch inference ML system architecture. I will release it soon. Stay tuned!

Let me know if you liked it and leave a comment; feedback is welcome!


I write about Python and MLOps. Principal ML Engineer @ADP.