Characteristics of ML CI/CD Pipelines

Manav
Published in Aruva.io Tech
2 min read · Apr 3, 2021

This post walks through the components involved in building and maintaining CI/CD pipelines for machine learning.

Photo by Arnold Francisca on Unsplash

A typical CI/CD pipeline is composed of:

[a] Build phase / pipeline, which results in the creation of the ML artifact

1. Build the artifact
2. Persist the artifact
3. Sanity check / Smoke testing
4. Generate explainability report

[b] Deploy to test environment

1. Manual validation of the artifact
2. Execution of performance tests (computational, validation, etc.)

[c] Deploy to Production environment

1. Canary or blue-green deployment
2. Full deployment
3. Release deployment
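
The four build-phase steps in [a] can be sketched as a small script. This is only an illustration under stated assumptions: the "model" is a hypothetical mean predictor stored as JSON, and the function names (`build`, `persist`, `smoke_test`, `write_report`, `run_build_phase`) are invented for this example rather than taken from any particular tool.

```python
import json
from pathlib import Path


def build(targets):
    # Step 1: build (train) the artifact. Hypothetical model: the training mean.
    return {"kind": "mean_model", "mean": sum(targets) / len(targets)}


def persist(model, out_dir):
    # Step 2: persist the artifact so later stages use exactly what was built.
    path = Path(out_dir) / "model.json"
    path.write_text(json.dumps(model))
    return path


def predict(model, rows):
    # The mean model ignores its input and returns the stored mean per row.
    return [model["mean"] for _ in rows]


def smoke_test(path):
    # Step 3: reload from disk and check that predictions come back at all.
    model = json.loads(path.read_text())
    predictions = predict(model, [[0.0], [1.0]])
    return len(predictions) == 2 and all(isinstance(p, float) for p in predictions)


def write_report(path, out_dir):
    # Step 4: placeholder report; a real one would hold feature attributions.
    model = json.loads(path.read_text())
    report_path = Path(out_dir) / "report.json"
    report_path.write_text(json.dumps({"model": model["kind"], "mean": model["mean"]}))
    return report_path


def run_build_phase(targets, out_dir):
    # Run the steps in order and stop before deployment if the smoke test fails.
    path = persist(build(targets), out_dir)
    if not smoke_test(path):
        raise RuntimeError("smoke test failed; stop before deploying")
    return path, write_report(path, out_dir)
```

The smoke test reloads the persisted file instead of reusing the in-memory object, so it exercises the same artifact that the later stages would deploy.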

ML artifact

An ML artifact comprises the following:

  • Model code and pre-processing logic
  • Hyperparameters and configurations
  • Trained runnable model
  • Environment specification (libraries, versions, environment variables, etc.)
  • Documentation
  • Code and data for validation
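
One way to keep these parts together is a manifest that travels with the model. The sketch below is an assumption about how such a manifest could look; the class name `ArtifactManifest` and its field names are hypothetical and simply mirror the list above.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ArtifactManifest:
    model_code_ref: str      # e.g. a commit hash for model and pre-processing code
    hyperparameters: dict    # hyperparameters and configuration values
    model_path: str          # location of the trained, runnable model
    environment: dict        # libraries, versions, environment variables
    documentation: str       # path or link to the documentation
    validation_refs: list = field(default_factory=list)  # validation code and data

    def missing_parts(self):
        # Report every component that is empty, so an incomplete artifact is
        # rejected in the build phase instead of failing after deployment.
        return [name for name, value in asdict(self).items() if not value]
```

A build step could refuse to publish any artifact whose `missing_parts()` list is non-empty.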

Considerations for Test Deployment

When deploying a model to a test environment, make sure the test cases are complete. They should not only cover every validation scenario but also pinpoint the source of any failure.
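
As a minimal sketch of tests that identify the source of a failure, each check below carries a name, and the runner reports which named check failed and why. The `run_checks` helper and the check names in the usage note are hypothetical.

```python
def run_checks(checks):
    """Run (name, callable) pairs and return a list of (name, reason) failures."""
    failures = []
    for name, check in checks:
        try:
            if not check():
                failures.append((name, "returned False"))
        except Exception as exc:
            # Record the exception type so the failing stage is easy to locate.
            failures.append((name, f"{type(exc).__name__}: {exc}"))
    return failures
```

For example, passing checks named `schema`, `preprocessing`, and `accuracy` returns only the entries that failed, each labeled with the stage it belongs to.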

Additionally, two deployment modes are relevant:
i. Batch scoring mode, where entire data sets are processed, for example as daily batches
ii. Real-time scoring mode, where the data set is a limited, curated subset with coverage across all required validation scenarios
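
The two modes differ mainly in how the model is called. In the sketch below, `score_row` is a hypothetical stand-in for a real model (it just sums the features), and the function names and `chunk_size` parameter are assumptions made for illustration.

```python
def score_row(features):
    # Hypothetical model: the score is the sum of the features.
    return sum(features)


def score_batch(rows, chunk_size=2):
    # Batch mode: walk the whole data set in chunks, as a daily job would.
    results = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        results.extend(score_row(row) for row in chunk)
    return results


def score_realtime(request):
    # Real-time mode: score one request at a time and echo its id back.
    return {"id": request["id"], "score": score_row(request["features"])}
```

Both paths call the same `score_row`, so a test run in either mode validates the same model logic.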

Considerations for Production Deployments and Release

When deploying to production, we need to consider a few basic permutations:

  1. Single model, Single version deployed on Single server
  2. Single model, Single version deployed on Multiple servers
  3. Single model, Multiple versions deployed on Single server
  4. Single model, Multiple versions deployed on Multiple servers
  5. Multiple models, Multiple versions deployed on Multiple servers
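
To make these permutations concrete, a deployment can be described as a mapping from each server to the (model, version) pairs it hosts. The helper below is a hypothetical sketch that classifies such a mapping using the same wording as the list above.

```python
def describe_topology(deployments):
    """deployments maps a server name to the (model, version) pairs it hosts."""
    pairs = {pair for hosted in deployments.values() for pair in hosted}
    if not pairs:
        raise ValueError("no models deployed")
    models = {model for model, _ in pairs}
    model_part = "Single model" if len(models) == 1 else "Multiple models"
    # More distinct (model, version) pairs than models means that at least
    # one model is running in more than one version.
    version_part = "Single version" if len(pairs) == len(models) else "Multiple versions"
    server_part = "Single server" if len(deployments) == 1 else "Multiple servers"
    return f"{model_part}, {version_part}, {server_part}"
```

For instance, one server hosting only `("churn", "v1")` is classified as "Single model, Single version, Single server".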

Additionally, the deployment methodology could be canary or blue-green, depending on the nature of the model and the application encapsulating it.
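
The difference between the two methodologies can be sketched in a few lines. This is an illustration under assumptions, not a production router: `route_canary` sends a fixed fraction of request ids to the new version using a hash, while the hypothetical `BlueGreenSwitch` flips all traffic between two environments at once.

```python
import hashlib


def route_canary(request_id, canary_fraction):
    # Hash the request id into [0, 1) so the same id always routes the same way.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "canary" if bucket < canary_fraction else "stable"


class BlueGreenSwitch:
    """Two identical environments; exactly one is live at a time."""

    def __init__(self):
        self.live = "blue"

    def switch(self):
        # Cut all traffic over in one step; calling it again rolls back.
        self.live = "green" if self.live == "blue" else "blue"
        return self.live
```

With a canary, raising `canary_fraction` step by step widens exposure gradually, whereas the blue-green switch moves every request at the same moment.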

Consider the above factors in your enterprise MLOps practice and your machine learning CI/CD pipelines.
