Machine Learning LifeCycle

Nehal Dave Desai
6 min read · Oct 14, 2023


Watching caterpillars turn into butterflies…

In my previous article, I gave an introduction to AI; it is worth reading first to build a better foundation for this article.

In this article we are going to explore an end-to-end Machine Learning model’s LifeCycle. This is part of a series where passing on the knowledge is what matters most. If you like this article, give it a 👏 or share it on social media.

Image Credit — Amazon.com

“Predicting the future isn’t magic, it’s Artificial Intelligence.” ~Dave Waters

This article outlines the components involved in an end-to-end Machine Learning (ML) LifeCycle as part of our learning objective.

So let’s keep exploring our journey together.

Table of Contents

  1. ML LifeCycle Components
  2. References

1. ML LifeCycle Components

The Machine Learning LifeCycle is a cyclical, iterative process. The components described below add clarity and structure to it.

1.1 Business Goal

There should be a clear Business Goal defined for an ML problem, one that can be measured against specific business objectives and success criteria. It should define how Machine Learning will solve the problem and provide business value.

Steps to follow:

1.1.1 Business considerations

Gather and understand business requirements/objectives such as follows:

  • Target Marketing
  • Risk & Fraud Management
  • Strategy Implementation and Change Management
  • Operational Efficiency
  • Improve Customer Experience
  • Manage Marketing Campaigns
  • Forecast Revenue or Loss
  • Workforce Management
  • Financial Modeling
  • Churn Management, and so on …

1.1.2 Involve all stakeholders early on, such as:

  • Business Managers, Product Managers, Risk Managers, Marketing Managers
  • Operation Managers, IT Managers, Data Managers, Change Managers
  • Business Analysts, Data Scientists, Consultants
  • IT Developers, Information Security Risk Managers
  • Independent Reviewers
  • Internal and External Auditors

1.1.3 Identify business questions:

  • Is understanding customer characteristics a primary need? Or making customers profitable?
  • Do we want to understand what drives sales? Or to increase sales?
  • Win back lost customers? Or reduce customer churn?
  • Reduce production cost? Or operating cost?
  • Identify customers likely to default?
  • Identify cross-sell or upsell opportunities?

1.1.4 Identify the most important, must-have target opportunities:

  • Evaluate whether a new business process is applicable.
  • Evaluate the business value that ML will deliver, using clearly defined business metrics.

1.2 Frame the Business Goal as an ML Problem:

Here the Business Goal identified above is described as a Machine Learning problem: what is observed vs. what is expected or what should be predicted (also known as the target variable/output/label). Deciding what to predict, along with the performance criteria, evaluation metrics, and optimization, is key to this step of the lifecycle.

1.2.1 Define the ML task to achieve the business goal based on business questions:

  1. Profile Analysis
  2. Segmentations
  3. Response Modeling
  4. Risk Modeling
  5. Activation
  6. Cross-Sell and Upsell
  7. Attrition/Churn Modeling
  8. Net Present Value (NPV)
  9. Customer Life-Time Value (CLTV)

1.2.2 Review established work in the same domain (if applicable).

  1. Design Proof of Concepts where no existing work is applicable (for an unknown domain).

1.2.3 Identify optimization goal

  1. Identify key performance metrics for business such as leveraging new business acquisition, anomaly detection rate and so on.
  2. Review Data Requirements
  3. Evaluate cost and performance optimization
  4. Evaluate the cost of data acquisition, model training, prediction, and wrong inferences.
  5. Evaluate against external data sources and model performance.
  6. Production Evaluation
  7. Monitor and mitigate ML generated errors.

Steps to follow for this phase:

  1. Establish success criteria for the problem or project.
  2. Identify a quantifiable performance metric for the project, for example error rate or accuracy score.
  3. Establish the correlation between business objectives and technical metrics, relating the technical outcome to the business outcome (for example, model accuracy score vs. decline in anomaly rate for a business).
  4. Define ML questions for inputs (features), desired targets (outputs), performance metrics that require optimization.
  5. Identify if ML is the right solution.
  6. Establish tactics to acquire data and data annotation objectives.
  7. Establish a Baseline model and iterate on it with improvements in performance, error rate, optimization techniques and complexity of model.
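Step 7 above can be sketched as follows, assuming scikit-learn and a synthetic dataset as illustrative stand-ins for real business data: a trivial majority-class baseline is established first, then a simple learned model is evaluated against it on the same metric.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for real business data
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Baseline: always predicts the majority class
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))

# First iteration: a simple learned model that should beat the baseline
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
model_acc = accuracy_score(y_test, model.predict(X_test))

print(f"baseline={baseline_acc:.2f}, model={model_acc:.2f}")
```

Each later iteration (different features, models, or tuning) is then compared against this baseline on the agreed metric rather than judged in isolation.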

1.3 Data Processing

Image Credit — Amazon.com

This step is composed of 3 parts:

  1. Data Gathering/Collection — as the name suggests, data, if not already available, is collected via external and/or internal sources.
  2. Data Preprocessing — missing values are imputed with typical values (such as the mean, median, or mode) and outliers are capped, removed, or otherwise treated so they do not skew the model’s predictions.
  3. Data Transformation — also known as Feature Engineering. Some of the features (also known as inputs) are transformed or combined with other features so that the data is normalized and the model receives consistent input values.
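The three steps above can be sketched with pandas and scikit-learn (an illustrative choice of tooling, not one prescribed by the lifecycle itself): missing values are imputed with the column median, then features are standardized to a common scale.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# 1. Data gathering: a tiny in-memory stand-in for collected data
df = pd.DataFrame({"age": [25, 32, np.nan, 51],
                   "income": [40000, 52000, 61000, np.nan]})

# 2. Data preprocessing: impute missing values with the column median
imputed = pd.DataFrame(
    SimpleImputer(strategy="median").fit_transform(df), columns=df.columns
)

# 3. Data transformation: standardize features to zero mean, unit variance
scaled = pd.DataFrame(
    StandardScaler().fit_transform(imputed), columns=imputed.columns
)

print(scaled.mean().round(6).tolist())  # each column mean is ≈ 0 after scaling
```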

1.4 Model Development

In this phase, the model is built, trained, tuned, and evaluated. A CI/CD pipeline helps automate building, training, and deployment to staging and production environments.

Following diagram explains the training and tuning process at this phase:

Image credit — Amazon.com
  • Model Parallel means splitting a model across multiple instances or nodes.
  • Data Parallel means splitting data into mini-batches.
  • Debugging covers system bottlenecks, overfitting, saturation of activation functions, and exploding or vanishing gradients.
  • Validation metrics depend on the business problem outlined earlier, based on which the ML model is selected, and vary for each problem set.
  • Hyperparameter tuning refers to tuning variables that are external to the model and set before training (such as learning rate or tree depth), as opposed to parameters, which are internal to the model and learned from the data.
  • Model Evaluation is based on the performance metrics and success criteria defined during the ML problem framing phase.
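As a small sketch of hyperparameter tuning, scikit-learn’s GridSearchCV (an illustrative tool choice, not mandated by the lifecycle) can search candidate values that are fixed before training, such as tree depth, using cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for real training data
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Hyperparameters: set before training, unlike learned parameters
param_grid = {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5]}

# 3-fold cross-validated search over all candidate combinations
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)
```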

1.5 Model Deployment

Deployment strategies are important to make sure the user experience remains seamless and, at the same time, improves. The deployment should also include disaster management plans such as a fallback strategy, constant monitoring, anomaly detection, and loss minimization for the model.

Image Credit — Amazon.com

At this phase, the model is evaluated and after validation, it is deployed to the production environment where the predictions take place in the real world.

The diagrams above show a model deployed using a CI/CD pipeline.

Following are the testing and deployment strategies used to reduce downtime and risk during model updates:

  1. Blue/Green deployment — two identical infrastructures exist: Blue hosts the current live version and Green hosts the new version under test. Once testing on the Green environment passes, live traffic is switched from Blue to Green.
  2. Canary deployment — new features are released to a small subset of users while everyone else continues using the previous version.
  3. A/B testing — one set of users is directed to the new environment and another set to the old one, and their outcomes are compared. The timescale here is typically longer than for a canary release.
  4. Shadow deployment — input data is run through both systems; the old version continues serving production while the new version is tested on the same traffic.
  5. Inference pipeline — automates capturing inferences in real time or via batch processing on prepared data, including prediction and post-processing.
  6. Scheduler pipeline — initiates retraining on a schedule defined by the business.
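The canary idea can be sketched with a hypothetical traffic router (the user-id hashing scheme and the 10% fraction are assumptions for illustration, not part of any specific deployment tool): a deterministic hash of the user id sends a fixed share of traffic to the new model version.

```python
import hashlib

CANARY_FRACTION = 0.1  # assumed rollout policy: 10% of users see the new model

def route(user_id: str) -> str:
    """Deterministically route a user to the 'canary' or 'stable' model."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < CANARY_FRACTION * 100 else "stable"

# The same user always lands in the same bucket, so their experience is stable
assignments = [route(f"user-{i}") for i in range(1000)]
canary_share = assignments.count("canary") / len(assignments)
print(f"canary share ≈ {canary_share:.2f}")
```

Because routing is deterministic, rolling back is just lowering the fraction to zero; no user flips between versions mid-session.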

1.6 Model Monitoring

In this phase, the model is monitored and maintained for desired performance via early detection and mitigation. Here monitoring also means explaining how sound the model is and whether the predictions can be trusted.

The issues detected during the monitoring phase include data quality, model quality, bias drift, and feature attribution drift. When drift is identified, an alerting mechanism triggers a model-update pipeline to retrain the model.
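One way to detect data drift (an illustrative approach; the lifecycle does not prescribe a specific test) is a two-sample Kolmogorov–Smirnov test comparing a feature’s training-time distribution with its live distribution, alerting when the difference is statistically significant.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=2000)  # distribution at training time
live_feature = rng.normal(loc=0.8, scale=1.0, size=2000)   # shifted live distribution

# Two-sample KS test: a small p-value means the distributions differ
stat, p_value = ks_2samp(train_feature, live_feature)

DRIFT_THRESHOLD = 0.01  # assumed alerting threshold for illustration
drifted = p_value < DRIFT_THRESHOLD
print(f"KS statistic={stat:.3f}, drift detected: {drifted}")
```

A `drifted` flag like this is what would trigger the retraining pipeline mentioned above.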


2. References

  1. [Amazon] — Machine Learning LifeCycle Phases.
  2. [DataCamp] — Machine Learning Life Cycle Explained.
  3. [Medium] — Parameters and Hyperparameters.
  4. [Microsoft] — Data Science Life Cycle.
  5. [rstudio-pubs-static] — 7 Steps to Predictive Machine Learning.
