MLOps in practice: Automated deployment of an ML solution to multiple environments

Jagadish Kakhandaki
Trusted Data Science @ Haleon
May 20, 2024

Introduction:

In recent years, Machine Learning (ML) has shifted from almost purely academic applications to becoming an integral part of decision-making processes at the world's top-performing companies. With this shift to industry applications came the need for more reliable deployment and operation of machine learning models, creating a new discipline: MLOps. As a senior Machine Learning Engineer at Haleon, I am passionate about creating robust ML pipelines that ensure smooth, efficient deployment and reliable running of models. I helped develop and deploy some of the key ML applications at Haleon, and here I will share how we deliver Machine Learning projects for our users.

By incorporating software engineering practices, MLOps streamlines the transition from developing ML models (data science) to their successful deployment and operation in production (software engineering). This approach improves efficiency and ensures smooth model integration into real-world applications.

At Haleon, we solve problems using AI/ML, including predictive analytics, image recognition, generative AI and much more. All these applications are built as global data products that consistently help Haleon make better decisions. This is why we need to build infrastructure that reliably supports these applications and ensures their scalability. To achieve this, we have built a multi-environment MLOps setup with best practices from software engineering, such as version control, continuous integration and deployment (CI/CD), automated testing, monitoring, and model versioning, to cultivate clean, maintainable, and reproducible data science code and machine learning models.

Architecture overview:

The whole MLOps architecture is designed in the Azure cloud, and we follow the deploy-code workflow: code, rather than trained models, is promoted from one environment to the next.

MLOps architecture for multi-environment deployment.

Below are the key components:

  1. Code version control: GitHub is used for code version control. Branches are created and protected for each environment deployment.
  2. Azure Data Lake Gen2: Data from multiple sources is processed through the data engineering pipeline and published to Azure Data Lake Gen2 as an optimised layer for ML consumption. Outputs from MLOps pipelines are also pushed back to the data lake for downstream applications to consume.
  3. Azure ML studio: Compute instances from Azure ML studio are used by data scientists for development work; MLOps pipelines are then run in ML studio and deployed through CI/CD pipelines. MLflow is used for experiment tracking, logging and model registry within Azure ML.
  4. Azure Container Registry: Docker images containing the source code and the environment needed to run it are built and pushed to ACR (Azure Container Registry) as part of the CI pipeline. This image from ACR (tagged with the code commit version) is later used to create the Azure ML environment in which MLOps pipelines run.
  5. CI/CD pipeline: GitHub Actions (GHA) is used for running CI/CD pipelines. The CI pipeline runs unit tests and integration tests, builds the Docker image and pushes it to ACR. The CD pipeline creates MLOps pipelines using the Azure ML SDK, pushes them and runs them in Azure ML studio, as sketched below.
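
As an illustration, here is a minimal sketch of the CD step using the Azure ML Python SDK (v2). The workspace identifiers, image name, commit tag, compute cluster and script paths are hypothetical placeholders, not our actual project setup.

```python
from azure.ai.ml import MLClient, command
from azure.ai.ml.entities import Environment
from azure.identity import DefaultAzureCredential

# Connect to the Azure ML workspace (placeholder identifiers).
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Azure ML environment built from the CI-produced Docker image in ACR,
# tagged with the commit SHA so every run is traceable to a code version.
env = Environment(
    name="mlops-runtime",
    image="myregistry.azurecr.io/mlops-image:<commit-sha>",  # hypothetical image
)

# A single training step for brevity; real pipelines chain training,
# evaluation and inference steps in the same way.
train_job = command(
    code="./src",
    command="python train.py --config configs/train.yaml",
    environment=env,
    compute="cpu-cluster",  # hypothetical auto-scaling cluster
    experiment_name="training-pipeline",
)

# Submit the job to Azure ML studio and print a link to monitor it.
returned_job = ml_client.jobs.create_or_update(train_job)
print(returned_job.studio_url)
```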

Different environments and their purpose:

Devtest: In the devtest environment, data scientists and ML engineers collaborate on all pipelines in an ML project, committing their changes to source control. Once a PR is created from a feature branch to the develop branch, the CI pipeline runs. ML engineers help configure the environment, compute resources and CI pipelines. Both ML engineers and data scientists write unit tests, which run as part of the CI pipeline.

Once the PR is merged to the develop branch, all the MLOps pipelines (training, evaluation and inference pipelines) run in the devtest environment through the CD pipeline. A successful run makes the develop branch ready for the UAT environment.

UAT: The UAT environment facilitates the code transition from devtest to production. Here, MLOps pipelines run with production data, and the results are made available for users to test. Once users are satisfied with the results, the code is ready for deployment to production.

Production: The production environment is typically managed by a select set of ML engineers and is where ML pipelines directly serve the business or application. In production, model artifacts, pipelines and predictions are registered as assets in the governance tool, and predictions are published to downstream tables and/or applications. The entire process is monitored to avoid performance degradation and instability.
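
For example, registering a trained model with MLflow (which, as mentioned above, backs the model registry in Azure ML) can look roughly like this. The experiment name, metric, toy model and registered model name below are illustrative placeholders, not our production assets.

```python
import mlflow
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data and model purely for illustration.
X = np.arange(10).reshape(-1, 1)
y = 2 * X.ravel() + 1
model = LinearRegression().fit(X, y)

mlflow.set_experiment("training-pipeline")  # hypothetical experiment name
with mlflow.start_run() as run:
    # Log metrics and the model artifact against this run.
    mlflow.log_metric("rmse", 0.0)
    mlflow.sklearn.log_model(model, artifact_path="model")

# Register the logged model so downstream pipelines can load it by name/version.
mlflow.register_model(f"runs:/{run.info.run_id}/model", "demand-forecast-model")
```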

Best practices used for reducing tech debt and improving scalability

Code standards and pre-commit checks: To enhance code quality and readability, a number of pre-commit checks, such as flake8, pydocstyle, black, isort and mypy, are added to the repo. These checks run as part of the CI pipeline, making sure that any committed code follows our defined standards.

Unit tests and integration tests: Unit tests ensure individual parts of your ML code (functions, classes) work as expected. Integration tests verify how these parts work together as a whole. Both are crucial for catching errors early, improving ML project reliability and maintainability.
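
As a small illustration, a pytest unit test for a hypothetical preprocessing helper might look like this:

```python
import pandas as pd
import pytest

# Hypothetical helper under test: drops rows with a missing target value.
def drop_missing_target(df: pd.DataFrame, target: str) -> pd.DataFrame:
    return df.dropna(subset=[target]).reset_index(drop=True)

def test_drop_missing_target_removes_nan_rows():
    df = pd.DataFrame({"feature": [1, 2, 3], "sales": [10.0, None, 30.0]})
    result = drop_missing_target(df, target="sales")
    assert len(result) == 2
    assert result["sales"].notna().all()

def test_drop_missing_target_unknown_column_raises():
    df = pd.DataFrame({"feature": [1, 2]})
    with pytest.raises(KeyError):
        drop_missing_target(df, target="sales")
```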

PR review process: It provides a second pair of eyes to catch errors in code or analysis, preventing bugs from slipping into production. Reviewers can also suggest best practices and ensure the changes align with project goals, fostering knowledge sharing within the team. A clear and concise pull request description outlining the changes and their rationale is essential to maximize the effectiveness of peer reviews.

We require a minimum of two peer-review approvals, and a passing CI pipeline is a prerequisite for merging a PR.

YAML configuration files: These files centralise configuration details, keeping them separate from code. This simplifies updates and experiments, as you modify values in the YAML file without touching the code itself. This promotes better code organisation and maintainability. YAML’s simplicity also makes it easy to version control alongside your code, enabling easy tracking of configuration changes.

In our projects, we use these files to hold both infrastructure-defined parameters, such as the storage account, and infrastructure-independent values, such as model hyperparameters.
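
A minimal sketch of this pattern, with a hypothetical config layout and file paths:

```python
import yaml

# Hypothetical per-environment config file, e.g. configs/devtest.yaml:
#
#   infra:
#     storage_account: mydevstorageaccount
#   model:
#     n_estimators: 200
#     max_depth: 8

def load_config(path: str) -> dict:
    """Load a YAML config file into a plain dict."""
    with open(path) as f:
        return yaml.safe_load(f)

config = load_config("configs/devtest.yaml")
storage_account = config["infra"]["storage_account"]
hyperparams = config["model"]  # passed to the training code unchanged
```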

Logging and monitoring: Logging captures key events and metrics during your data science project, while monitoring actively tracks these logs and performance indicators to identify issues and ensure smooth operation.

We use the standard Python logging library to handle logging. Using the OpenCensus Python SDK, logs are exported to Azure Monitor, and monitoring of key model metrics is enabled through Azure Application Insights.
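
In outline, wiring this up looks roughly as follows; the connection string is a placeholder for your own Application Insights resource, and the logged metric values are illustrative.

```python
import logging
from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Export log records to Azure Monitor / Application Insights.
logger.addHandler(
    AzureLogHandler(connection_string="InstrumentationKey=<your-key>")
)

# custom_dimensions appear as queryable fields in Application Insights,
# which is handy for tracking model metrics alongside regular logs.
logger.info(
    "training run completed",
    extra={"custom_dimensions": {"rmse": 0.42, "model_version": "3"}},
)
```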

Cloud cost management: Auto-scaling clusters are used for running MLOps pipelines. Personal dev compute instances provisioned for the data science team are enabled with auto-stop based on inactivity.
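
With the Azure ML SDK, both settings can be expressed along these lines; the names, VM size and timeouts are placeholders rather than our actual configuration.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute, ComputeInstance
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "<workspace>"
)

# Auto-scaling cluster: scales to zero when idle, so compute is only
# paid for while pipelines are actually running.
cluster = AmlCompute(
    name="cpu-cluster",
    size="Standard_DS3_v2",
    min_instances=0,
    max_instances=4,
    idle_time_before_scale_down=300,  # seconds of inactivity before scale-down
)
ml_client.compute.begin_create_or_update(cluster)

# Personal dev compute with automatic shutdown after inactivity.
instance = ComputeInstance(
    name="ds-dev-box",
    size="Standard_DS3_v2",
    idle_time_before_shutdown_minutes=60,
)
ml_client.compute.begin_create_or_update(instance)
```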

Vulnerability scan: As we use open-source libraries in our code base, it is essential to scan the code for vulnerabilities to ensure security. Any vulnerabilities found are mitigated according to our SLAs.

Following a similar approach, we also scan our Docker images for vulnerabilities in the underlying operating system.

Documentation: Clear documentation is vital for data science projects. It explains the project’s purpose, methods, and results, enabling others to understand the work, reproduce findings, and ensure the project’s long-term value.

We document what the code does, how it works and why it is written that way. We also document the developer setup and track known issues. All of these activities help our teams manage onboarding as well as understand and maintain the applications, ultimately saving time and reducing errors.

Infrastructure documentation is also created to capture details about the project’s technical setup. It outlines environment configuration, access control and the deployment process.

Conclusion

This architecture serves as a reusable blueprint for deploying machine learning projects. It incorporates best practices, streamlining the development and deployment process for bringing your ML solution to real-world business applications. Furthermore, this architecture acts as a foundational template that can be adapted for various use cases, such as streaming inference or deployments on different cloud providers. Additionally, you can seamlessly integrate advanced features like a feature store to enhance the functionality of your ML pipeline.
