MLOps End-to-End Machine Learning Pipeline with CI/CD

Senthil E
Analytics Vidhya
13 min read · Jul 5, 2021


The main objective of this project is to automate the whole machine learning app deployment process. To implement it, you need some understanding of TensorFlow and basic knowledge of Docker and Kubernetes. If you want to know more about Docker, Kubernetes, and Cloud Build, refer to the links in the References. Let's dive in.

Contents:

  1. About the Dataset
  2. Model Development Steps
  3. Model Deployment and CICD Steps

What is MLOps?

According to the Google documentation:

👉🏻 MLOps is a methodology for ML engineering that unifies ML system development (the ML element) with ML system operations (the Ops element). It advocates formalizing and (when beneficial) automating critical steps of ML system construction. MLOps provides a set of standardized processes and technology capabilities for building, deploying, and operationalizing ML systems rapidly and reliably.

MLOps supports ML development and deployment in the way that DevOps and DataOps support application engineering and data engineering (analytics). The difference is that when you deploy a web service, you care about resilience, queries per second, load balancing, and so on. When you deploy an ML model, you also need to worry about changes in the data, changes in the model, users trying to game the system, and so on. This is what MLOps is about.

1. About the Dataset:

This dataset was initially published by analyticsvidhya.com and is also available on Kaggle.

This dataset contains around 25k images of size 150x150, distributed across 6 categories:
{'buildings' -> 0, 'forest' -> 1, 'glacier' -> 2, 'mountain' -> 3, 'sea' -> 4, 'street' -> 5}

The Train, Test, and Prediction data are split into separate zip files. There are around 14k images in Train, 3k in Test, and 7k in Prediction.

2. Model Development Steps

Image credit — TensorFlow documentation

I am not going into detail on model development. I am using TensorFlow for this image classification problem; the objective here is to build the model and then automate the deployment process. A rough sketch of the model code follows the list below.

  • Unstructured data
  • Image classification (multiclass)
  • Use the TensorFlow library
  • Load the dataset into a dataframe
  • Explore the dataset
  • Prepare the data
  • Data augmentation using ImageDataGenerator
  • CNN classifier
  • Multiclass classification: softmax output
  • Loss: categorical_crossentropy
  • Optimizer: Adam
Image by the author
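The notebook itself is not reproduced here. As a rough illustration only, a minimal version of the setup described above could look like the following; the folder path, augmentation parameters, and layer sizes are illustrative assumptions, not the exact choices used in the notebook.

# Minimal sketch of the training setup described above (illustrative hyperparameters)
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# data augmentation via ImageDataGenerator
train_gen = ImageDataGenerator(rescale=1./255, rotation_range=20,
                               horizontal_flip=True, zoom_range=0.2)
train_data = train_gen.flow_from_directory('train/',              # assumed path: Train images, one folder per class
                                           target_size=(150, 150),
                                           batch_size=32,
                                           class_mode='categorical')

# simple CNN classifier with a 6-way softmax head
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(6, activation='softmax'),   # 6 scene categories
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_data, epochs=10)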

Remember to save the model and test it again by loading it back.

model.save('/content/drive/MyDrive/Files/image_intel/models/', save_format='tf')

and load it back:

model_loaded = tf.keras.models.load_model('/content/drive/MyDrive/Files/image_intel/models/models/')

The saved model folder will look like this:

Image by the author

3. Model Deployment and CICD Steps

Below are the steps we are going to follow to deploy the model on GCP.

What is CI/CD?

According to Google documentation

Continuous Integration (CI) and Continuous Delivery (CD) enable teams to adopt automation in building, testing, and deploying software. A CI/CD pipeline builds and deploys an application to GKE using Container Registry and Cloud Build.

We will be doing the following steps.

  1. GitHub ready: create all the files needed for the automation and keep the GitHub repository ready.
  2. Cloud Build: the build will be done using Google Cloud Build.
  3. Testing: no automated testing in this pipeline.
  4. Deploy: we will deploy to GKE with 2 replicas.

GitHub, Cloud Build, and deployment to GKE:

The detailed steps are:

  1. Create a Streamlit app (Python file).
  2. Create a Dockerfile.
  3. Create the requirements file.
  4. Create the Kubernetes deployment YAML file.
  5. Create the Kubernetes service YAML file.
  6. Create the Cloud Build YAML file.
  7. Create a GitHub repository in GitHub Desktop.
  8. Add and organize the files in GitHub Desktop.
  9. Push the files from the desktop to GitHub.
  10. Link Cloud Build to GitHub and to the GCP project.
  11. Create a trigger in GCP, based on changes to the GitHub code.
  12. Now the build is triggered and the app is deployed on the Kubernetes Engine.

Let's go into the details of the above steps.

Now you have the CNN image classification model saved and ready to be deployed. We are going to deploy the model on Google Kubernetes Engine.

According to the Google Cloud documentation: Google Kubernetes Engine (GKE) provides a managed environment for deploying, managing, and scaling your containerized applications using Google infrastructure. The GKE environment consists of multiple machines (specifically, Compute Engine instances) grouped together to form a cluster.

Some of the advantages of using Kubernetes on GKE are:

  • Load balancing
  • Automatic scaling
  • Automatic upgrades
  • Node auto-repair
  • Logging and monitoring

If you don't have a GCP account, you can create one and use the $300 free credit that GCP offers new users. The details:

New customers also get $300 in free credits to fully explore and conduct an assessment of the Google Cloud Platform. You won’t be charged until you choose to upgrade.

Let's see all the steps in detail.

Streamlit:

Streamlit is an open-source app framework for Machine Learning and Data Science teams. Create beautiful data apps in hours, not weeks. All in pure Python​.

Image credit — Streamlit documentation

For Streamlit examples, check out the Streamlit gallery link in the References.

The Python file for the Streamlit app is below.
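The original file was embedded from a gist. As a hedged sketch only, a minimal Streamlit app of this kind could look like the following; the file name myapp.py matches the CMD in the Dockerfile below, while the model path and the preprocessing (rescaling to [0, 1]) are assumptions.

# myapp.py - minimal sketch of the Streamlit image-classification app (assumed names and paths)
import numpy as np
import streamlit as st
import tensorflow as tf
from PIL import Image

CLASS_NAMES = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

@st.cache(allow_output_mutation=True)        # load the SavedModel once and reuse it across reruns
def load_model():
    return tf.keras.models.load_model('models/')   # assumed path of the SavedModel inside the container

st.title('Intel Image Classification')
uploaded = st.file_uploader('Upload an image', type=['jpg', 'jpeg', 'png'])

if uploaded is not None:
    image = Image.open(uploaded).convert('RGB').resize((150, 150))   # dataset images are 150x150
    st.image(image, caption='Uploaded image')
    batch = np.expand_dims(np.array(image) / 255.0, axis=0)          # assumes the model was trained on rescaled inputs
    preds = load_model().predict(batch)
    st.write('Prediction:', CLASS_NAMES[int(np.argmax(preds))])      # softmax output -> class label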

Docker:

A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image.

Image credit-Docker Documentation

To know more about Docker, and about the difference between a VM and Docker, check out the Docker documentation.

Before creating the Dockerfile, let's create the requirements.txt file, which will be used in the Dockerfile.

The requirements file in our case is below.

Image by the author

The requirements file lists all the packages the application needs. In our case, that means libraries such as TensorFlow, Streamlit, pandas, and matplotlib; a sketch of such a file follows.
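As an illustration only, a requirements.txt along these lines would cover the libraries named above (versions omitted; pin the ones that match your training environment; Pillow is an assumed extra for image handling in the Streamlit app):

tensorflow
streamlit
pandas
matplotlib
Pillow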

Why do we need the requirements file in the Dockerfile? Because the Docker image must contain every package the application depends on, and the Dockerfile installs them from requirements.txt at build time.

Our Dockerfile is below.

Image by the author

The Dockerfile contains the following (a consolidated sketch follows these notes):

  • Docker images can be inherited from other images. Therefore, instead of creating our own base image, we use the official Python image, which already has all the tools and packages we need to run a Python application. We are using Python 3.7. Why slim? The slim image is a pared-down version of the full image that installs only the minimal packages needed to run your particular tool. By leaving out lesser-used tools, the image is smaller. Use it if you have space constraints and do not need the full version, but be sure to test thoroughly; if you run into unexplained errors, try switching to the full image and see if that resolves them.

COPY . .

  • Create the working directory, set it as a variable, and copy all the local files into it. This COPY command takes all the files located in the current directory and copies them into the image.

RUN pip3 install -r requirements.txt

  • After the copy, run pip to install all the packages listed in requirements.txt. This works exactly the same as running pip3 install locally, but this time the modules are installed into the image.
  • Now we have Python installed along with all the dependencies.
  • Finally, we tell Docker what command to run when the image is executed inside a container, using the CMD instruction. We want to execute the Streamlit app, disabling CORS protection by setting the --server.enableCORS flag to false:

CMD [ "streamlit", "run", "--server.enableCORS", "false", "myapp.py" ]

Kubernetes:

We have already created the Dockerfile, so why do we need Kubernetes? According to the Kubernetes documentation:

Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications.

Image credit-Kubernetes Documentation

Kubernetes is open-source container management software originally developed at Google. It helps you manage containerized applications across physical, virtual, and cloud environments.

Kubernetes simplifies the deployment and configuration of complex containerized applications and helps with concerns like scaling and load balancing. Kubernetes was created at Google and later donated to the Cloud Native Computing Foundation (CNCF); it is now maintained by the CNCF and has strong community support and users around the globe. Google itself runs billions of containers to serve its users. Managed Kubernetes is available on the major cloud platforms, such as Google Cloud Platform's Google Kubernetes Engine (GKE), Amazon's Elastic Kubernetes Service (EKS), and Microsoft's Azure Kubernetes Service (AKS). The CLI tool used to interact with Kubernetes objects is kubectl.

The alternatives to Kubernetes are:

  1. Amazon ECS
  2. Red Hat OpenShift
  3. Docker Swarm
  4. Nomad
  5. AWS Fargate

All the config files are written in YAML.

We will be creating 2 YAML files:

  • Deployment YAML file
  • Service YAML file

To learn more about Deployment and Service files, please check out the Kubernetes documentation linked in the References.

The deployment file is below.

Image by the author
  • apiVersion - Which version of the Kubernetes API you're using to create this object
  • kind - What kind of object you want to create
  • metadata - Data that helps uniquely identify the object, including a name string, UID, and optional namespace
  • spec - What state you desire for the object
  • The important point to note is the container image used to build the Pod: image: gcr.io/my-vision-project-283816/myapp:v1. This is the image built from the Dockerfile and registered in the GCP Container Registry.
  • The container port is 8501, since Streamlit serves on port 8501.
  • The Deployment creates two replicated Pods, indicated by the .spec.replicas field. If you want to scale up further, increase the replicas to a higher number. A sketch of the file is below.
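As a sketch, a deployment file consistent with the points above could look like this; the names imageclassifier and myapp are assumptions taken from the service description and the image tag.

# k8s/deployment.yaml - sketch of the Deployment described above (assumed object and label names)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: imageclassifier
spec:
  replicas: 2                       # two replicated Pods, as noted above
  selector:
    matchLabels:
      app: imageclassifier
  template:
    metadata:
      labels:
        app: imageclassifier
    spec:
      containers:
      - name: myapp
        image: gcr.io/my-vision-project-283816/myapp:v1   # image built and pushed by Cloud Build
        ports:
        - containerPort: 8501       # Streamlit's port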

The service yaml file is below

Image by the author
  • The kind is Service.
  • The app name is imageclassifier, the same name used in the deployment file.
  • This specification creates a new Service object named "imageclassifier", which targets TCP port 8501 on any Pod with the app=imageclassifier label. A sketch is below.
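A sketch of the corresponding Service file; the LoadBalancer type is an assumption, consistent with the external endpoint shown later in the article.

# k8s/service.yaml - sketch of the Service described above (LoadBalancer type assumed)
apiVersion: v1
kind: Service
metadata:
  name: imageclassifier
spec:
  type: LoadBalancer            # assumed: exposes an external endpoint for the app
  selector:
    app: imageclassifier        # matches the Pods created by the Deployment
  ports:
  - protocol: TCP
    port: 8501                  # port exposed by the Service
    targetPort: 8501            # Streamlit's container port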
Image by the author — source vmware

Check out the difference between Kubernetes and Docker.

We are in the final stages of the automatic deployment.

Google Cloud Build:

According to Google documentation

Cloud Build is a service that executes your builds on Google Cloud Platform infrastructure. Cloud Build can import source code from Cloud Storage, Cloud Source Repositories, GitHub, or Bitbucket, execute a build to your specifications, and produce artifacts such as Docker containers or Java archives.

Cloud Build executes your build as a series of build steps, where each build step is run in a Docker container. A build step can do anything that can be done from a container irrespective of the environment. To perform your tasks, you can either use the supported build steps provided by Cloud Build or write your own build steps.

Other products similar to Google Cloud Build are:

  • AWS CodePipeline
  • CircleCI
  • Jenkins
  • GitHub Actions
  • Postman
  • GitLab
  • CloudBees CI
  • Amazon Elastic Container Service (Amazon ECS)
Image credit -Google Cloud Documentation

We need to create a YAML file for Cloud Build. The cloudbuild YAML file is below.

Image by the author
  • The first step is to build the Docker image.
  • Make sure to give the registry location.
  • The second step pushes the Docker image built in step one to Container Registry.
  • The third step deploys the app to the Kubernetes cluster. The k8s folder contains the deployment YAML file and the service YAML file.
  • Also mention the name and location (zone) of the Kubernetes cluster you created.
  • We make use of the ${PROJECT_ID} substitution variable. A sketch of the file follows.
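As a sketch, a cloudbuild.yaml consistent with those three steps could look like the following; the image name myapp, the k8s folder, and the cluster name/zone (taken from the cluster-creation command further below) are assumptions you would adjust to your own project.

# cloudbuild.yaml - sketch of the three build steps described above (assumed image name, folder, cluster)
steps:
# Step 1: build the Docker image
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/${PROJECT_ID}/myapp:v1', '.']
# Step 2: push the image to Container Registry
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/${PROJECT_ID}/myapp:v1']
# Step 3: apply the deployment and service YAML files to the GKE cluster
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['apply', '-f', 'k8s/']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=us-west1-b'        # cluster zone
  - 'CLOUDSDK_CONTAINER_CLUSTER=mykube'       # cluster name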

Please check out the References to learn more about YAML.

So far we have created the following:

  • Model Jupyter notebook
  • Streamlit Python file
  • Requirements text file
  • Dockerfile
  • k8s deployment YAML file
  • k8s service YAML file
  • Cloud Build YAML file

We need to do some additional manual steps before setting the trigger.

Create the GCP Project:

The steps below are from the Google documentation:

  • Open the Google Cloud Console.
  • Next to "Google Cloud Platform," click the down arrow. A dialog listing current projects appears.
  • Click New Project. The New Project screen appears.
  • In the Project Name field, enter a descriptive name for your project. If you're executing a quickstart, use "Quickstart."
  • To edit the Project ID, click Edit. The project ID can't be changed after the project is created, so choose an ID that meets your needs for the lifetime of the project.
  • Click Organization and select your organization. In the Location field, click Browse to display potential locations for your project. Click a location and click Select.
  • Click Create. The console navigates to the Dashboard page, and your project is created within a few minutes.

Activate the APIs:

In GCP you need to activate the following APIs:

  • Google Kubernetes Engine
  • Google Cloud Build
  • Google Container Registry

Create the K8s Cluster:

We need to create the Kubernetes cluster in GCP.

We can create the cluster using the command-line interface:

Image by the author
gcloud container clusters create mykube --zone "us-west1-b" --machine-type "n1-standard-1" --num-nodes "1" --service-account my-vision-project-283816@appspot.gserviceaccount.com

Name of the cluster: mykube

Image by the author
  • Number of nodes: 1 (a basic cluster)

Make sure to give the cluster name correctly in the Cloud Build YAML file.

GitHub Desktop:

  • Create a new repository.
  • Organize the files.
  • Push it to GitHub.
Image by the author

Create the Cloud Build Trigger:

We are in the last step of the automation.

  1. Connect the GitHub repository and Cloud Build: have your source code ready in a GitHub repository.

Check out the Cloud Build documentation, which contains the steps to connect the GitHub repository to Cloud Build.

2. After connecting Cloud Build and the GitHub repository, create the trigger. Check out the documentation on how to create a trigger.

The trigger is now created.

Image by the author

Testing the CI/CD Pipeline:

The Cloud Build trigger fires whenever we push to the repository. Just make some changes to the README file and push it. Now you can see that the Cloud Build trigger has been triggered.

Image by the author

It takes around 5-6 minutes to complete. You can check the log: it shows that in step one the Docker image is built and then pushed to the Container Registry. Notice that you can see the output for each of the build steps defined in our Cloud Build YAML file.

Image by the author

Image by the author
Image by the author

Once the build is complete, you can see the status.

Image by the author

If it fails, check the log and fix the error. My build failed once because I didn't give the folder name correctly, and another time because I gave the GKE cluster name incorrectly. So if there are any errors, check the log, fix them, and run the trigger again.

After a successful build, you can see the pods running. Since we set replicas to 2 in the deployment file, you see 2 pods created.
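If you have kubectl configured against the cluster, a quick way to check (the deployment name imageclassifier is the one assumed in the sketches above):

kubectl get deployments imageclassifier   # should show 2/2 replicas ready
kubectl get pods                          # lists the two Pods created by the Deployment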

Image by the author

The endpoint is created

Image by the author

Now you can test the app

Image by the author
Image by the author

Cleanup:

Please make sure to delete the resources after you are done with the project. Delete the following:

  • Pods, services, and endpoints created.
  • Kubernetes cluster
  • Container registry images
  • Storage buckets
  • Cloud build trigger

Just make sure to delete all the objects created so you are not charged; try to stay within the $300 credit provided by GCP.
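For reference, cleanup along these lines can be done from the command line; the cluster name and zone are the ones used above, while TRIGGER_NAME and BUCKET_NAME are placeholders you would substitute with your own values.

# delete the GKE cluster (this removes the pods, services, and endpoints it runs)
gcloud container clusters delete mykube --zone us-west1-b

# delete the container image pushed to Container Registry
gcloud container images delete gcr.io/my-vision-project-283816/myapp:v1 --force-delete-tags

# delete the Cloud Build trigger (replace TRIGGER_NAME with your trigger's name)
gcloud beta builds triggers delete TRIGGER_NAME

# remove any storage buckets created during the project (replace BUCKET_NAME)
gsutil rm -r gs://BUCKET_NAME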

Conclusion:

There are many ways to deploy an app and create a CI/CD pipeline. Here I used Google Kubernetes Engine and Cloud Build. Maybe try to do it on AWS or Azure. Please feel free to connect with me on LinkedIn.

References:

  1. MLOps: Continuous delivery and automation pipelines in machine learning: https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
  2. AI Engineering, MLOps playlist: https://www.youtube.com/watch?v=K6CWjg09fAQ&list=PL3N9eeOlCrP5a6OA473MA4KnOXWnUyV_J
  3. CI/CD pipeline: https://tanzu.vmware.com/cicd
  4. Kubernetes Deployments and Services: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
  5. Google Cloud Build: https://cloud.google.com/build
  6. Streamlit gallery: https://streamlit.io/gallery?type=apps&category=computer-vision-images
