Integrating ML and DevOps to Achieve Automation Using GitHub, Docker, and Jenkins

Hello everyone

This article is about integrating machine learning with DevOps to achieve automation using Docker, Jenkins, and GitHub.

The goal is to retrain a model automatically, adjusting it between runs, until it reaches a target accuracy.

Two terms are worth defining before we start:

  1. What is automation, and why do we need it?
  2. What are hyperparameters?

Here, automation means automating the steps involved in applying machine learning to real-world problems.

Maintaining accurate models by hand is a tough job, because people have limited time and attention. Keeping a model accurate requires continuous monitoring, and that is where automation is needed.

That is the problem statement for this task.

Hyperparameters are parameters whose values control the learning process (for example, the number of epochs or the learning rate). In this task we set them before training starts.

This is challenging because every time we change the hyperparameters, we need to train the model again. Doing this manually takes a lot of time and energy.
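To make the idea concrete, here is a minimal sketch in plain Python (not the model used in this task). The dataset, the `train` function, and the values passed to it are all made up for illustration. It shows that `learning_rate` and `epochs` are chosen before training, and that changing either one means running training again.

```python
def train(learning_rate, epochs):
    """Fit y = w * x to a tiny dataset with plain gradient descent.

    `learning_rate` and `epochs` are hyperparameters: they are chosen
    before training and are not learned from the data. `w` is a model
    parameter: it is learned during training.
    """
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]  # generated with w = 2
    w = 0.0
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w


# Same data, different hyperparameters: each change requires a new run.
w_short = train(learning_rate=0.01, epochs=5)    # stops well short of w = 2
w_long = train(learning_rate=0.01, epochs=100)   # ends very close to w = 2
```

With only 5 epochs the learned weight is still far from 2, while 100 epochs gets very close. Finding good values by hand means repeating this loop many times, which is the part we want Jenkins to do for us.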

Hence we automate this process using GitHub, Docker, and Jenkins on Red Hat Enterprise Linux 8, applied to machine learning models.

Task Description

1. Create a container image that has Python 3 and Keras or NumPy installed, using a Dockerfile.

2. When we launch this image, it should automatically start training the model in the container.

3. Create a job chain of job1, job2, job3, job4, and job5 using the Build Pipeline plugin in Jenkins.

4. Job1: Pull the GitHub repo automatically when a developer pushes to GitHub.

5. Job2: By looking at the code or program file, Jenkins should automatically start the container image that has the matching machine learning software and interpreter installed, deploy the code, and start training (e.g., if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing).

6. Job3: Train your model and report its accuracy or metrics.

7. Job4: If the accuracy is less than 80%, tweak the machine learning model architecture.

8. Job5: Retrain the model, or notify that the best model has been created.

9. Create one extra job, job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically start the container again from where the last trained model left off.

The steps below walk through how to build this setup.

As mentioned in the Task Description, we first write a Dockerfile and then build the Docker image from it.

Now build the image with:

docker build -t imagename:version <path to the directory containing the Dockerfile>

Next, create a folder containing the Python code and the dataset for training the model, and push it to GitHub using Git Bash.

Use these commands to push the code to your repo:

git remote add origin <github_repo_link>

git push --set-upstream origin master

Now we will create the jobs in Jenkins and chain them using the Build Pipeline plugin.

Job 1: This job copies your code from GitHub to a local folder.

In the screenshot above, I pulled the repo using its repository URL.

Next, configure the build triggers: enable the remote trigger, and also enable Poll SCM with an empty schedule, so that the job is triggered by the post-commit hook.

Job 2: This job trains the model in the matching Docker container: if the code is for machine learning, we start the machine learning container; otherwise we launch the deep learning container.
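One way to make that choice is to scan the program file for library names. The sketch below is only an illustration of the idea: the marker strings and the image names `dl_image:v1` and `ml_image:v1` are hypothetical, not the ones used in this project. In a Jenkins shell step the same check is often done with `grep`.

```python
def choose_image(source_code):
    """Return a (hypothetical) Docker image name for the given source code.

    The rule is deliberately simple: if the code mentions a deep learning
    library or layer, pick the deep learning image; otherwise pick the
    classic machine learning image.
    """
    deep_learning_markers = ("keras", "tensorflow", "Conv2D")
    if any(marker in source_code for marker in deep_learning_markers):
        return "dl_image:v1"  # hypothetical deep learning image
    return "ml_image:v1"      # hypothetical classic ML image


cnn_code = "from keras.layers import Conv2D, Dense"
regression_code = "from sklearn.linear_model import LinearRegression"

image_for_cnn = choose_image(cnn_code)
image_for_regression = choose_image(regression_code)
```

A plain substring check like this can be fooled by comments or unused imports, so treat it as a starting point rather than a robust detector.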

Also, after training the model, we tweak the hyperparameters if the required accuracy is not achieved, and then retrain the model. This is done by a Python script, tweak.py.
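The contents of tweak.py are not reproduced in this article, so the following is only a sketch of what such a step could look like. The hyperparameter names `epochs` and `conv_layers`, and the amounts they are increased by, are assumptions made for illustration.

```python
def tweak(hparams):
    """Return a new hyperparameter dict describing a slightly larger model.

    This is an illustrative rule, not the rule used in the real tweak.py:
    train for 5 more epochs and add one more convolution layer.
    """
    new_hparams = dict(hparams)  # copy, so the caller's dict is not modified
    new_hparams["epochs"] = hparams["epochs"] + 5
    new_hparams["conv_layers"] = hparams["conv_layers"] + 1
    return new_hparams


current = {"epochs": 5, "conv_layers": 1}
tweaked = tweak(current)
```

Returning a new dict instead of editing in place keeps a record of the previous settings, which is handy when comparing runs.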

In the 'Execute shell' build step, we add the script that trains our model.

Job 3: This job trains the ML model inside the running Docker container. The Python code then writes the accuracy to a new file.

We also need to notify the developer once the best possible model has been trained. For that we use email notification through the email plugins available for Jenkins.
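Writing the accuracy to a file is what lets the next job read the result. A minimal sketch of that hand-off is below; the file name `accuracy.txt`, the helper names, and the example value are assumptions, not taken from the original code.

```python
def save_accuracy(accuracy, path="accuracy.txt"):
    """Write the accuracy (a float between 0 and 1) to a text file."""
    with open(path, "w") as f:
        f.write(f"{accuracy:.4f}\n")


def load_accuracy(path="accuracy.txt"):
    """Read the accuracy back as a float."""
    with open(path) as f:
        return float(f.read().strip())


# Example: the training script saves, a later job loads.
save_accuracy(0.8123)
stored = load_accuracy()
```

For a later Jenkins job to see this file, it has to live somewhere both sides can reach, for example a directory mounted into the container as a volume.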

Job 4: This job checks whether the accuracy of the model is below 80%. If so, it edits the model so that it can reach the desired accuracy.

Once the accuracy reaches the 80% target, it sends an email to the developer with the build log showing the final accuracy.

Job 5: This job simply runs the edited Python code again.

Job 4 and Job 5 are linked so that they keep calling each other until the accuracy reaches 80% or more.
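The back-and-forth between Job 4 and Job 5 amounts to a loop: train, check, tweak, repeat. Here is a small sketch of that loop in one place. The function names, the `max_rounds` safety limit, and the toy training function are all assumptions for illustration; in the pipeline itself the loop is formed by the two Jenkins jobs triggering each other.

```python
def train_until_target(train_fn, tweak_fn, hparams, target=0.80, max_rounds=10):
    """Train, and tweak and retrain, until `target` accuracy is reached.

    train_fn(hparams) must return an accuracy between 0 and 1.
    tweak_fn(hparams) must return new hyperparameters to try next.
    `max_rounds` stops the loop if the target is never reached.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    history = []
    for _ in range(max_rounds):
        accuracy = train_fn(hparams)              # Job 5: (re)train
        history.append((dict(hparams), accuracy))
        if accuracy >= target:                    # Job 4: check accuracy
            break
        hparams = tweak_fn(hparams)               # Job 4: tweak the model
    return hparams, accuracy, history


def toy_train(hparams):
    """Stand-in for real training: accuracy grows with the epoch count."""
    return min(0.95, 0.60 + 0.06 * hparams["epochs"])


def add_epoch(hparams):
    """Stand-in for tweak.py: train for one more epoch."""
    return {**hparams, "epochs": hparams["epochs"] + 1}


best, final_accuracy, history = train_until_target(
    toy_train, add_epoch, {"epochs": 1}
)
```

The `max_rounds` limit is worth keeping in any real version too: without it, a model that can never reach the target would keep the two jobs triggering each other forever.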

Job 6: This is an extra, fail-safe job. If the container fails, it uses the trigger to repeat the whole process and restarts the container from where the last retrained model left off.

Here are the console outputs of all 4 jobs and what my final pipeline looked like.

Here is the final email, received at the configured address after my model reached 80% accuracy.
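A monitoring step like this can be sketched as "check whether the container is up, and start it if not". The code below uses the standard `docker ps` and `docker start` commands through `subprocess`. The container name `trainer` and the function names are hypothetical. The `run` parameter exists only so the logic can be tried without Docker installed.

```python
import subprocess


def is_running(name, run=subprocess.run):
    """Return True if a running container is listed under `name`."""
    result = run(
        ["docker", "ps", "--filter", f"name={name}", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
    )
    # The name filter matches substrings, so compare whole names here.
    return name in result.stdout.split()


def ensure_running(name, run=subprocess.run):
    """Start the container if it is not running.

    Returns True if a restart was issued, False if it was already up.
    """
    if is_running(name, run):
        return False
    run(["docker", "start", name], check=True)
    return True


# In a Jenkins job this could be called as: ensure_running("trainer")
```

This sketch only restarts the container. Picking up from the last trained model additionally depends on the training code saving its model somewhere persistent and reloading it on start, which is not shown here.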

TASK COMPLETED

For any suggestions or questions, feel free to message me on LinkedIn:

https://www.linkedin.com/in/sajalsaxena0234/

ML enthusiast | DEVOPS & ML Integration | python developer | Open source contributor | Red hat v8