Tuning Hyperparameters Using MLOps

ankit
5 min read · Aug 30, 2020


The modern era is the era of technology. We are living in the age of Machine Learning, Artificial Intelligence, Automation, and more, and the demand for machine-learning engineers is growing day by day. Building and developing models is not a big deal; the main issues are accuracy, performance, and automation. So to increase performance and accuracy, we have to select the best features, the best algorithms, and the best hyperparameters.

What are Hyperparameters in ML?

These are adjustable parameters that must be tuned in order to obtain a model with optimal performance.

Examples of Hyperparameters:

1. Perceptron Classifier

Perceptron(n_iter=40, eta0=0.1, random_state=0)

Here, n_iter is the number of iterations, eta0 is the learning rate, and random_state is the seed of the pseudo random number generator to use when shuffling the data.
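To make the distinction concrete, here is a minimal pure-Python sketch of a perceptron training loop (an illustration, not scikit-learn's implementation): n_iter, eta0 and random_state are the hyperparameters chosen before training, while the weights and bias are the parameters the model learns.

```python
import random

def train_perceptron(X, y, n_iter=40, eta0=0.1, random_state=0):
    """Train a perceptron on data X with labels y in {-1, +1}.

    n_iter, eta0 and random_state are hyperparameters, fixed before
    training; the weights w and bias b are the learned parameters."""
    rng = random.Random(random_state)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(n_iter):
        idx = list(range(len(X)))
        rng.shuffle(idx)          # random_state controls this shuffling
        for i in idx:
            activation = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            pred = 1 if activation >= 0.0 else -1
            if pred != y[i]:      # update weights only on mistakes
                for j in range(n_features):
                    w[j] += eta0 * y[i] * X[i][j]
                b += eta0 * y[i]
    return w, b

# Linearly separable toy data: the label is the sign of x0 - x1
X = [[2.0, 1.0], [1.0, 3.0], [3.0, 0.5], [0.5, 2.0]]
y = [1, -1, 1, -1]
w, b = train_perceptron(X, y)
predictions = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1
               for xi in X]
```

Changing eta0 or n_iter changes how (and whether) training converges, without changing the model's structure at all; that is what makes them hyperparameters.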

2. Train, Test Split Estimator

train_test_split( X, y, test_size=0.4, random_state=0)

Here, test_size represents the proportion of the dataset to include in the test split, and random_state is the seed used by the random number generator.
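What these two hyperparameters control can be sketched in a few lines of plain Python (again just an illustration of the idea, not scikit-learn's actual code):

```python
import random

def simple_train_test_split(X, y, test_size=0.4, random_state=0):
    """Illustrative stand-in for train_test_split: test_size is the
    proportion held out for testing, random_state seeds the shuffle."""
    idx = list(range(len(X)))
    random.Random(random_state).shuffle(idx)   # reproducible shuffle
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])

X = [[i] for i in range(10)]
y = list(range(10))
X_train, X_test, y_train, y_test = simple_train_test_split(
    X, y, test_size=0.4, random_state=0)
# test_size=0.4 on 10 samples -> 4 test samples, 6 training samples
```

Because the shuffle is seeded, running the split again with the same random_state reproduces exactly the same partition.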

3. Logistic Regression Classifier

LogisticRegression(C=1000.0, random_state=0)

Here, C is the inverse of regularization strength, and random_state is the seed of the pseudo random number generator to use when shuffling the data.
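Since C is the inverse of regularization strength, a smaller C means a heavier penalty on the weights. A tiny sketch of the general L2 penalty term makes the relationship visible (this shows the general form of the objective, not scikit-learn's internals):

```python
def l2_penalty(weights, C):
    """In C-parameterized objectives the penalty weight is 1/C,
    i.e. regularization strength lambda = 1 / C."""
    return (1.0 / C) * 0.5 * sum(w * w for w in weights)

weights = [2.0, -1.0]
strong = l2_penalty(weights, C=0.01)    # small C -> strong regularization
weak = l2_penalty(weights, C=1000.0)    # large C (as above) -> weak regularization
```

So C=1000.0, as in the example above, regularizes only very lightly.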

Finding the optimal hyperparameters is not easy; it is tiring and time-consuming as well. To solve this problem I'm using MLOps (Machine Learning Operations, i.e. DevOps principles applied to ML). MLOps automates the complete job and builds an optimal model for us.

What is MLOps?

MLOps is a recent term that describes how to apply DevOps principles to automating the building, testing, and deployment of ML systems.
“A software engineering approach in which a cross-functional team produces machine learning applications based on code, data, and models in small and safe increments that can be reproduced and reliably released at any time, in short adaptation cycles.”

DevOps vs MLOps

Figure: “High-level MLOps CI/CD workflow triggered by changes in either source code or data”

Advantages of MLOps:

1. Data Versioning, Git-Style

2. Data Versioning with Time-Travel Queries and Incremental Pulling

3. End-to-End ML Pipelines

4. Monitoring Online Models

Problem Statement :

  • Create a container image that has Python3 and Keras or NumPy installed, using a Dockerfile.
  • When we launch this image, it should automatically start training the model inside the container.
  • Create a job chain of job1, job2, job3, job4 and job5 using the Build Pipeline plugin in Jenkins.
  • Job1 : Pull the GitHub repo automatically when a developer pushes to GitHub.
  • Job2 : By looking at the code or program file, Jenkins should automatically launch the container image that has the respective machine-learning software installed, deploy the code there, and start training (e.g. if the code uses a CNN, then Jenkins should start the container that already has all the software required for CNN processing installed).
  • Job3 : Train the model and report its accuracy or metrics.
  • Job4 : If accuracy is less than 80%, tweak the machine-learning model architecture.
  • Job5 : Retrain the model, or notify that the best model has been created.
  • Create one extra job, job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically start the container again from where the last trained model left off.

Dockerfile : It is used to build custom images.

$ sudo docker build -t mytensor:v1 PATH
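A minimal Dockerfile for the image described in the problem statement might look like the following sketch. The base image, package list and file name are assumptions for illustration, not the exact setup used here:

```dockerfile
# Sketch of a training image: Python3 plus the deep-learning stack.
FROM centos:7
RUN yum install -y python3
RUN pip3 install numpy keras tensorflow
COPY file1.py /code/file1.py
# Start training automatically when the container launches
CMD ["python3", "/code/file1.py"]
```

With this CMD line, `docker run mytensor:v1` immediately begins training, as the problem statement requires.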

Job 1 : Pull GitHub Code

Whenever the developer pushes any code to GitHub, this job copies that code into the local repository on our system. For this I have used Poll SCM to keep checking the remote repository for any changes.

Poll SCM polls the SCM periodically to check whether any changes/new commits were made, and builds the project only if new commits were pushed since the last build, whereas “Build periodically” builds the project on a schedule irrespective of whether or not any changes were made.
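The Poll SCM schedule is written in cron syntax. The exact schedule used in this setup is not shown, but a typical entry that checks the repository every minute looks like this:

```
# Poll SCM schedule (cron syntax): check for new commits every minute
* * * * *
```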

Job 2 : See Code and Launch Respective Container

It checks for some must-have keywords like ‘keras’ and ‘keras.layers’; if these keywords are in our program (file1.py), it launches the respective container, otherwise it displays the message: “Container not available for your program”.

If the container exists, it launches the container and starts training our model.
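The keyword check that Job 2 performs can be sketched in a few lines of Python. The image name is the one built earlier (mytensor:v1); the rest of the logic is illustrative:

```python
# Sketch of Job 2's decision: scan the program text for framework keywords
# and pick a container image accordingly.
def choose_container(source_code):
    keywords = ("keras", "keras.layers")
    if any(kw in source_code for kw in keywords):
        return "mytensor:v1"   # container with the deep-learning stack installed
    return None                # -> "Container not available for your program"

code = "from keras.layers import Dense, Conv2D"
image = choose_container(code)
```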

The file1.py code contains LeNet for the MNIST dataset, but I modified some layers to make it possible to search for optimal hyperparameters. Input is taken from the file “input.txt”.

Job 3 : Display Result

Its task is very simple: the accuracy, along with the setup, is deployed on the Apache web server so that the user can access it directly.

Job 4 : Analysis of Accuracy

This job performs the following tasks :

+ Checks accuracy; if accuracy is less than 80%, it tweaks the code using the program tweaker.py and triggers Job 2 (see code and launch) to run the model once again.

+ If accuracy is > 80%, it calls Job 5: model created successfully.
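The contents of tweaker.py are not shown here; the sketch below illustrates one simple strategy such a tweaker might use — read the current hyperparameters from input.txt and bump them before the next run. The file format and parameter names are assumptions:

```python
import os
import tempfile

def tweak(params):
    """Return a tweaked copy of the hyperparameters: train longer, widen layers.
    (Hypothetical strategy; the real tweaker.py may differ.)"""
    tweaked = dict(params)
    tweaked["epochs"] = params.get("epochs", 1) + 1
    tweaked["filters"] = params.get("filters", 32) * 2
    return tweaked

def tweak_file(path):
    """Read key=value hyperparameters from `path`, tweak them, write them back."""
    with open(path) as f:
        params = {k.strip(): int(v) for k, v in
                  (line.split("=") for line in f if line.strip())}
    params = tweak(params)
    with open(path, "w") as f:
        for key, value in params.items():
            f.write(f"{key}={value}\n")
    return params

# Example round trip on a throwaway input.txt
path = os.path.join(tempfile.gettempdir(), "input.txt")
with open(path, "w") as f:
    f.write("epochs=1\nfilters=32\n")
new_params = tweak_file(path)
```

Because the tweaked values are written back to input.txt, the next training run launched by Job 2 automatically picks them up.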

The first curl command triggers Job 2, since our hyperparameters have been tweaked and are ready to be tested.

The second curl command is to trigger job 5 on successful model creation.
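Jenkins jobs can be triggered remotely through their build URLs; the two curl calls look roughly like this, where the host name and credentials are placeholders:

```
# Trigger job2 again after tweaking the hyperparameters
curl -X POST "http://admin:API_TOKEN@jenkins-host:8080/job/job2/build"

# Trigger job5 once the model meets the accuracy target
curl -X POST "http://admin:API_TOKEN@jenkins-host:8080/job/job5/build"
```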

Job 5 : Model Created Successfully

This job sends an email to the developer once our model is successfully developed.

Job 6 : Extra Job

If for any reason job2 fails, this job restarts the Docker engine and triggers job2 again.
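A shell sketch of such a monitor step might look like the following; the container name is a placeholder:

```
# Sketch of the monitor job: if the training container is gone, restart the
# Docker engine and start the container again from its last state.
if ! sudo docker ps | grep -q mycontainer; then
    sudo systemctl restart docker
    sudo docker start mycontainer   # resumes with the last saved model inside
fi
```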

Reference :

1. “Python Machine Learning”, 2nd Edition, Sebastian Raschka.

2. Hyperparameters : https://towardsdatascience.com/understanding-hyperparameters-and-its-optimisation-techniques-f0debba07568

3. MLOps : https://mlops.org/the-next-generation-of-devops-ml-ops/

