Hyperparameter Tuning With MLflow Tracking

MLflow simplifies tracking and reproducibility for tuning workflows

aditi kothiya
The Startup
6 min read · Aug 3, 2020


Hyperparameter tuning and optimization are powerful tools in the field of AutoML. Tuning these configurations can dramatically improve model performance. However, hyperparameter tuning can be computationally expensive and slow.

If you have a large network like VGG, ResNet, etc., trying out every parameter exhaustively and then choosing the best one is computationally intensive, because hyperparameter tuning involves evaluating many configurations. We therefore need a principled approach to choosing the best parameters.

Hyperparameter tuning creates complex workflows involving testing many hyperparameter settings, generating lots of models, and iterating on an ML pipeline. To simplify tracking and reproducibility for tuning workflows, we use MLflow, an open-source platform that helps manage the complete machine learning lifecycle.


How can I find the best version of this model?

How do we track all the information and documentation for each model trained?

How do we do that efficiently?

MLflow makes this process much more efficient and convenient!!

Let’s start

What is MLflow?

MLflow is a Python package developed by Databricks that is defined as an open-source platform for the machine learning lifecycle. There are three pillars around MLflow: Tracking, Projects, and Models.


Their documentation has a nice tutorial explaining the components of MLflow. MLflow lets you log parameters and metrics, which is incredibly convenient for model comparison.

Colab on local runtime

Before starting with MLflow, we will first connect Colab to a local runtime. You are going to need Jupyter notebooks: Colab is built on top of Jupyter, and it is required to run a local notebook. If you already have Jupyter, you are ahead of the game; if not, click here for information.

Step 1: install jupyter_http_over_ws using the command ‘pip install jupyter_http_over_ws’

Step 2: Enable the jupyter_http_over_ws Jupyter extension using the command jupyter serverextension enable --py jupyter_http_over_ws

Step 3: Start the server and authenticate using the command jupyter notebook --NotebookApp.allow_origin='https://colab.research.google.com' --port=8888 --NotebookApp.port_retries=0

Once the server has started, it will print a message with the initial backend URL used for authentication. Make a copy of this URL as you’ll need to provide this in the next step.

A Jupyter notebook will automatically open in your browser; if it does not, copy the local http:// URL printed by the server and paste it into your browser.

Now the notebook server is running, but the notebook still needs to be connected to computing power. You can do this from your Colab page. Make sure your Colab notebook settings are set to GPU: go to Edit > Notebook Settings and check that you are running the right environment and that the hardware accelerator is set to GPU.

Click the “Connect” button and select “Connect to local runtime…”. The local connection setting window will pop up as shown below.

Enter the URL from the previous step in the dialog that appears and click the “Connect” button. After this, you should now be connected to your local runtime.

Now you have GPU power through your locally connected Google Colab notebook.

MLflow installation and basic usage

Step 1: pip install mlflow

Step 2: The MLflow Python API logs runs locally, in an mlruns directory wherever you ran your program. You can then run mlflow ui to see the logged runs.

Open the URL printed by mlflow ui (by default http://localhost:5000), which provides a simple interface to the various functionality in MLflow.

Hyperparameter Tuning

Now we have connected the Google Colab notebook to a local runtime and installed MLflow.

Let’s work through one practical example, digit recognition, where MLflow will help you find the right set of optimal hyperparameters for a learning algorithm.

We are using the MNIST dataset of 60,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9. The task is to classify a given image of a handwritten digit into one of 10 classes representing the integer values from 0 to 9, inclusive. The model has three tunable hyperparameters that we try to optimize: learning rate, momentum, and the number of hidden nodes.

We have loaded the dataset and transformed it as required, so now we have everything in place.
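The data-loading and model-definition code is not reproduced here, but a minimal sketch of the setup might look like the following (names such as Net and hidden_nodes are illustrative assumptions, not the article’s exact code):

```python
# Minimal sketch (assumed, not the article's exact code): load MNIST with
# torchvision and define a small fully connected network whose width is
# controlled by the `hidden_nodes` hyperparameter.
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),  # standard MNIST mean/std
])

train_loader = DataLoader(
    datasets.MNIST("data", train=True, download=True, transform=transform),
    batch_size=64, shuffle=True)
test_loader = DataLoader(
    datasets.MNIST("data", train=False, download=True, transform=transform),
    batch_size=1000)

class Net(nn.Module):
    def __init__(self, hidden_nodes=128):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, hidden_nodes)
        self.fc2 = nn.Linear(hidden_nodes, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)            # flatten the 28x28 images
        x = F.relu(self.fc1(x))
        return F.log_softmax(self.fc2(x), dim=1)
```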

MLflow Tracking is organized around the concept of runs (stored in mlruns). Each run records information such as parameters, metrics, and artifacts, as explained below.

Create an experiment using mlflow.create_experiment(), which creates a new experiment and returns its ID. Runs can then be launched under the experiment by passing the experiment ID to mlflow.start_run(), which returns the currently active run if there is one; otherwise it creates a new active run and returns its object.
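A rough illustration of this (the experiment name here is just an example):

```python
import mlflow

# Create a new experiment (the name is only an example) and launch a run
# under it; start_run can be used as a context manager so the run is
# ended automatically.
experiment_id = mlflow.create_experiment("mnist-hyperparameter-tuning")

with mlflow.start_run(experiment_id=experiment_id) as run:
    print("Active run id:", run.info.run_id)
```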

We have logged the values of the metrics train loss, test loss, and test accuracy using mlflow.log_metric(), which logs a single key-value metric.

For each run, we have logged parameters like start time, batch size, epochs, learning rate, momentum, hidden nodes, and test loss, along with the source code, using mlflow.log_param(), which logs a single key-value param in the currently active run.
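Putting the two logging calls together, training one hyperparameter configuration per run might look roughly like this (train_one_epoch and evaluate are assumed helper functions, not shown in the article; Net and the data loaders come from the earlier sketch):

```python
import mlflow
import torch.optim as optim

def run_experiment(lr, momentum, hidden_nodes, epochs=5, batch_size=64):
    """Train one configuration and log its parameters and metrics to MLflow."""
    with mlflow.start_run(experiment_id=experiment_id):
        # Hyperparameters for this run, logged as key-value params.
        mlflow.log_param("lr", lr)
        mlflow.log_param("momentum", momentum)
        mlflow.log_param("hidden_nodes", hidden_nodes)
        mlflow.log_param("epochs", epochs)
        mlflow.log_param("batch_size", batch_size)

        model = Net(hidden_nodes)
        optimizer = optim.SGD(model.parameters(), lr=lr, momentum=momentum)

        for epoch in range(epochs):
            train_loss = train_one_epoch(model, optimizer, train_loader)  # assumed helper
            test_loss, test_accuracy = evaluate(model, test_loader)       # assumed helper
            mlflow.log_metric("train_loss", train_loss, step=epoch)
            mlflow.log_metric("test_loss", test_loss, step=epoch)
            mlflow.log_metric("test_accuracy", test_accuracy, step=epoch)
```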

Select the different models for comparison and click on the compare button.

Select the x-axis and y-axis parameters to compare the selected model and check the performance of the model.

You can also check the performance of an individual model and view different metric plots (for example, to track how your model’s loss function is converging).

You can record images (for example, confusion-matrix PNGs for each run), models (for example, a .pth PyTorch model), or even different data files as artifacts using mlflow.log_artifact().
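For example, something along these lines (the confusion-matrix values are assumed to have been computed during evaluation):

```python
import matplotlib.pyplot as plt
import mlflow
import mlflow.pytorch

with mlflow.start_run(experiment_id=experiment_id):
    # Save a plot to disk, then attach the file to the run as an artifact.
    fig, ax = plt.subplots()
    ax.imshow(confusion_matrix)          # `confusion_matrix` computed elsewhere (assumed)
    fig.savefig("confusion_matrix.png")
    mlflow.log_artifact("confusion_matrix.png")

    # Log the trained PyTorch model itself under the artifact path "model".
    mlflow.pytorch.log_model(model, "model")
```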

So we logged a run for each hyperparameter setting, and each of those runs includes the hyperparameter setting and the evaluation metric. Comparing these runs in the MLflow UI helps with visualizing the effect of tuning each hyperparameter.
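Besides the UI, the same comparison can be done programmatically; here is a small sketch using mlflow.search_runs, which returns the runs of an experiment as a pandas DataFrame (the metric name matches the one logged above):

```python
import mlflow

# Fetch every run of the experiment as a DataFrame and pick the most accurate one.
runs = mlflow.search_runs(experiment_ids=[experiment_id])
best = runs.sort_values("metrics.test_accuracy", ascending=False).iloc[0]
print("Best run:", best["run_id"], "test accuracy:", best["metrics.test_accuracy"])
```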

Once you come up with a better hyperparameter configuration, you can load that model by specifying the run ID and use it for inference.
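With MLflow’s PyTorch integration, a sketch of that might be (replace <run_id> with the ID of the best run from the UI):

```python
import mlflow.pytorch

# Load the model that was logged under the artifact path "model" for a given run.
model = mlflow.pytorch.load_model("runs:/<run_id>/model")
model.eval()  # switch to inference mode
```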

To learn more about MLflow functions, you can visit the official MLflow documentation.

So this is how we can do hyperparameter tuning using MLflow tracking.

Conclusion

Using MLflow, we can easily track and manage the different trained model configurations and compare them to find the best set of hyperparameters.

Feel free to comment if you have any feedback for me to improve on, or if you want to share any thoughts or experience on the same.

Do you want more? Follow me on LinkedIn, and GitHub.
