MLflow Basic Approach: Part 1 (Logging)

Rohit Jain
4 min read · Apr 1, 2019


ML Flow

Not many people have heard of MLflow yet, but few will be able to ignore it in the near future. As we move into the age of artificial intelligence, we need as many tools as possible to ease the process of developing, deploying, and tracking machine learning models.

I decided to write this article on Medium because very little information about this tool is available on the internet.

So here is a simple approach to MLflow.

Why MLflow, and What Is MLflow?

Most data scientists have, at some point in their careers, been unable to recall which parameters they passed to a model, or which metrics it produced, when it gave them its best result a few days or even a month earlier.

Yes, this issue is quite common, and that is where MLflow can help you. MLflow also helps with the environment setup and deployment issues that many data scientists face.

MLflow is, you could say, the result of hard work by the developers at Databricks. It deals with these three challenges of the ML life cycle:

  1. Tracking of parameters (MLflow Tracking)
  2. Deployment (MLflow Projects)
  3. A wide variety of tools (MLflow Models)

MLflow Components

Let's see how MLflow deals with each of these challenges.

First, let us run a basic program built with MLflow.

  1. Run the following command in your terminal or command prompt to install MLflow

pip install mlflow

2. Run the following command to try out a first program directly from GitHub

  • It is recommended to create a new folder and run the command from there (it might take a minute)

mlflow run https://github.com/jain-roh/MlFLowExample.git

3. You should get the following output

Output

4. There you can see that the multiplication value is 30. Now let's run the same program, passing our own values instead of the defaults

mlflow run https://github.com/jain-roh/MlFLowExample.git -P a=8 -P b=7

5. The multiplication value is now 56, because I passed 8 for a and 7 for b

6. In Tutorial 2 I will show how to publish a project on Git so that it can later be run directly from the command line without even cloning it

7. This relates to the second challenge. As you can see, you did not need to clone the repository or install the project's packages by hand to run this code. Once MLflow itself is installed, it sets up whatever environment the project declares
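
To picture how the values given with -P a=8 -P b=7 can reach the program, here is a minimal sketch of an entry-point script. It assumes the project's MLproject file forwards the parameters as command-line arguments, for example with a command such as python multiply.py --a {a} --b {b}. The file name multiply.py, the flag names, and the defaults of 5 and 6 (chosen only because they multiply to 30) are hypothetical; the actual repository may be organized differently.

```python
import argparse


def parse_args(argv=None):
    """Read the two numbers from the command line (hypothetical flags and defaults)."""
    parser = argparse.ArgumentParser(description="Multiply two numbers.")
    parser.add_argument("--a", type=int, default=5)  # assumed default, not from the repo
    parser.add_argument("--b", type=int, default=6)  # assumed default, not from the repo
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    result = args.a * args.b
    print(f"Multiplication: {result}")
    return result


if __name__ == "__main__":
    main()
```

Run on its own as python multiply.py --a 8 --b 7, this sketch prints Multiplication: 56. With mlflow run, the same values would come from the -P options instead.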

Now let's write a basic program (the same one we just ran) that uses MLflow to track parameters, metrics, and even artifacts. An artifact can be any file saved with the run so that it can be viewed later as a result.

  1. Save the following file as MLFlow.py

2. Now run the program

python MLFlow.py

3. Now run the following command in your command line or terminal

mlflow ui

4. Open a browser to view the MLflow UI. By default it is served on port 5000

http://localhost:5000

ML Flow UI - Tracking Projects

5. Click the link showing the run's date; it should show a screen similar to the one below

MLflow UI - Tracking Parameters and Metrics

6. Let's explore further. Scrolling down (screenshot below), we can also see the artifact that we saved as a .txt file. Refer to the code above to see how the artifact was saved. An artifact can be any file: a Python file, a text file, an image, or a saved plot.

MLflow UI - Tracking Artifacts

To summarize, we have explored three methods: logging parameters, logging metrics, and logging artifacts. This is how MLflow addresses the first challenge, tracking parameters.

Soon I'll post an update on environment setup: how to upload a project to Git with all the required files and run it directly from Git without cloning or downloading it (as shown in the first example above).

Suggestions are welcome in the comment section.

Part 2, on deployment, has been published -> Part 2


Rohit Jain

I am a graduate student with 2 years of work experience as a software engineer. https://www.linkedin.com/in/jain-roh