PyMLPipe: A lightweight MLOps Python Package

Indresh Bhattacharyya · Published in Coinmonks · Jul 12, 2022

Creating an ML model and actually productionizing it are two very different tasks.

PyMLPipe

PyMLPipe helps in

  1. Model Monitoring
  2. Model Version control
  3. Data Version Control
  4. Model Parameter tracking
  5. Data Schema Tracking
  6. Model Performance Comparison
  7. One-click API deployment

Installation (via pip):

pip install pymlpipe

Usage of PyMLPipe:

Let’s walk through an example using the Iris dataset with Scikit-learn. This is a classic dataset for a classification problem.

Line 1: we import load_iris

Line 2–5: we convert the Iris dataset to a pandas DataFrame
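The loading step described above can be sketched like this (a minimal reconstruction; the exact variable names in the original code may differ):

```python
from sklearn.datasets import load_iris  # Line 1: import the loader
import pandas as pd

# Lines 2-5: convert the Iris dataset to a pandas DataFrame
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target
```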

Now let’s split the data for training and testing.

Here we split the data into a train set and a test set.
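A self-contained sketch of the split (an 80/20 split and a fixed random seed are my assumptions; the original may use a different ratio):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

# hold out 20% of the rows for evaluation
trainx, testx, trainy, testy = train_test_split(
    df.drop("target", axis=1), df["target"],
    test_size=0.2, random_state=42,
)
```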

Now for the fun part! Let’s import our model classes from Sklearn.

I have also imported the metrics we will use.
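The imports for the four models and the metrics might look like this (XGBoost ships as a separate package, hence the guarded import):

```python
# model classes used later in the post
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# evaluation metrics
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# XGBoost is a separate install: pip install xgboost
try:
    from xgboost import XGBClassifier
except ImportError:
    XGBClassifier = None
```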

Getting started with PyMLPipe:

Importing and initialising pymlpipe:

Line 1: we import the PyMLPipe class for tabular data

Line 2: we create an object of the class

Line 3: we set an experiment name

An experiment can hold multiple runs (I will explain what a run is a little further down)

Line 4: we set a model version

Running the tests

We start running a test with the following block, objectname.run():

with mlp.run():

This creates a unique runid for the test. You can also specify your own runid: mlp.run(runid="sample_run")

The training code is placed inside the with block.
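A sketch of the run skeleton (the experiment name is an illustrative placeholder; the pass statements stand in for training code):

```python
from pymlpipe.tabular import PyMLPipe

mlp = PyMLPipe()
mlp.set_experiment("IrisData")
mlp.set_version(0.1)

# each `with` block is one tracked run, with an auto-generated runid
with mlp.run():
    pass  # training code goes here

# or supply your own runid
with mlp.run(runid="sample_run"):
    pass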

Setting tags: we can attach tags to a run, which helps in identifying a test run and filtering specific runs.

.set_tags(list): sets a list of tags for the run

.set_tag(string): sets a single tag for the run

Logging metrics:

.log_metric(metric_name, metric_value): logs a single metric, e.g. ‘accuracy’ or ‘precision’

.log_metrics({metric_name: metric_value, ...}): logs multiple metrics at once
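As a small illustration of what gets logged, here are metrics computed with scikit-learn on toy labels (the log_* calls are shown as comments because they only make sense inside a run):

```python
from sklearn.metrics import accuracy_score, f1_score

# toy ground truth and predictions, purely for illustration
testy = [0, 1, 2, 1, 0]
preds = [0, 1, 2, 2, 0]

acc = accuracy_score(testy, preds)            # 4 of 5 correct -> 0.8
f1 = f1_score(testy, preds, average="macro")

# inside `with mlp.run():` these would be logged as:
#   mlp.log_metric("accuracy", acc)
#   mlp.log_metrics({"accuracy": acc, "f1": f1})
```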

Logging artifacts:

.log_artifact(artifact_name, pandas DataFrame): saves the DataFrame into the artifact folder for

  1. Data version control
  2. Data schema logging, which we will see in a bit

.log_artifact_with_path(path_of_the_file): saves the file at the given path as an artifact


Registering a model:

.sklearn.register_model(model_name, model_object): registers the Scikit-learn model for deployment and tracking

Full code

In the code above we created 4 models:

  1. Logistic Regression
  2. Decision Tree
  3. Random Forest
  4. XGboost

Let’s start the UI

You can start the UI simply by running the command:

pymlpipeui

or

from pymlpipe.pymlpipeUI import start_ui

start_ui(host='0.0.0.0', port=8085)

In the image below you can see all the test runs in one place.

PyMLPipe UI

We can compare model performance by selecting runs and comparing their metrics.

Here we can see that XGBoost, Random Forest, and Logistic Regression have almost identical results. So let’s check one of them out.

  1. We can see the training details
  2. In the Artifacts tab we can see the registered artifact details
  3. In the Models tab we can see all the parameters the model was trained with
  4. The Data Schema tab shows the schema of the artifacts we registered
  5. Let’s deploy the model

Click on the Deploy button to deploy.

You can see the deployed models in the “Show Deployment” tab.

The deployment URL is your endpoint; you can send a POST request to it to get predictions.

You can click on the deployment URL to get an API screen.

Copy the request_body and click POST.

And you have your prediction.
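The same prediction can be requested from outside the UI with a POST call (the URL path and payload shape here are assumptions; copy the real deployment URL and request_body from the API screen):

```python
import requests

# hypothetical endpoint: copy the real deployment URL from the UI
url = "http://0.0.0.0:8085/predict/..."

# request_body copied from the API screen; one row of Iris features
payload = {"data": [[5.1, 3.5, 1.4, 0.2]]}

try:
    resp = requests.post(url, json=payload)
    print(resp.json())  # the model's prediction
except requests.exceptions.RequestException:
    pass  # the UI server must be running for the request to succeed
```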

Github link: https://github.com/neelindresh/pymlpipe

Contributions are always welcome.

Documentation: https://neelindresh.github.io/pymlpipe.documentation.io/

Hope you enjoyed the post. Leave a like and share.
