PyMLPipe: A lightweight MLOps Python Package

Indresh Bhattacharyya · Published in Coinmonks · Jul 12, 2022

Creating an ML model and actually productionizing it are two very different tasks.

PyMLPipe

PyMLPipe helps in

  1. Model Monitoring
  2. Model Version control
  3. Data Version Control
  4. Model Parameter tracking
  5. Data Schema Tracking
  6. Model Performance Comparison
  7. One-click API deployment

Installation (via pip):

pip install pymlpipe

Usage of PyMLPipe:

Let’s walk through an example using the Iris dataset with Scikit-learn. This is a classic dataset for a classification problem.

Line 1: we import load_iris

Line 2–5: we convert the Iris dataset to a pandas DataFrame
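The loading step described above can be sketched like this (a minimal reconstruction; the exact variable names in the original code may differ):

```python
from sklearn.datasets import load_iris  # Line 1: import the loader
import pandas as pd

# Lines 2-5: convert the Iris dataset to a pandas DataFrame
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target
```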

Now let’s split the data for training and testing.

Here we split the data into a train set and a test set.
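A self-contained sketch of the split (an 80/20 split and a fixed random seed are my assumptions; the original may use a different ratio):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

# hold out 20% of the rows for evaluation
trainx, testx, trainy, testy = train_test_split(
    df.drop("target", axis=1), df["target"],
    test_size=0.2, random_state=42,
)
```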

Now for the fun part! Let’s import our model classes from Sklearn.

I have also imported the metrics we will use.
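The imports for the four models and the metrics might look like this (XGBoost ships as a separate package, hence the guarded import):

```python
# model classes used later in the post
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# evaluation metrics
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# XGBoost is a separate install: pip install xgboost
try:
    from xgboost import XGBClassifier
except ImportError:
    XGBClassifier = None
```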

Getting started with PyMLPipe:

Importing and initialising pymlpipe:

Line 1: we import the PyMLPipe class for tabular data

Line 2: we create an object of the class

Line 3: we set an experiment name

An experiment can hold multiple runs (I will explain what a run is a little further down)

Line 4: we set a model version

Running the tests

We start running a test with the following block, objectname.run():

with mlp.run():

This creates a unique runid for the test. You can also specify your own runid: mlp.run(runid="sample_run")

The training code is placed inside the with block.
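A sketch of the run skeleton (the experiment name is an illustrative placeholder; the pass statements stand in for training code):

```python
from pymlpipe.tabular import PyMLPipe

mlp = PyMLPipe()
mlp.set_experiment("IrisData")
mlp.set_version(0.1)

# each `with` block is one tracked run, with an auto-generated runid
with mlp.run():
    pass  # training code goes here

# or supply your own runid
with mlp.run(runid="sample_run"):
    pass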

Setting tags: we can attach tags to a run, which helps in identifying a test run and filtering specific runs.

.set_tags(list): sets a list of tags for the run

.set_tag(string): sets a single tag for the run

Logging metrics:

.log_metric(metric_name, metric_value): logs a single metric, e.g. ‘accuracy’ or ‘precision’

.log_metrics({metric_name: metric_value, ...}): logs multiple metrics at once
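As a small illustration of what gets logged, here are metrics computed with scikit-learn on toy labels (the log_* calls are shown as comments because they only make sense inside a run):

```python
from sklearn.metrics import accuracy_score, f1_score

# toy ground truth and predictions, purely for illustration
testy = [0, 1, 2, 1, 0]
preds = [0, 1, 2, 2, 0]

acc = accuracy_score(testy, preds)            # 4 of 5 correct -> 0.8
f1 = f1_score(testy, preds, average="macro")

# inside `with mlp.run():` these would be logged as:
#   mlp.log_metric("accuracy", acc)
#   mlp.log_metrics({"accuracy": acc, "f1": f1})
```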

Logging artifacts:

.log_artifact(artifact_name, pandas DataFrame): saves the DataFrame into the artifact folder for

  1. Data version control
  2. Data schema logging, which we will see in a bit

.log_artifact_with_path(path_of_the_file): saves the file at the given path as an artifact


Registering a model:

.sklearn.register_model(model_name, model_object): registers the Scikit-learn model for deployment and tracking

Full code

In the code above we created 4 models:

  1. Logistic Regression
  2. Decision Tree
  3. Random Forest
  4. XGboost

Let’s start the UI

You can start the UI simply by running the command:

pymlpipeui

or

from pymlpipe.pymlpipeUI import start_ui

start_ui(host='0.0.0.0', port=8085)

In the image below you can see all the test runs in one place.

PyMLPipe UI

We can compare model performance by selecting runs and comparing their metrics.

Here we can see that XGBoost, Random Forest, and Logistic Regression have almost identical results. So let’s check one of them out.

  1. We can see the training details
  2. In the Artifacts tab we can see the registered artifact details
  3. In the Models tab we can see all the parameters the model was trained with
  4. The Data Schema tab shows the schema of the artifacts we registered
  5. Let’s deploy the model

Click on the Deploy button to deploy.

You can see the deployed models in the “Show Deployment” tab.

The deployment URL is your endpoint; you can send a POST request to it to get predictions.

You can click on the deployment URL to get an API screen.

Copy the request_body and click POST.

And you have your prediction.
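The same prediction can be requested from outside the UI with a POST call (the URL path and payload shape here are assumptions; copy the real deployment URL and request_body from the API screen):

```python
import requests

# hypothetical endpoint: copy the real deployment URL from the UI
url = "http://0.0.0.0:8085/predict/..."

# request_body copied from the API screen; one row of Iris features
payload = {"data": [[5.1, 3.5, 1.4, 0.2]]}

try:
    resp = requests.post(url, json=payload)
    print(resp.json())  # the model's prediction
except requests.exceptions.RequestException:
    pass  # the UI server must be running for the request to succeed
```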

Github link: https://github.com/neelindresh/pymlpipe

Contributions are always welcome.

Documentation: https://neelindresh.github.io/pymlpipe.documentation.io/

Hope you enjoyed the post. Leave a like and share.
