Saving and Loading a Trained Machine Learning Model for Inferencing

Peter Woodman
Mar 26, 2023


Prerequisites

  • Machine Learning Basics
  • Python Basics

What happens after a model is trained?

  • Model training is a crucial stage in modeling. It involves feeding an untrained model with sufficient data so that it learns to recognize the patterns in that data. Depending on the training data and the complexity of the model involved, training may even take hours when complex data like images or videos is involved.
  • After a machine learning / deep learning model is trained successfully, the trained model is used for inferencing. In the usual scenario, the trained model is stored in the Python kernel’s memory as an object, so during inferencing we can use this object reference to access the model and get the desired output for the respective inputs.
  • Though this method looks direct and straightforward, there are some challenges when it comes to industrializing it.

Challenges

  • If I restart the Python kernel, the trained model object will be removed from memory, so do I have to train the model again from scratch?
  • If I want to share this model for inferencing with another person, how can I achieve it without involving the training phase?
  • If I want to industrialize this trained model, how can I set it up on a server without involving the training phase?

Solution

  • To avoid retraining the model every time it is invoked, and to remove the bottlenecks discussed above, we have to save the model in a reusable manner.
  • Once the trained model is saved, we can load it back in the required instance and take it forward for inferencing. Saving and loading a Python object this way is commonly called pickling and unpickling.

“Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. — Python Documentation

  • Saving the model this way (pickling) is a form of serialization, and loading the saved model back into a variable (unpickling) is the corresponding deserialization.
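To see serialization and deserialization in isolation, here is a minimal sketch that pickles a plain dictionary; any picklable Python object hierarchy works the same way (the keys and values here are purely illustrative):

```python
import pickle

# Any Python object hierarchy can be converted to a byte stream (pickling)
config = {"model": "logistic_regression", "C": 1.0, "classes": [0, 1]}

payload = pickle.dumps(config)    # serialize the object to bytes
restored = pickle.loads(payload)  # deserialize the bytes back to an object

assert restored == config         # the round trip preserves the object
```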

Most Commonly Used Methods

  • Pickling using the Pickle module
  • Pickling using the Joblib module

Steps for using Pickle module

1. Import the pickle module

import pickle

2. Train a model

model.fit(xtrain, ytrain)

3. Save the model using pickle.dump.

with open(picklePath, "wb") as file:
    pickle.dump(model, file)
  • Here, picklePath represents the path to store the generated pickle file
  • Once the pickle file is generated it can be shared and used for inferencing

4. Load the model for inferencing using pickle.load

with open(picklePath, "rb") as file:
    model = pickle.load(file)
  • Here, picklePath represents the path where the pickle file is stored.
  • The model variable has reference to the trained model.

5. Use the loaded model for inferencing

model.predict(data)
  • The loaded model behaves just like the original trained model variable, so we can take it forward the same way to get outputs.
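Putting the pickle steps above together, here is an end-to-end sketch. It assumes scikit-learn is available; the LogisticRegression model, the toy dataset, and picklePath are illustrative stand-ins for your own model, data, and path:

```python
import pickle

from sklearn.linear_model import LogisticRegression

# Toy training data: a stand-in for a real dataset
xtrain = [[0.0], [1.0], [2.0], [3.0]]
ytrain = [0, 0, 1, 1]

model = LogisticRegression()
model.fit(xtrain, ytrain)

# Save (pickle) the trained model; the with-block closes the file for us
picklePath = "model.pkl"
with open(picklePath, "wb") as file:
    pickle.dump(model, file)

# Load (unpickle) it back and use it for inferencing
with open(picklePath, "rb") as file:
    loaded = pickle.load(file)

print(loaded.predict([[0.5], [2.5]]))
```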

Steps for using Joblib module

1. Import the joblib module

import joblib

2. Train a model

model.fit(xtrain, ytrain)

3. Save the model using joblib.dump.

joblib.dump(model, picklePath)
  • Here, picklePath represents the path to store the generated pickle file
  • Once the pickle file is generated it can be shared and used for inferencing
  • Additionally, parameters like compression can be used to reduce the size of the generated pickle file
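As a sketch of the compression option, joblib.dump accepts a compress argument (an integer from 0 to 9, higher meaning more compression) that trades save time for a smaller file. The file names and the dummy object below are placeholders:

```python
import os

import joblib

# A deliberately repetitive object so compression has something to squeeze
obj = {"weights": [0.5] * 100_000}

joblib.dump(obj, "model_raw.pkl")                # no compression
joblib.dump(obj, "model_small.pkl", compress=3)  # zlib compression, level 3

# The compressed file is smaller than the uncompressed one
print(os.path.getsize("model_raw.pkl") > os.path.getsize("model_small.pkl"))
```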

4. Load the model for inferencing using joblib.load

model = joblib.load(picklePath)
  • Here, picklePath represents the path where the pickle file is stored.
  • The model variable has reference to the trained model.

5. Use the loaded model for inferencing

model.predict(data)
  • The loaded model behaves just like the original trained model variable, so we can take it forward the same way to get outputs.
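The joblib steps above can be put together the same way. As before, this is a sketch assuming scikit-learn; the DecisionTreeClassifier, the toy data, and picklePath are illustrative stand-ins:

```python
import joblib
from sklearn.tree import DecisionTreeClassifier

# Toy training data: a stand-in for a real dataset
xtrain = [[0.0], [1.0], [2.0], [3.0]]
ytrain = [0, 0, 1, 1]

model = DecisionTreeClassifier().fit(xtrain, ytrain)

# joblib opens and closes the file itself, so we pass the path directly
picklePath = "model.joblib"
joblib.dump(model, picklePath)

loaded = joblib.load(picklePath)
print(loaded.predict([[0.5], [2.5]]))
```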

Check GitHub for Code — Real Use Case

  • For Pickling — Link
  • For Unpickling — Link

Conclusion

  • In this blog, the methodology for saving and loading a model for inferencing was discussed: storing Python objects as a byte stream (pickling) and the inverse operation of restoring them (unpickling).
  • The commonly used modules for this requirement, Pickle and Joblib, were covered, and the Python implementation steps for both modules were listed with sample code.
