Easy deployment of machine learning models on Flask

Sahil Angra
3 min read · Jan 9, 2022


Once you have successfully developed a Machine Learning model, the next hurdle is exposing it and making it accessible to the outside world. There are many ways to do this, but the simplest is to deploy it behind a web server and access it as an API. This tutorial is designed to help you achieve that.

Primary Concepts:

  1. Saving: Once we have a model ready, we save it so it can easily be reused at a later stage.
  2. Loading and Inferencing: We expose an endpoint from Flask which, when called, reads the predictor variables from the body of the request, loads the saved pickle file, predicts the target variable, and finally returns the prediction as a response.

Saving

Once you are done with the development of the model and are happy with the evaluation metrics, you save the model to a file with the help of pickling.

But first, what is pickling? Pickling is a simple process that serializes a Python object into a byte stream, which is then stored in a pickle file. The extension of a pickle file is ".pkl" or ".pickle".

Points to remember while pickling:

  1. A single pickle file typically stores one data object. If you need multiple data objects in a single file, consider wrapping them in a dictionary or a list. Ex: object_to_pickle = [object_1, object_2, object_3, ...]
  2. Pickling is a common process and it does not affect a Machine Learning model, its learnings, or its weights in any way.

Code:

To save an object in pickle format:

import pickle
data_object_to_pickle = [1, 2, 3, 4, 5]  # Data to save
file_handler = open("name_of_file.pkl", "wb")  # Create and open the file in binary write mode
pickle.dump(data_object_to_pickle, file_handler)  # Serialize the object into the file
file_handler.close()  # Close file

Entire Code:
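The full snippet is embedded as a gist in the original post; a minimal sketch of what it does, assuming scikit-learn and placeholder training data, could look like this:

import pickle
from sklearn.linear_model import LinearRegression

# Hypothetical training data: four predictor variables per sample
X_train = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
y_train = [10, 20, 30]

# Train a simple Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Save the trained model to a pickle file
with open("my_model.pkl", "wb") as file_handler:
    pickle.dump(model, file_handler)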

Once you run the above snippet, it will generate a pickle file named "my_model.pkl" that contains a trained Linear Regression model.

Loading and Inferencing

Once we have the required pickle file, we will set up an endpoint using Flask that expects a POST request with the data to be predicted in the body of the request as JSON. The endpoint receives the data, loads the model from the pickle file, runs the prediction, and returns the predictions made.

Code:

To load a pickle file:

import pickle

file_handler = open("my_model.pkl", "rb")  # Open the pickle file in binary read mode
pickle_data = pickle.load(file_handler)  # Deserialize the stored object
file_handler.close()  # Close file

Entire Code:
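The full Flask app is also embedded as a gist in the original post; a minimal sketch, assuming the model was saved as "my_model.pkl" and the endpoint is named /predict (both assumptions made here for illustration), could look like this:

import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the trained model once, when the server starts
with open("my_model.pkl", "rb") as file_handler:
    model = pickle.load(file_handler)

@app.route("/predict", methods=["POST"])
def predict():
    # Read the predictor variables from the JSON body of the request
    data = request.get_json()["data"]
    # scikit-learn expects a 2D array: one row per sample
    prediction = model.predict([data])
    # Return the prediction as a JSON response
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run()  # listens on http://127.0.0.1:5000/ by default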

Once you run the Flask file, it listens on localhost:5000 by default:

Running on http://127.0.0.1:5000/

The request JSON containing the data to be predicted would look something like this:

{
  "data": [1, 2, 3, 4]
}
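For example, assuming the /predict endpoint from the sketch above, you could send the request with the requests library (a hypothetical client snippet, not part of the original post):

import requests

# Adjust the URL and payload to match your own endpoint
response = requests.post(
    "http://127.0.0.1:5000/predict",
    json={"data": [1, 2, 3, 4]},
)
print(response.json())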

Once you hit the URL with the required data, the prediction is returned as a response.

Flask Response

Points to remember while Inferencing:

  1. The inference code can be further modified to accept multiple data points at once, as sketched below.
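One possible way to do that, building on the /predict sketch above (an assumption for illustration, not the original author's code), is to accept a list of rows and predict them in a single call:

@app.route("/predict_batch", methods=["POST"])
def predict_batch():
    # Expect {"data": [[...], [...], ...]}: one inner list per sample
    rows = request.get_json()["data"]
    predictions = model.predict(rows)
    return jsonify({"predictions": predictions.tolist()})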

Hope this tutorial was helpful. If it helped you, please leave a like, and don't hesitate to drop a comment.
