Use TorchServe and Flask to Deploy your Model as a Web App

Develop a simple Web Application for your PyTorch Model

Datamapu
7 min read · Dec 3, 2022
Photo by Brian McGowan on Unsplash

Introduction

This is the fourth post in a series around serving your PyTorch model with TorchServe. In the previous posts we discussed the general workflow of how to serve a model with TorchServe.

Next, we looked into the BaseHandler class provided by TorchServe. We can use this class to provide all the information about preprocessing, inference, and postprocessing.

Then we discussed how we can customize the BaseHandler class and applied the previously discussed concepts to predict digits using the MNIST dataset and a simple CNN.

In this post we will discuss how to develop a web application in Python that applies our model. There are different ways to achieve this; here we will use Flask. We will containerize the workflow with Docker. You can find the entire code and example images on GitHub.

Introduction to Flask

Flask is a web framework written in Python. A web framework is a collection of libraries and modules that makes developing web applications easier without having to worry about low-level details. Flask is more specifically a microframework, meaning it doesn’t require other tools or libraries. Flask is a WSGI (pronounced “whiskey”) framework. WSGI stands for Web Server Gateway Interface, a standard way for web servers to pass requests to Python web applications or frameworks.
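To make the WSGI idea concrete, here is a minimal sketch of a bare WSGI application using only the standard library; this is the calling convention Flask builds on top of (the port and response text are arbitrary choices for the example):

```python
from wsgiref.simple_server import make_server

# A bare WSGI application: a callable that receives the request
# environ dict and a start_response function, and returns an
# iterable of bytes for the response body.
def simple_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>Hello, World!</p>"]

if __name__ == "__main__":
    # Serve a single request on localhost:8000, then exit.
    with make_server("127.0.0.1", 8000, simple_app) as server:
        server.handle_request()
```

Flask’s route() decorator and return-value handling are conveniences layered on exactly this interface.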

Before implementing our little app, let’s get started with the minimal example from the Flask documentation.

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"

The explanation below is slightly shortened from the documentation page.

  1. Import the Flask class. An instance of this class will be our WSGI application.
  2. Create an instance of this class. The first argument is the name of the application’s module or package. __name__ is a convenient shortcut that is appropriate for most cases. It tells Flask where to look for resources such as templates and static files.
  3. Use the route() decorator to tell Flask what URL should trigger the function.
  4. The function returns the text we want to display in the user’s browser. (The default content type is HTML, so HTML in the string will be rendered by the browser.)

Save the script as hello.py. To run the application, use:

$ flask --app hello run
* Serving Flask app 'hello'
* Running on http://127.0.0.1:5000 (Press CTRL+C to quit)

The “--app” option tells Flask where the app is. You can then go to “localhost:5000” to see the app’s “Hello, World!” output.

MNIST Application

Overview

Our MNIST app should have the following functionality: a form to upload an image and a button that is pressed to make the prediction. We will keep the layout very minimalistic and concentrate on the functionality. The HTML form is stored in a file called “index.html”. This file needs to be stored in a subfolder called “templates”. The subfolder needs to have exactly this name, otherwise Flask won’t find it. The app itself is stored in a file called “app.py”. That is, the main folder contains a subfolder called “app” which holds everything belonging to our app:

Structure of the subfolder of the web app
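Based on this description, the layout looks roughly like this (refer to the GitHub repo for the exact structure):

```text
app/
├── app.py
├── Dockerfile
├── Pipfile
├── Pipfile.lock
├── static/          # uploaded images are saved here
└── templates/
    └── index.html
docker-compose.yaml
```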

The HTML File

The HTML file is straightforward. After the title, the <head> contains a stylesheet link that loads Bootstrap’s CSS. Next follows a short JavaScript function that is triggered when an image is uploaded. It creates an object URL representing the image, which lets us display it easily. For more details, Espen Hovlandsdal provides a nice explanation on schnipsed.com.

The <body> contains the actual form with the image and a button that triggers the prediction. The form has “method=post”, so we can access the uploaded image via “request” in the Python script. The prediction is then calculated by the model and displayed; it is inserted into the page using Jinja2 template syntax ({{ prediction_text }}).

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>My Machine Learning Model</title>

  <!-- CSS using Bootstrap -->
  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3" crossorigin="anonymous">
  <script>
    var loadFile = function(event) {
      var image = document.getElementById('output');
      image.src = URL.createObjectURL(event.target.files[0]);
    };
  </script>
</head>

<body>
  <br>
  <br>
  <form method="post" action="{{ url_for('predict') }}" enctype="multipart/form-data">
    <p>
      <input type="file" accept="image/png" name="file" id="file" onchange="loadFile(event)" style="display: none;">
    </p>
    <p>
      <label for="file" style="cursor: pointer;">Upload Image</label>
    </p>
    <img id="output" width="200" />
    <br>
    <br>
    <button type="submit" class="btn btn-primary btn-block btn-large">Predict Value!</button>
  </form>
  <br>
  <br>
  <b>{{ prediction_text }}</b>
</body>
</html>

The Flask Script

Let’s now have a look at the Flask script. After instantiating our app, we set the upload path to the “static” folder inside the current directory. The script consists of two functions, “home” and “predict”. Both have a “route()” decorator, and the “predict” function is routed to the “/predict” subpage. The “home” function simply returns the “index.html” page, while the “predict” function receives the input from the HTML form and makes the prediction.

Let’s have a closer look at the “predict” function. The prediction is only computed and displayed if “request.method == 'POST'”, i.e. when the “Predict Value!” button in the HTML form is pressed. After checking that the filename is not empty, the image path is generated, which we need to request the prediction. The filename is sanitized with the “secure_filename” function from “werkzeug.utils”, which reduces a file path to a flat filename. This is done for security reasons; for more details see Miguel Grinberg’s blog about file uploads. The requested prediction is then returned and rendered on the website. The request URL in this case points to a Docker container created from the official TorchServe image, which we will look at in the next section.
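To illustrate the flattening, here is a rough pure-Python sketch of what such sanitization does; this is not Werkzeug’s actual implementation, which handles more edge cases and can produce slightly different output:

```python
import os
import re

def flatten_filename(filename):
    """Rough sketch of filename sanitization, not Werkzeug's actual code."""
    # Drop any directory components, keeping only the base name.
    filename = os.path.basename(filename.replace("\\", "/"))
    # Replace every character outside a safe set with "_".
    filename = re.sub(r"[^A-Za-z0-9_.-]", "_", filename)
    # Strip leading dots so the result cannot look like a hidden or relative path.
    return filename.lstrip(".")

flatten_filename("../../etc/passwd")   # -> "passwd"
flatten_filename("my digit (7).png")   # -> "my_digit__7_.png"
```

Without this step, a crafted filename like “../../etc/passwd” could escape the upload folder when joined into a path.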

import os
import requests
from flask import Flask, request, render_template, redirect
from werkzeug.utils import secure_filename

app = Flask(__name__)

app.config['IMAGE_UPLOADS'] = os.path.join(os.getcwd(), 'static')

@app.route('/', methods=['POST', 'GET'])
def home():
    return render_template('index.html')

@app.route('/predict', methods=['POST', 'GET'])
def predict():
    """Grabs the uploaded image and uses it to request a prediction"""
    if request.method == 'POST':
        image = request.files["file"]
        if image.filename == '':
            print("Filename is invalid")
            return redirect(request.url)

        # sanitize the filename and save the image to the upload folder
        filename = secure_filename(image.filename)
        img_path = os.path.join(app.config['IMAGE_UPLOADS'], filename)
        image.save(img_path)

        # request the prediction from the torchserve container
        with open(img_path, 'rb') as f:
            res = requests.post("http://torchserve-mar:8080/predictions/mnist", files={'data': f})
        prediction = res.json()
        return render_template('index.html', prediction_text=f'Predicted Number: {prediction}')
    return render_template('index.html')

if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=9696)
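The call in “predict” corresponds to TorchServe’s inference REST API (POST /predictions/{model_name}). As a standalone sketch, the same request could be issued with only the standard library instead of requests; the URL and the “data” field name are the ones used in app.py, and the multipart builder below is a deliberately minimal assumption, not a full implementation of the format:

```python
import urllib.request
import uuid

def build_multipart(field_name, filename, payload):
    """Build a minimal multipart/form-data body for a single file field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"

def predict_digit(img_path, url="http://torchserve-mar:8080/predictions/mnist"):
    """POST an image file to the TorchServe predictions endpoint."""
    with open(img_path, "rb") as f:
        payload = f.read()
    body, content_type = build_multipart("data", img_path, payload)
    req = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as res:
        return res.read().decode()
```

The requests library does exactly this multipart encoding for us, which is why the script above is so much shorter.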

Containerize the App with Docker

We will start the app using docker-compose, which defines two services: one for TorchServe and one for the app. The docker-compose.yaml file looks as follows:

version: '3.7'

services:
  torchserve-mar:   # container name
    image: torchserve-mar:v1
    ports:
      - "8080:8080"
      - "8081:8081"
  mnistapp:
    build: app/
    ports:
      - "9696:9696"

The two services are called “torchserve-mar” and “mnistapp”. The “torchserve-mar” service uses a Docker image that builds on the official TorchServe image, additionally includes the “model-store” folder, where the “.mar” file is stored, and then starts TorchServe. How to create this “.mar” file was discussed in a previous post. The Dockerfile looks as follows:

FROM pytorch/torchserve:latest

COPY ["./model-store", "./model-store"]

CMD ["torchserve", "--start", "--model-store", "model-store", "--models", "mnist=mnist.mar"]

The Docker image can be built using the command:

docker build -t torchserve-mar:v1 .

The second service builds its Docker image directly via docker-compose. The associated Dockerfile is stored in the “app” subfolder and has the following structure:

# baseimage
FROM python:3.9

# install pipenv
RUN pip install pipenv

# creates directory "app" and moves there
WORKDIR /app

# copy Pipfile and Pipfile.lock to the current directory
COPY ["Pipfile", "Pipfile.lock", "./"]

# install the dependencies system-wide instead of in a virtual environment
RUN pipenv install --system --deploy

# copy the python script we need
COPY . ./

# expose the port
EXPOSE 9696

# define entrypoint
ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9696", "app:app"]

It copies the Pipenv files, installs the required packages, and then starts the app with gunicorn.
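The Pipfile itself is not shown in the post; a minimal version consistent with this setup could look like the following (the Python version matches the base image, and the package list is an assumption based on the imports in app.py and the gunicorn entrypoint):

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
flask = "*"
requests = "*"
gunicorn = "*"

[requires]
python_version = "3.9"
```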

For the exact folder structure, please refer to the GitHub repo. To run the app, move to the folder where the docker-compose.yaml file is stored and use:

docker-compose up

Then in your browser navigate to localhost:9696 to see and use the web app.

Screenshots of the app running on localhost:9696

References

Find more Data Science and Machine Learning posts here:
