Making an API out of a Hugging Face model — the code: part 2
Summary
In the last post we showed you how to match sections of a CV to a list of skill names, without any API around it. Here we will add the code that turns that script into an API and serve it on Cloud Run.
What we will cover in this post:
- Preparing requirements.txt
- Rewriting the main code
- Preparing Dockerfile and a helper script
- Deploying to Cloud Run
Like in the previous post, the code you see here can be obtained from this repository.
The files we will prepare are:
- requirements.txt: the file that specifies what Python modules should be installed in the container before you start
- main.py: the main code, which is recommend_without_cloudrun.py behind an API
- Dockerfile: the build script for the container
- helper_script.sh: some scripts that need to be executed within the Docker container after the build
This post is written as if you are starting from recommend_without_cloudrun.py and transforming it into main.py. If you prefer not to code along, you can simply follow along and focus on understanding the new pieces of code that were added.
Who is writing this series?
This series is a joint effort by both Datamarinier, a data strategy company, and huapii, a developer and evangelist of a skills and performance management tool. By combining Datamarinier’s customized data solutions with huapii’s emphasis on unlocking human potential, this article delves into the intersection of technology and talent.
Prerequisites
In order to follow along with this series, you will need:
- Access to Google Cloud Platform (GCP)
- An active billing account. Don’t worry, Cloud Run is pretty cheap. Just testing it should cost less than $1 and definitely will not break the bank.
- Intermediate to advanced Python programming skills
- Basic bash scripting skills
Prepare requirements.txt
When creating a containerized app, requirements.txt is a handy way to specify which Python modules we need to execute our code. It is simply a list of module names that will be installed in the container as it is being prepared. The best practice is to pin version numbers so you know all the libraries will play nicely with each other.
requirements.txt
Flask==2.3.2
flask_restx==1.2.0
gunicorn==21.2.0
pandas==2.1.3
pypdf==3.17.1
sentence-transformers==2.2.2
werkzeug==2.3.3
torch==2.1.1
torchvision==0.16.1
Above, you can see that the list includes modules that were not used in recommend_without_cloudrun.py. These are all modules needed to turn our main code into a Flask API, except for torchvision, which is only there because torch needs it to run.
Rewrite the main code
Adding the modules you specified in requirements.txt
Now that you have requirements.txt, don’t forget to import the corresponding modules in your main script as well. The first few rows of your main code should now look like this.
import pandas as pd
import pypdf
import pickle
import re
from sentence_transformers import SentenceTransformer, util
import torch
from flask import Flask, request, json, Response
from flask_restx import Api, Resource, fields, abort
import werkzeug
import numpy as np
Adding the Flask definitions
To take the code from recommend_without_cloudrun.py and transform it into an API, you first need to add the following to the beginning of the script.
# the Flask app name is 'app'
app = Flask(__name__)
app.json.ensure_ascii = False

# API headers
api = Api(app, version='1.0', title='Skills from CV API',
          description='API to parse CVs and return skill recommendations')

# POST argument parser
file_upload_parser = api.parser()
file_upload_parser.add_argument('file', location='files',
                                type=werkzeug.datastructures.FileStorage,
                                required=True, help='PDF file to be parsed')
The first block creates a Flask application object called app within your code. This will always be the first step of using Flask to create web apps. The second row is not mandatory, but it allows the API to return non-ASCII characters like ö.

The second block uses the library Flask_RESTX to specify that the Flask object is going to be a REST API, and it also adds basic documentation. We won’t go into the details of what a REST API is here; just remember it as a very popular architectural style for APIs. After the second block is executed, the Flask application “app” is wrapped by the REST API object “api”.

The third block adds a mandatory argument “file” to the API and specifies that it is a file object in the request. In other words, if a user sends a file to the endpoint (more on endpoints in a little while), you can refer to it in your code with the name file.
Rewriting the main function
Now let’s rewrite the main function in recommend_without_cloudrun.py. When you use Flask_RESTX, you no longer define a main script; instead, you define a class marked with a specific endpoint (remember, an endpoint is a special URL where the API receives input). Within the class, you code tasks for the different HTTP request methods such as GET and POST. If you don’t know what HTTP request methods are, just remember them as protocols that specify tasks to be performed on a particular endpoint. In our case we only need the task for POST, since it is the only method that can receive files as input.
Our rewritten main function looks like this now.
@api.route('/skills_from_cv')
class SkillsFromCV(Resource):
    @api.expect(file_upload_parser)
    def post(self, top_k=5, write_parse_results=False):
        '''
        Get a set of suggestions for skills from a CV
        '''
        args = file_upload_parser.parse_args()

        # get the CV file from the arguments and save it as file.pdf
        input_file = args['file']
        input_file.save('file.pdf')

        # get the list of skill names and the corresponding binary;
        # the Cloud Run app runs in /usr/src/app, and in the Dockerfile
        # we copied the embeddings directly under there
        master_skills_emb_binary = r'./master_emb_list.pkl'
        master_skills_list = r'./master_skills_list.txt'
        with open(master_skills_emb_binary, 'rb') as f:
            master_phrase_embs = pickle.load(f)
        with open(master_skills_list, 'r') as f:
            lines = f.readlines()
        master_phrase_list = []
        for l in lines:
            master_phrase_list.append(l.replace("\n", ""))

        # read the pdf in as text
        file_text = read_pdf_text('file.pdf')

        # get recommendations
        cv_snippets = cut_and_clean(file_text)
        skill_recommendation = match_snippets(cv_snippets,
                                              master_phrase_embs,
                                              master_phrase_list,
                                              top_k=top_k)

        # throw away recommendations with scores under 0.5,
        # sort recommendations by score, drop duplicate skill suggestions
        skill_recommendation = skill_recommendation[skill_recommendation['Score'] >= 0.5]
        skill_recommendation = skill_recommendation.sort_values('Score', ascending=False)
        skill_recommendation = skill_recommendation.drop_duplicates(subset='Phrase').reset_index(drop=True)
        skill_recommendation = skill_recommendation.rename(columns={'Phrase': 'Skill'})
        skill_recommendation = skill_recommendation.replace({np.nan: None})

        response = {'recommendations': skill_recommendation.to_dict(orient='records')}
        return response
api.route('/skills_from_cv') marks our endpoint. So whatever URL Cloud Run gives our API, the place to send CVs to and receive recommendations from will look like this:
https://url-cloud-run-gives-you/skills_from_cv
The api.expect(file_upload_parser) decorator specifies that POST requests sent to this endpoint must contain the file argument defined in the argument parser earlier.
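To make this concrete, here is a minimal client-side sketch using the requests library. The URL is a placeholder for whatever Cloud Run assigns, and my_cv.pdf is a hypothetical test file:

import requests

# placeholder URL; substitute the one Cloud Run gives you after deployment
url = 'https://url-cloud-run-gives-you/skills_from_cv'

# send the PDF as the 'file' argument defined in file_upload_parser
with open('my_cv.pdf', 'rb') as f:
    response = requests.post(url, files={'file': f})

# e.g. {'recommendations': [{'Skill': 'Python', 'Score': 0.83}, ...]}
print(response.json())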
Lastly, there are some differences in the main code between recommend_without_cloudrun.py and this script. In recommend_without_cloudrun.py we had three more arguments: input_file, master_skills_emb_binary, and master_skills_list. Let’s talk about where they went.

input_file is parsed at the beginning of the POST function, as shown below. In this API version, we parse the arguments within the main script. The input file is saved to a PDF file and read again using pypdf later, but this is only because we couldn’t find a way to parse the contents of the PDF file directly from the argument.
args = file_upload_parser.parse_args()
# get the CV file from the arguments and save it as file.pdf
input_file = args['file']
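As an aside, pypdf’s PdfReader can also read from a file-like object, so a sketch like the one below might let you skip the intermediate file.pdf. We have not swapped this into the deployed code, and read_pdf_text would need adjusting accordingly:

# a possible alternative (sketch only): parse the upload in memory
# instead of saving it to file.pdf first;
# input_file is the FileStorage object parsed above
reader = pypdf.PdfReader(input_file.stream)
file_text = "\n".join(page.extract_text() or "" for page in reader.pages)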
As for master_skills_emb_binary and master_skills_list, we no longer need to specify the location of these files, because we decide exactly where they will be in the container (more on that in the Dockerfile section). Unless you make changes to the Dockerfile, they will be placed in the same directory as main.py.
Adding documentation about the response
This part is not mandatory to run the API, but your devs will most likely want more information about what is going to be returned. So let’s add a little more documentation.
In the first bit of your code, after you define the object “api”, add this:
# response field documentation
recommendation_fields = api.model('Recommendations', {
    'Skill': fields.String(description="The skill name."),
    'Score': fields.Float(description="The match score (cosine similarity)."),
})

response_fields = api.model('Response', {
    'recommendations': fields.List(fields.Nested(recommendation_fields),
                                   description="Skill names and match scores for the recommended skills. "
                                               "The recommendations are in descending order by score.")
})
And also, add the extra decorator api.marshal_with to the beginning of your main class.
# /skills_from_cv is the endpoint where the POST request is submitted
@api.route('/skills_from_cv')
class SkillsFromCV(Resource):
    @api.marshal_with(response_fields, as_list=True)
    @api.expect(file_upload_parser)
    def post(self, top_k=5):
        # rest of script
This will add more documentation to your interactive API interface after deployment. Please note that this is a very simplified version; in a production version you will need to add more documentation, like what error codes are returned and what they mean. For example, you can check that the file extension ends in “.pdf” and, if not, return a “400: Invalid Request”. Flask_RESTX allows you to document error codes in a way similar to the response fields above.
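Here is a minimal sketch of such a check; the extension test and the message are illustrative additions, not code from the repository:

# a sketch of documenting and returning an error code; the extension
# check and the message are illustrative, not repository code
@api.route('/skills_from_cv')
class SkillsFromCV(Resource):
    @api.response(400, 'Invalid Request')
    @api.marshal_with(response_fields, as_list=True)
    @api.expect(file_upload_parser)
    def post(self, top_k=5):
        args = file_upload_parser.parse_args()
        input_file = args['file']
        if not input_file.filename.lower().endswith('.pdf'):
            abort(400, 'Invalid Request: only .pdf files are accepted')
        # ... rest of the function as before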
Prepare Dockerfile and helper script
The Dockerfile holds the instructions for building the container that runs our API. It gathers up the code and files that are needed to run the API, installs dependencies and libraries, and sets up the configuration for Cloud Run.
Deploying a Flask API on Google Cloud Run is an excellent choice due to its serverless architecture, which eliminates the need for managing servers and enables automatic scaling to handle varying traffic loads efficiently. This approach ensures a cost-effective solution as it follows a pay-as-you-go model, where you are billed only for the resources your application consumes. Furthermore, not only does Google manage part of the security, GCP also offers a solid framework for securing your application (more on this later). Cloud Run’s seamless integration with other Google Cloud services streamlines the development and deployment process, making it a robust and convenient platform for deploying modern web applications.
Our build instructions are split into two files, Dockerfile and helper_script.sh.
Dockerfile
Our Dockerfile looks like this:
#FROM python:3.10
FROM python:3.10-slim-bullseye
# make sure every file you need is inside the working dir of container
WORKDIR /usr/src/app
COPY main.py /usr/src/app/main.py
COPY model /usr/src/app/model
COPY master_emb_list.pkl /usr/src/app/master_emb_list.pkl
COPY master_skills_list.txt /usr/src/app/master_skills_list.txt
RUN apt-get -qqy update && apt-get install -qqy
# Install dependencies. Here ./ refers to WORKDIR, which is /usr/src/app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# get start up file in place
COPY helper_script.sh /usr/src/app/helper_script.sh
RUN chmod +x helper_script.sh
# run start up file
CMD ["/usr/src/app/helper_script.sh"]
Like any other Dockerfile you come across, it starts building the container environment from an existing, official Docker image, which you can think of as a blueprint that organizations and professionals create and share publicly as a starting point of development. We use the slim Docker image for Python 3.10, python:3.10-slim-bullseye.
After specifying the image we want to start from, we specify our WORKDIR, which is where the code will be executed in our container. Then we copy our main script, model, skill name list, and the skill name embeddings into our WORKDIR.
Then we start installing the necessary libraries for running the code using requirements.txt.
The helper script
The helper script, helper_script.sh, includes commands that should be run within the container after we set it up and initialize it. This simple script only contains the command to serve our code as an app on Cloud Run, but you can add other instructions to it, like mounting a Cloud Storage bucket that contains a resource you need.
#!/usr/bin/env bash

# for debugging:
# exit on the first error and propagate the last error that occurred in a pipeline
set -eo pipefail

# gunicorn is the server that runs the Flask app.
# Leave $PORT as an environment variable and never hardcode it; Cloud Run sets it for us.
# In main:app, "main" is the name of the Python file that contains the Flask app (main.py)
# and "app" is the Flask app name specified in main.py.
# exec replaces the shell with gunicorn, so the script ends here.
exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app
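If you want to sanity-check the app before building the container, one option is to add a standard development-server entry point to the bottom of main.py. This is a sketch for local testing only; the deployed container always starts through gunicorn as shown above:

# local testing only (sketch): run "python main.py" and send requests
# to http://localhost:8080/skills_from_cv; Cloud Run uses gunicorn instead
if __name__ == '__main__':
    import os
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))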
What is next — deployment
Now that we have the code ready, it’s time to deploy the service to Google Cloud! We will walk you through the commands and setups you need to move your API to Cloud Run. See you in the next post!