Deploy highly scalable Machine Learning model | Cloud Run

Vaibhav Malpani
Vaibhav Malpani’s Blog
3 min read · Jun 1, 2020

Machine Learning applications are growing rapidly, and so is the number of people accessing them. To handle such dynamic load, you need servers that are secure, scalable, and reliable. Cloud Run provides this and much more.

Objective:

To deploy your Machine Learning/Deep Learning model with the following capabilities:

  • Automatic scaling up and down
  • Highly reliable and redundant
  • Out-of-the-box stable HTTPS endpoint
  • Pay per use
  • Custom domain

Pre-requisites:

  • Basic understanding of Docker
  • Google Cloud Platform account.
  • A Machine Learning model already built and saved in a file.

Before we begin,

Make sure you have a Machine Learning model that is saved in a file.

Also, you should have code ready that takes the input parameters required by the model, reads the model file, passes the input to the model, and returns the model's output.

As there are hundreds of ways to read a model, feed it input, and get a response from it, here is a template of what your code should look like:

def run_model(input):
    model = read_model('./model_file')
    result = model.get_result(input)
    return result

For this blog,

I will demo this using the Python Flask web framework: since Python is the most widely used language for machine learning, this should be useful to the largest audience. That said, you can write the code above in any language you want.
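To make that concrete, here is a minimal sketch of what app.py could look like with Flask. The /predict route, the JSON shape, and the stand-in run_model are illustrative; swap in your own model-loading and prediction calls.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# In a real app you would load your saved model once at startup, e.g.:
#   import pickle
#   model = pickle.load(open("./model_file", "rb"))
# Here a stand-in function plays the model's role so the sketch runs as-is.
def run_model(features):
    return sum(features)  # placeholder "prediction"

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()             # e.g. {"features": [1.0, 2.0]}
    result = run_model(payload["features"])
    return jsonify({"result": result})
```

Note that no app.run() call is needed here: the Dockerfile later in the post starts the app with gunicorn, which imports it as app:app.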

Once you have written the above code, place it in a folder together with your model file and a requirements.txt.
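For reference, the folder might look like this (file and package names are illustrative; list whatever your code actually imports in requirements.txt):

```
my-model/
├── app.py            # your prediction code
├── model_file        # your saved model
├── requirements.txt  # e.g. Flask, scikit-learn
└── Dockerfile        # added in the next step
```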

Now make a new file named Dockerfile. Below is an example you can use for Python. (The example assumes the file you wrote above is named app.py; if it isn't, change the last line accordingly.)

FROM python:3.7
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . .
# Install production dependencies.
RUN pip install gunicorn
RUN pip install -r requirements.txt
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 app:app

That’s it. That’s all the code we needed to write.
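Before deploying, you can sanity-check the container on your own machine, assuming you have Docker installed (the image tag here is arbitrary):

```shell
# Build the image locally and run it, supplying the PORT variable
# that Cloud Run would normally inject for us.
docker build -t ml-model-local .
docker run -p 8080:8080 -e PORT=8080 ml-model-local
```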

After this, set up a GCP project, then download, install, and initialize the gcloud CLI on your local machine.

After that, just use the below-given commands.

  1. gcloud builds submit: This builds a container image from your code and model using Cloud Build and pushes it to Container Registry.
gcloud builds submit --tag gcr.io/PROJECT-ID/helloworld

2. gcloud run deploy: This deploys the image from Container Registry to Cloud Run, where it runs in a serverless manner.

gcloud run deploy --image gcr.io/PROJECT-ID/helloworld --platform managed

After this go to https://console.cloud.google.com/run

You can see that your image is deployed. Click on it for more details. At the top you will find the URL where your model is serving. Try sending a request to this URL and check that you get the expected output; if not, check the logs shown just below the URL.
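For example, a request could look like this (the URL is a made-up placeholder; replace it with the one shown on your service's page, and match the route and JSON shape to whatever your code expects):

```shell
# POST one prediction request to the deployed Cloud Run service.
curl -X POST https://your-service-abc123-uc.a.run.app/predict \
  -H "Content-Type: application/json" \
  -d '{"features": [1.0, 2.0, 3.0]}'
```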

Well, that's it. Your model is now running on fully managed, highly scalable servers.

If you liked this post, please clap for it; follow me if you want to read more such posts!

Twitter: https://twitter.com/IVaibhavMalpani
LinkedIn: https://www.linkedin.com/in/ivaibhavmalpani/


Google Developer Expert for Google Cloud. Python Developer. Cloud Evangelist.