Migrating to Google Cloud Run Serverless Container Platform

Ibakshay
6 min read · Oct 7, 2019


In this article, I will share our journey of how we containerized and migrated our existing web application to Google Cloud Run, a serverless platform for managing stateless containers. Below, I will briefly describe Cloud Run, explain how it differs from other FaaS solutions, and discuss why we are moving to this new serverless platform.

The application is CLA Assistant, an open-source auditing tool proudly provided and maintained by SAP.

So, What is Cloud Run?

Google Cloud Run is a new serverless container platform introduced by GCP (still in beta). It abstracts away all the infrastructure and fully manages the auto-scaling and load balancing of stateless applications. In simple terms, you pay only for the resources you use. It brings serverless agility to containerized apps.

Cloud Run automatically scales up when there is more traffic to the application and scales down to zero when there is no traffic. Hence, you literally won't be paying even a single penny for the infrastructure when there is no traffic. Isn't that cool? 😊 Cloud Run combines the best of both worlds: containers and serverless architecture.

Cloud Run Overview

Cloud Run is not FaaS

Please don't confuse Cloud Run with Cloud Functions, which is a Functions-as-a-Service (FaaS) solution provided by Google. FaaS solutions like Cloud Functions and AWS Lambda have a fixed concurrency of 1 by design, so a new instance is created for every request. On the contrary, one Cloud Run container instance can handle many requests concurrently (up to a maximum of 80). Depending on your application's traffic, you can tweak the concurrency and set the optimum value.

The following diagram shows how the concurrency setting affects the number of container instances needed to handle incoming concurrent requests:

You can also run Google Cloud Run as a FaaS by setting the concurrency limit to 1. Then a new container instance is created for each request; for example, 50 concurrent requests result in 50 container instances.

You should consider doing this in cases where:

Each request uses most of the available CPU or memory.

Your container image is not designed to handle multiple requests at the same time, for example, if your container relies on global state that two requests cannot share.
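In such cases, the concurrency can be lowered on an existing service with the gcloud CLI. This is a minimal sketch; the service name is illustrative, and at the time of writing (Cloud Run in beta) the commands may need the `beta` component:

```shell
# Limit each container instance to a single request at a time,
# giving FaaS-like behaviour ("my-service" is an illustrative name)
gcloud run services update my-service --concurrency 1

# Restore the default of up to 80 concurrent requests per instance
gcloud run services update my-service --concurrency 80
```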

However, I don't recommend this approach, because it negatively impacts scaling performance: each request triggers a cold start of a container instance, so the application becomes slow, and you end up paying more too, since many container instances are used.

Why Move to Serverless?

In simple words, the term serverless means zero infrastructure maintenance, and you pay only for the resources you use. As the traffic and number of users of our application increased day by day, it became very difficult to maintain and manage its resources. We had to keep monitoring the health of the application to keep it up and running and, at the end of the day, keep our customers happy 😊. Hence, we decided it was high time to move to a serverless architecture so that we could invest our time and effort in innovation and developing new features rather than in maintaining and supporting the application's infrastructure.

“Let’s focus on developing new features and not on Infrastructure”

But Why Cloud Run?

As there are a lot of serverless solutions around, you might be wondering why we chose the Google Cloud Run serverless container platform out of all the other options. Our application is just an Express app under the hood, and moving to a FaaS solution like Cloud Functions or AWS Lambda would have meant rewriting and redesigning the whole codebase, which is quite cumbersome; we didn't want to spend time and effort on that.

On the contrary, to migrate to the Cloud Run serverless container platform, we didn't have to rewrite even a single line of code. All we needed was to include a Dockerfile in order to package and containerize the whole application. A Dockerfile is just a text file that Docker reads from top to bottom; it contains a list of instructions that tell Docker HOW the Docker image should be built.
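For a Node.js/Express app like ours, such a Dockerfile can be as small as the sketch below. This is an assumption of what the file might look like, not our exact setup; the base image version and start command are illustrative:

```dockerfile
# Illustrative Dockerfile for a Node.js/Express app
FROM node:10-alpine

# Working directory inside the container
WORKDIR /usr/src/app

# Copy the dependency manifests first to leverage Docker layer caching
COPY package*.json ./
RUN npm install --production

# Copy the rest of the application source
COPY . .

# Cloud Run sends requests to the port named in $PORT (8080 by default)
ENV PORT 8080
EXPOSE 8080

# Start the Express server
CMD ["npm", "start"]
```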

Note: Google Cloud Run supports only stateless applications.

Hence, it doesn't really matter which language your existing application is written in or which environment it runs in. Include a Dockerfile to build a Docker image of your application, deploy that image to Cloud Run with the concurrency limit and the memory to be allocated to each container instance, and that's it….
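The whole flow above (build the image, then deploy it with memory and concurrency settings) can be sketched with the gcloud CLI. The project ID, service name and region below are illustrative, and since Cloud Run is still in beta the commands may need the `beta` component:

```shell
# Build the container image and push it to Google Container Registry
gcloud builds submit --tag gcr.io/my-project/cla-assistant

# Deploy the image to Cloud Run with 256 MiB of memory per instance
# and up to 80 concurrent requests per instance
gcloud run deploy cla-assistant \
  --image gcr.io/my-project/cla-assistant \
  --region us-central1 \
  --memory 256Mi \
  --concurrency 80 \
  --allow-unauthenticated
```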

Sit back and relax while Cloud Run manages and handles the infrastructure; it behaves like a fancy load balancer. It automatically scales up and down from zero depending on traffic, almost instantaneously, so you don't have to worry about scale configuration from now on...

“With Cloud Run, you can build Applications in your favourite language, with your favourite dependencies and tools, and deploy them in seconds.”

What about Pricing?

As I already mentioned, you pay only for the resources you use, rounded up to the nearest 100 milliseconds. When concurrency is set higher than one request at a time, multiple requests share the allocated CPU and memory. This effectively reduces cost, since fewer instances are needed to handle the same requests.

This is the official pricing table provided for Cloud Run.

The pricing table uses the GB-second unit. One GB-second corresponds to, for example, running a 1 GB instance for 1 second, or a 256 MB instance for 4 seconds.
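As a quick sanity check on that arithmetic, GB-seconds are simply memory in GB multiplied by billed seconds (treating 256 MB as 256/1024 GB):

```shell
# GB-seconds = memory in GB × billed seconds
# A 1 GB instance for 1 second:
awk 'BEGIN { print 1.0 * 1 }'         # → 1
# A 256 MB (0.25 GB) instance for 4 seconds:
awk 'BEGIN { print (256/1024) * 4 }'  # → 1
```

Both examples come out to the same 1 GB-second of billable usage.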

Billable Time

For one container instance, billable time starts when:

The container instance is starting

At least one request is being handled by the container instance

According to the official documentation, you are billed only for the CPU and memory allocated while a request is active on a container instance, rounded up to the nearest 100 milliseconds.

If a container instance receives many requests at the same time, billable time begins with the start of the first request and ends at the end of the last request, as shown in the following diagram:

Let's Summarize

Cloud Run utilises the best of both worlds: containers and the serverless environment.

You don't have to spend time and effort provisioning and managing servers anymore; instead, you can focus only on writing code.

Since Cloud Run just runs a container under the hood, you can migrate a stateless application to it by simply introducing a Dockerfile, regardless of the language or operating-system libraries it uses. You don't have to rewrite even a single line of code.

You pay only for the resources you use and never for over-provisioned resources. You literally won't pay a single penny when there is no traffic to your application.

Cloud Run behaves like a fancy load balancer, automatically scaling up and down from zero depending on traffic, almost instantaneously.

And finally, you can deploy a new Docker image within seconds, with zero downtime when rolling out a new revision.


