Google Cloud Run Deploy and Elasticity Benchmark

Jeanno
Google Cloud - Community
5 min read · Apr 10, 2019

Intro

Google just announced Cloud Run, the new serverless solution to run containerized applications without managing the underlying infrastructure. Cloud Run fully manages the load balancing and auto-scaling of the service.

In this article, I will deploy a containerized application, and test it, focusing on:

  • How fast Cloud Run can deploy the service
  • How quick Cloud Run can scale to handle a large number of concurrent users

Deploy

Python Application

Let’s create and deploy a simple Python application for testing. The detailed instructions can be found here: https://cloud.google.com/run/docs/quickstarts/build-and-deploy

The application consists of 2 files: app.py and Dockerfile. It simply returns “Hello World!”.

helloworld/app.py

import os

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    target = os.environ.get('TARGET', 'World')
    return 'Hello {}!\n'.format(target)

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
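The second file is the Dockerfile. It isn't reproduced here, but based on the Cloud Run Python quickstart it looks roughly like this (a sketch of the idea, not the exact file from the quickstart):

```dockerfile
# Sketch of a companion Dockerfile for the quickstart app; details
# may differ from the actual quickstart. Cloud Run injects the PORT
# environment variable, which app.py already reads.
FROM python:3.7-slim
WORKDIR /app
COPY . .
RUN pip install Flask
CMD ["python", "app.py"]
```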

Build the Container Image

I will utilize Cloud Build to build the image. Run this command in the same directory with the Dockerfile.

gcloud builds submit --tag gcr.io/jeanno-cloud-run-test/helloworld

The build time largely depends on the Dockerfile and the size of the application. For this simple Python application, it took only 24 seconds. So far it’s just a standard Docker image build, nothing fancy.

Deploy Time

At this point, I’ve already built a container image and it’s ready at:

gcr.io/jeanno-cloud-run-test/helloworld

Now, it’s getting interesting. With Cloud Run, I can quickly and easily spin up a service to a ready state in less than 30 seconds. I use the following command to deploy and measure the time needed.

time gcloud beta run deploy hello --image gcr.io/jeanno-cloud-run-test/helloworld --region=us-central1 --allow-unauthenticated

Deploying container to Cloud Run service [hello] in project [jeanno-cloud-run-test] region [us-central1]
✓ Deploying new service... Done.
✓ Creating Revision...
- Routing traffic...
Done.
Service [hello] revision [hello-00001] has been deployed and is serving traffic at https://hello-3o53mu62aa-uc.a.run.app
real 0m21.006s
user 0m0.408s
sys 0m0.082s

After that, I immediately opened the link in the browser. It took ~2 seconds to load the page. From the start of the deployment to a ready, serving service, only ~23 seconds were needed.

P.S.: I could’ve scripted a curl request and timed it for a more accurate measurement, but the difference should be negligible.
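For reference, that more precise measurement could be done with a small timing helper like the sketch below. The helper name is mine, not from the article; against the real service, `fn` would be an HTTP GET (e.g. `urllib.request.urlopen`) on the deployed URL.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Demo with a trivial call; against the deployed service you would
# time something like urllib.request.urlopen(service_url).read().
result, elapsed = timed(sum, range(1000))
print(result)          # 499500
print(elapsed >= 0.0)  # True
```

Polling in a loop with this helper right after `gcloud run deploy` returns would capture the true deploy-to-first-response time.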

Scale Out and Stress Test

Methodology

Now here’s the fun part. With the service ready, it’s time to stress test it. I am going to perform the test using JMeter on an n1-standard-64 (64 vCPUs, 240 GB memory) Compute Engine instance in us-central1.

Since the Compute Engine instance is close to the Cloud Run service, which is also in us-central1, the round-trip time is much shorter, so higher requests per second can be achieved.

I use the following JMeter command to start the test.

jmeter -n -t test-plan.jmx -p jmeter.properties

In the test plan, I have set up a 5-second ramp-up to 5,000 concurrent users. Each user is represented by one thread, and each thread keeps sending HTTP requests one after another. The response is checked in JMeter, as specified in test-plan.jmx, to make sure it has an HTTP status code of 200. The test runs for 120 seconds.

Since Cloud Run scales to zero when there is no active traffic, we can think of this as an idle service hit by a large, unexpected traffic spike that needs to scale out quickly to match the load.

I have also specified extra JMeter properties to report a summary every 10 seconds instead of the default 30 seconds.
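The property controlling that reporting interval is `summariser.interval`, so the jmeter.properties file likely contained something like this (my reconstruction, not the article's actual file):

```
# Print a live summary every 10 seconds instead of the default 30
summariser.interval=10
```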

Result

After JMeter finished running, a jmeter.log file was generated. Let’s focus on these 3 metrics: requests per second (RPS), average request time, and maximum request time. Here’s a graph to visualize the result.

JMeter result

The requests per second ramped up exponentially to more than 60k in 40 seconds, which is very impressive. The avg. request time was slightly less than 1000ms during the ramp-up and eventually went down to less than 100ms. The max. request time was 5000ms to 10000ms during the ramp-up and dropped rapidly to less than 2000ms. The drop of RPS at the end was caused by the ramp-down of JMeter threads, so we should ignore that part.

Fewer than 1% of requests errored (713 responses with an HTTP status code != 200), all during the start of the ramp-up (7.1–17.1s). Each of the other intervals saw at most one error.

Even during ramp up, a great majority of the requests finished successfully in less than 1000ms. Again, I didn’t configure anything to handle the spike. It is all fully managed by Cloud Run to provide such scalability.

Going Beyond

At this point, you might ask, “What is the limit of Cloud Run?”. Well, apparently I haven’t hit the limit yet.

As of writing, the maximum number of container instances in Cloud Run is limited to 1,000, and each container can handle up to 80 concurrent connections, so a Cloud Run service can handle up to 80,000 concurrent connections. Extrapolating linearly, 60k RPS / 5,000 users × 80,000 connections = 960k RPS, so I theoretically expect my service to handle up to 960k RPS.

The achievable RPS is largely determined by the request processing time: if each request takes 1 second to process, the service can handle at most 80k RPS.
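That extrapolation can be written out as a quick sanity check. The numbers come from the article; this is back-of-the-envelope math, not a measured result.

```python
# Observed: ~60k RPS from 5,000 concurrent JMeter users.
observed_rps = 60_000
observed_connections = 5_000

# Documented limits at the time: 1,000 instances x 80 connections each.
max_instances = 1_000
concurrency_per_instance = 80
max_connections = max_instances * concurrency_per_instance  # 80,000

# Linear extrapolation to the connection cap.
projected_rps = observed_rps / observed_connections * max_connections
print(int(projected_rps))  # 960000

# If each request instead took 1 second, throughput caps at one
# request per connection per second.
print(max_connections / 1.0)  # 80000.0
```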

However, the maximum number of container instances can be increased through GCP support, meaning the real limit of Cloud Run really depends on regional capacity.

To carry out an experiment at a larger scale, I will need a distributed load-testing solution, which I don’t have set up for now. Let me know if you want to see this; we might even try to break Cloud Run (my colleagues are also eager to see that).

If you like this article or find it helpful, you can give me a few claps and follow me to get notified when my next article comes out!
