App Engine Flex || Kubernetes Engine — ?

Deploying containerized apps 2 ways on GCP

Customers ask for our help in determining whether App Engine Flex(ible Environment) or Google Kubernetes Engine (GKE) is best-suited to their needs.

There is no universal answer, and our close ties with our customers help us determine the best answer for their needs. This post summarizes one good approach that will help anyone gather evidence for an answer: try both.

In this post, I will use an exemplar as a solution, deploy it to Flex and to GKE, and load-test both solutions. Thanks to the consistency provided by containers, we’ll have high confidence that the different experience with each platform is due to the platform and not our solution.

Let’s get started!

An Exemplar

Something web-y using a NoSQL store? That sounds about right to me. Fortunately, the App Engine Flex documentation includes a sample app that we can use (GitHub here). You can pick your language flavor; I’m going with Golang because I’ve been writing in Python and Java recently. We’ll use Cloud Datastore (and possibly another later on) for persistence.


You can get started for free with Google Cloud Platform (GCP). I’m using a Linux (Debian) machine and will show bash commands here. Everything will work from a Mac or Windows machine but your-mileage-may-vary as you’ll need to do some work to convert the commands.

mkdir -p ${HOME}/Projects/${PROJECT}
cd ${HOME}/Projects/${PROJECT}
gcloud projects create $PROJECT
gcloud alpha billing projects link $PROJECT \
# Enable Datastore
gcloud services enable \
# Enable Kubernetes Engine
gcloud services enable \

App Engine Flex

Let’s create an App Engine Flex application in our project. You may choose the GCP region that’s most convenient for you. With the following command, the Cloud SDK will prompt you to select a region in which App Engine Flex is available:

gcloud app create --project=$PROJECT

If you know your preferred region already, you may specify it here:

gcloud app create --region=$REGION --project=$PROJECT

GCP will then provision the application for you:

WARNING: Creating an App Engine application for a project is irreversible and the region
cannot be changed. More information about regions is at
Creating App Engine application in project [${PROJECT}] and region [${REGION}]....done.
Success! The app is now created. Please use `gcloud app deploy` to deploy your first app.

If you follow the instructions to clone the GitHub repo containing the sample, you should find yourself in a directory containing two files: app.yaml and datastore.go.

NB app.yaml is a configuration file for App Engine. This file is not used by Kubernetes Engine.

I’m a little pernickety and I prefer to create everything cleanly my way:

mkdir -p $HOME/Projects/$PROJECT/go/src/$GITHUB/aeoke
cd $HOME/Projects/$PROJECT/go/src/$GITHUB/aeoke

I’ll call out app.yaml. As mentioned, this provides config guidance to the App Engine service:

runtime: go
env: flex

automatic_scaling:
  min_num_instances: 1

#[START env_variables]
env_variables:
  GCLOUD_DATASET_ID: $PROJECT
#[END env_variables]

NB Replace $PROJECT with your project ID.
NB GCLOUD_DATASET_ID is passed as an environment variable to the Golang runtime and accessed with os.Getenv("GCLOUD_DATASET_ID"). This is a best practice for passing config to containerized apps.

Don’t forget to create datastore.go too and pull the dependencies:

go get ./...

All being well, you should then be able to deploy the app. You’ll need to deploy the app to benefit from the full majesty of it:

gcloud app deploy --project=$PROJECT
Successfully built b4efec18970b
Successfully tagged${PROJECT}/appengine/default.20171010t153004:latest
The push refers to a repository [${PROJECT}/appengine/default.20171010t153004]
bf419b41a797: Preparing
bf419b41a797: Pushed
Updating service [default]...done.
Deployed service [default] to [https://${PROJECT}]
You can stream logs from the command line by running:
$ gcloud app logs tail -s default
To view your application in the web browser run:
$ gcloud app browse

I’ve included some of the deployment details because they show that the deployment pushes a container image to a repository, specifically to ${PROJECT}/appengine… This URL refers to GCP’s hosted container registry, Google Container Registry (GCR). We will reuse the image from this repository when we deploy to Kubernetes Engine.

You may wish to check Cloud Console to monitor the status of the app. Don’t forget to replace ${PROJECT} with your Project ID:${PROJECT}&serviceId=default

You should see something similar to this:

Cloud Console: App Engine “Services”

Once the app is deployed, you may access it by clicking the “default” service from Cloud Console, accessing the service directly via its URL (replace $PROJECT with your Project ID), or with the command ‘gcloud app browse’:

The Exemplar deployed to App Engine Flex

You should explore the Console.

You may be interested to see the Instances that support our app. We explicitly set min_num_instances to “1” in the app.yaml and so, with insignificant load, there’s a single instance supporting our app:${PROJECT}&serviceId=default
Cloud Console: App Engine “Instances”

Refreshing the page several times ensures there’s a goodly amount of data persisted in Cloud Datastore:${PROJECT}&kind=Visit
Cloud Console: Datastore
NB Every page refresh (GET) on our app will add another entity to the Datastore “Visit” Kind. The Golang (queryVisits) function queries and displays the 10 most-recent entries only.

As mentioned previously, the Flex deployment created a (Docker) container and persisted this using Google Container Registry. Let’s look at the Container Registry page of our project:${Project}/US/appengine/?project=${PROJECT}
Cloud Console: Container Registry

I’ve drilled down into the Registry to show more details. The image name is “appengine/default.[DEPLOYMENT-TIME]” and it has been given the “latest” tag. It is possible to more explicitly reference this image by its digest which includes a sha256 hash.

You may also find the image with a Cloud SDK command. We’ll use this image again in the Kubernetes Engine deployment so it may be useful to remember this:

gcloud container images list \${PROJECT}/appengine \

For the curious, Google Container Registry uses Google Cloud Storage (GCS) to store the image layers. You may investigate here:${PROJECT}

Google Kubernetes Engine (GKE)

Let’s start by creating a cluster on which we can deploy the Exemplar app.

For consistency, we’re going to use custom machine-types with 1 vCPU and 1GB of RAM as this is what App Engine Flex is using. We’ll start with 1 (worker) node. GKE manages the master node for us but the master is not used to run our containers. I recommend also using the same region (and preferably the same zone) as App Engine. In this case, I’m using us-east4 and App Engine used zone ‘c’. As with App Engine, I’m going to enable GKE to auto-scale BUT we’re (usefully) required to provide a maximum number of nodes as an upper-bound on the auto-scaler. I’ve chosen 10 nodes here but you may wish to use a lower number.

export ZONE=${REGION}-c
gcloud container clusters create $CLUSTER \
--enable-kubernetes-alpha \
--project=$PROJECT \
--zone=$ZONE \
--machine-type=custom-1-1024 \
--image-type=COS \
--num-nodes=1 \
--enable-autoscaling \
--max-nodes=10 \
NB GKE provides two flavors of auto-scaling. The first is an intrinsic feature of Kubernetes and permits more pods to be created as load on a service increases; the number of nodes forming the cluster remains fixed. The second (specified by --enable-autoscaling) uses Cluster Autoscaler. As the name suggests, it permits the cluster to grow (and to shrink) as demand upon it changes: nodes are added to the cluster to grow it and removed to shrink it. With more nodes, there’s more capacity to run a greater number of pods.

Once the cluster is created, you may observe it from within Cloud Console:${PROJECT}

In order to control the cluster from the command-line, we need to authenticate to it. This is facilitated with a Cloud SDK (gcloud) convenience command:

gcloud container clusters get-credentials $CLUSTER \
--zone=$ZONE \

To check that everything’s working correctly:

kubectl get nodes
NAME                                        STATUS    AGE
gke-cluster-01-default-pool-001b0e59-8dk8 Ready 56s

Optional: You may control the cluster using the Kubernetes Dashboard. GKE deploys the Dashboard with clusters. To access the Dashboard, configure a proxy to the API server and then open the URL in your browser. I use --port=0 to gain an available port at random. In this case, the port chosen was 42545. You should use whichever port is provided to you when you run the command:

kubectl proxy --port=0 & 
Starting to serve on

Once the proxy is running, you can access the API server on the root (“/”) and the UI Dashboard on “/ui”:


There’s a small issue for me presently where the UI does not render correctly so I’m going to provide examples with Cloud Console and from the command-line :-(

Update: the UI issue is being addressed. You should be able to access the UI by explicitly finalizing the URL after redirection with a “/” so:
Kubernetes UI Happiness!

I’ll add some sample screenshots from this UI to the end of this post.

From the App Engine Flex deployment, we have an existing image in GCR that we can reuse. The easiest way to spin this into a service on GKE is to reference the image from a Deployment, expose the Deployment as a Service and then have GCP create an HTTP/S Load-Balancer.

Please review your App Engine Flex deployment to determine the name of the container image that App Engine Flex created for you. You will need to reference the image in the Deployment YAML file:$PROJECT/appengine/default/YYMMDDtHHMMSS:latest

Alternatively you may find the image name with the following command. The version tagged “latest” will be the one we’ll use. So please don’t forget to append “:latest” to the image name when we create the Deployment config:

gcloud container images list \${PROJECT}/appengine

Before we can create the Deployment, we need to create a service account and a key, and assign it permission to access Cloud Datastore. We must then upload the key as a secret to GKE. This way, the Exemplar app’s pods may access the key when they need to access Cloud Datastore.

It sounds complex (and is more complex than it ought to be) but there’s a straightforward pattern that’s documented here for Cloud Pub/Sub. I’ve tweaked this only slightly for Cloud Datastore:

export ROBOT="gke-datastore"
gcloud iam service-accounts create $ROBOT \
--display-name=$ROBOT \
gcloud iam service-accounts keys create ./key.json \
--iam-account=${ROBOT}@${PROJECT} \
gcloud projects add-iam-policy-binding $PROJECT \
--member=serviceAccount:${ROBOT}@${PROJECT} \
kubectl create secret generic datastore-key \

We can now define a Deployment combining the GCR image name, the service account key created in the previous step and the environment variables needed to configure the Exemplar app.

Create a file (I’m using datastore-deployment.yaml). Replace ${IMAGE} with the path to your image ($PROJECT/appengine….) and replace ${PROJECT} with your Project ID.

The only real complexity in this configuration is that it exposes the Secret created in the previous step as a file that can be referenced by the pod through an environment variable (GOOGLE_APPLICATION_CREDENTIALS). This powerful mechanism is called Application Default Credentials:

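apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: datastore
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: datastore
    spec:
      volumes:
      - name: google-cloud-key
        secret:
          secretName: datastore-key
      containers:
      - name: datastore
        image: ${IMAGE}
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        volumeMounts:
        - name: google-cloud-key
          mountPath: /var/secrets/google
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/secrets/google/key.json
        - name: GCLOUD_DATASET_ID
          value: ${PROJECT}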

We can now create the Deployment:

kubectl create --filename=datastore-deployment.yaml

All being well, you should be told:

deployment "datastore" created

Next, let’s add a (Horizontal) Pod Autoscaler to the Deployment to permit it to autoscale the number of pods when a CPU threshold (80%) is reached:

kubectl autoscale deployment/datastore --max=20 --cpu-percent=80
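
To get a feel for what the 80% threshold means, here’s a sketch of the scaling rule that the Kubernetes documentation describes for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). This is a simplification for illustration, not the controller’s code: it ignores the tolerance band and stabilization behavior, and the minimum of 1 replica is my assumption because the command above only sets --max.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the documented HPA ratio rule and clamps the result
// to the configured bounds. CPU values are percentages of the pods' requests.
func desiredReplicas(current int, currentCPU, targetCPU float64, minReplicas, maxReplicas int) int {
	want := int(math.Ceil(float64(current) * currentCPU / targetCPU))
	if want < minReplicas {
		want = minReplicas
	}
	if want > maxReplicas {
		want = maxReplicas
	}
	return want
}

func main() {
	// One pod running at twice the 80% target asks for a second pod.
	fmt.Println(desiredReplicas(1, 160, 80, 1, 20))
	// Ten pods at 200% would want 25 replicas; --max=20 caps that.
	fmt.Println(desiredReplicas(10, 200, 80, 1, 20))
	// Three nearly idle pods shrink back towards the minimum.
	fmt.Println(desiredReplicas(3, 20, 80, 1, 20))
}
```

If the extra pods don’t fit on the existing nodes, they sit pending until Cluster Autoscaler adds a node.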

You should be able to:

kubectl get deployments
datastore 1 1 1 0 2m
kubectl get replicasets
NAME                   DESIRED   CURRENT   READY     AGE
datastore-3517606568 1 1 0 2m
kubectl get pods
NAME                         READY     STATUS    RESTARTS
datastore-3517606568-8ffn2 1/1 Running 0

You may also observe this Deployment using Cloud Console:

Cloud Console: Kubernetes Engine “Workloads”

We now need to add a Service ‘veneer’ to this deployment and expose the result using an Ingress on an HTTP/S Load Balancer. We can do this simply from the command-line:

kubectl expose deployment/datastore \
--type=NodePort \
--port=9999 \

Finally, we can create an Ingress. This will create an HTTP/S Load-Balancer on GCP that points to our service and … all being well… should permit us to access our former Flex-only service as a newly-deployed GKE-service.

Create a file (I’m using datastore-ingress.yaml):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: datastore
spec:
  backend:
    serviceName: datastore
    servicePort: 9999

and then create the Ingress:

kubectl create --filename=datastore-ingress.yaml

Once the Ingress reports an external address, you should be able to access the service using it. In my case (yours will be different) the public IP address is

kubectl get ingress/datastore
NAME        HOSTS     ADDRESS           PORTS     AGE
datastore * 80 6m

You can view the Ingress multiple ways:

Cloud Console: Kubernetes Engine “Load Balancing”

You can also check “Network Services” where you will (probably) see 2 HTTP/S Load-Balancers. One was created by App Engine Flex (customarily called “aef-um”). The second was created by the GKE Ingress (customarily called something like “k8s-um-default….”):${PROJECT}
Cloud Console: Network Services “Load Balancing”

NB the IP:Port defined here matches (as you would expect) the IP address provided by describing the datastore Ingress. Your IP address and other details will be different but your IP address is the one you should use:

We took the image created by the App Engine Flex deployment and reused it in a Deployment to GKE. Once deployed, we exposed the Deployment as an HTTP/S Load-Balancer using GKE’s Ingress.

We now have 2 deployments of the same container and can load-test them to see how each service performs under load.

  • App Engine Flex deployed the container image from Container Registry to Compute Engine VMs auto-scaled by a Managed Instance Group and exposed through an HTTP/S Load-Balancer.
  • Kubernetes Engine deployed the container image from Container Registry to Compute Engine VMs auto-scaled by a Managed Instance Group and exposed through an HTTP/S Load-Balancer.
  • The underlying resources (correctly) are the same for both services.
  • Both services use declarative (intentional) configuration.
  • An important difference between the services is that App Engine Flex biases automation to Google’s control whereas Kubernetes Engine requires more oversight by the customer. Kubernetes Engine is evolving more rapidly and is adding more powerful automation.
  • A subtle difference is that Flex uses containers as a means to an end. Customarily, users of Flex could ignore that containers are being employed because this is done behind the scenes. Kubernetes Engine, as the name suggests, is predicated on containers and is explicitly designed as a tool that facilitates the management of services built from containers. With Flex, a service is always n-containers of one type. With Kubernetes Engine, a service comprises m-pods and the pods may themselves comprise p-containers.


The more astute among you will realize that, as we consider load-testing, I’ve introduced a discrepancy. While App Engine Flex is behind TLS, the Kubernetes Engine app is (currently) not. Let’s fix that!

There are many ways to achieve this goal but this approach is the easiest. We will need to create a certificate, upload this as a Secret to GKE and then revise the Ingress to reference it. I assume you have a domain that you may use. I will use Cloud DNS.

Let’s start by deciding upon a name for the GKE app. I will use “” and I alias this to the IP address of the HTTP/S Load-Balancer created by the GKE Ingress:

Cloud Console: Network Services “Cloud DNS”
export NAME=[YOUR-DNS-NAME] # Mine is
mkdir -p $HOME/Projects/$PROJECT/certs
cd $HOME/Projects/$PROJECT/certs

If you’re using Google Cloud DNS, your DNS changes will be most quickly accessible through Google’s Public DNS and you may query it with:

nslookup ${NAME}
Non-authoritative answer:
Name: ${NAME}
Address: ${IP} // The IP address of the HTTP/S Load-Balancer

Now that we have a DNS name, we can use openssl to generate a self-signed certificate for testing. This is *not* what you should do in production. I recommend Let’s Encrypt or another certificate authority.

openssl req \
-x509 \
-nodes \
-days 365 \
-newkey rsa:2048 \
-keyout ${NAME}.key \
-out ${NAME}.crt \
-subj "/CN=${NAME}"

Then we can use this bash goodness to base64-encode the ‘key’ and ‘crt’ files and upload them as a GKE Secret named after our DNS name:

echo "
apiVersion: v1
kind: Secret
metadata:
  name: ${NAME}
data:
  tls.crt: `base64 --wrap 0 ./${NAME}.crt`
  tls.key: `base64 --wrap 0 ./${NAME}.key`
" | kubectl create --filename -

And, lastly, we need to tweak the Ingress to include the certificate by referencing the Secret:

Open your Ingress config (I’m using ‘datastore-ingress.yaml’), replace $NAME with your DNS name, save it:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: datastore
spec:
  tls:
  - secretName: ${NAME}
  backend:
    serviceName: datastore
    servicePort: 9999

And then — you’ll get a warning but you may ignore it —

kubectl apply --filename=datastore-ingress.yaml

If you then refresh the Cloud Console page showing the Load-Balancers, you should see an “HTTPS” frontend added to the GKE Load-Balancer:

Cloud Console: Network Services “Load Balancing”

All being well, you should now be able to access the Exemplar solution on GKE via TLS:

curl --insecure https://${NAME}

OK. Let’s put some load on each of these services and see what happens. You may use Apache’s benchmarking tool “ab” but I’m going to use ‘wrk’ (link):

cd $HOME/Projects/$PROJECT/
git clone
cd wrk && make
Usage: wrk <options> <url>
-c, --connections <N> Connections to keep open
-d, --duration <T> Duration of test
-t, --threads <N> Number of threads to use

-s, --script <S> Load Lua script file
-H, --header <H> Add header to request
--latency Print latency statistics
--timeout <T> Socket/request timeout
-v, --version Print version details

Numeric arguments may include a SI unit (1k, 1M, 1G)
Time arguments may include a time unit (2s, 2m, 2h)

Let’s start with App Engine Flex:

./wrk \
--threads=10 \
--connections=250 \
--duration=60s \

As with any load-test, it pays to run the same test several times. Here’s my first set of results. The top-line is (for this run) 650 RPS (μ=380ms σ=77ms):

Running 1m test @ https://${PROJECT}
10 threads and 250 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 380.96ms 76.89ms 1.03s 83.37%
Req/Sec 66.71 31.44 240.00 64.34%
39336 requests in 1.00m, 71.94MB read
Requests/sec: 654.64
Transfer/sec: 1.20MB

And then, the only difference in the command for GKE is to use ${NAME}:

./wrk \
--threads=10 \
--connections=250 \
--duration=60s \

And the results. The top-line is (for this run) 1420 RPS (μ=175ms σ=20ms):

Running 1m test @
10 threads and 250 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 175.39ms 19.73ms 1.17s 82.62%
Req/Sec 143.15 27.74 232.00 68.37%
85459 requests in 1.00m, 62.99MB read
Requests/sec: 1422.64
Transfer/sec: 1.05MB

For this first run, GKE has about double the throughput of Flex (half the latency) and a much (roughly 4x) tighter distribution of latency (a standard deviation of 20ms versus 77ms).
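
Throughput and latency move together here for a reason. wrk keeps a fixed number of connections open and each connection waits for its response before sending the next request, so Little’s law applies: throughput is roughly connections divided by mean latency. Here’s a quick sanity check against the numbers above (expectedRPS is a hypothetical helper of mine, and this treats the test as a perfectly closed loop, which is an approximation):

```go
package main

import "fmt"

// expectedRPS applies Little's law to a closed-loop load test: with a fixed
// number of connections, each with one request in flight at a time,
// throughput is approximately connections / mean latency.
func expectedRPS(connections int, meanLatencySeconds float64) float64 {
	return float64(connections) / meanLatencySeconds
}

func main() {
	// Mean latencies and measured rates are taken from the wrk output above.
	fmt.Printf("Flex: predicted %.0f RPS, measured 654.64\n", expectedRPS(250, 0.38096))
	fmt.Printf("GKE:  predicted %.0f RPS, measured 1422.64\n", expectedRPS(250, 0.17539))
	fmt.Printf("GKE/Flex throughput ratio: %.2f\n", 1422.64/654.64)
}
```

Both predictions land within 1% of what wrk measured, so with 250 connections “double the throughput” and “half the latency” are two views of the same result.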

I ran the tests a second time, for 10 minutes with 25 threads and 250 connections, and grabbed monitoring.

App Engine Flex achieved 1740 RPS (μ=150ms σ=80ms):

./wrk \
--threads=25 \
--connections=250 \
--duration=600s \
Running 10m test @ https://${PROJECT}
25 threads and 250 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 148.86ms 82.94ms 1.98s 82.34%
Req/Sec 70.52 29.78 130.00 59.61%
1045679 requests in 10.00m, 1.87GB read
Requests/sec: 1742.51
Transfer/sec: 3.19MB
Stackdriver Monitoring: App Engine

It’s non-trivial (for me?) to produce equivalent metrics for GKE but here’s my best effort. GKE achieved 1350 RPS (μ=180ms σ=20ms):

./wrk \
--threads=25 \
--connections=250 \
--duration=600s \
Running 10m test @ https://${NAME}/
25 threads and 250 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 184.78ms 22.99ms 1.27s 86.09%
Req/Sec 54.26 16.20 101.00 78.38%
812839 requests in 10.00m, 599.13MB read
Socket errors: connect 0, read 1, write 0, timeout 0
Requests/sec: 1354.51
Transfer/sec: 1.00MB
Cloud Console: Network Services “Load Balancing”
Stackdriver custom dashboard
Cloud Console: Compute Engine “VM Instances”
Cloud Console: Kubernetes Engine “Workloads”

With GKE I’m receiving notifications from the Load-Balancer that “Usage is at capacity” which surprises me. GKE is not adding nodes to the pool (which I think it should) … ah, I’m just impatient… bumping VMs to 3 and Pods to 8:


  • It is practical to migrate an App Engine Flex deployment to GKE
  • In this case (!) Flex achieved greater throughput than GKE.
  • Flex’s higher throughput appears to be due to the speed with which App Engine signals auto-scaling events; GKE scales pods promptly within an existing cluster of nodes but is slightly slower to scale up the number of nodes.
  • App Engine and GKE share fundamental GCP resources including the HTTP/S Load-Balancer service and Managed Instance Group auto-scaling.
  • For the same load, using the same VM size (1 vCPU and 1GB RAM): App Engine Flex scaled to 6 containers on 6 VMs (1 container per VM); GKE scaled to 10 pods (1 container per pod) on 3 VMs (50% of Flex’s VM count).
  • I’m still working on better ways to provide comparable monitoring.

Kubernetes UI

There’s an interim hack to access the Kubernetes Dashboard. Add a final “/” to the URL that you’re redirected to by the proxy. Then:

Kubernetes Dashboard

The Dashboard provides a Kubernetes-specific UI and I’m a fan.

Here you can see the Cluster is putting pressure on GKE to autoscale… 6/8 pods and blocking on CPU:

Kubernetes Dashboard: Waiting for Cluster Autoscaling
Kubernetes Dashboard: Scaled

In this second snapshot the Cluster has scaled (now at 3 GCE VMs) and is able to sustain the load with 8 pods.

I’m going to investigate but I assume this means 80% of the cluster’s (aggregate) CPU, which corresponds to the (Horizontal) Pod Autoscaler requirement to scale on 80% CPU:

Kubernetes Dashboard: CPU usage


Kubernetes Dashboard: Memory usage


Last word: I’m trying to find an equivalent way to present the measurement of the GKE service:

Stackdriver: Custom Dashboard