Container-Optimized OS

Daz Wilkin
Google Cloud - Community
Apr 13, 2018 · 11 min read

Thanks to my colleague for setting me straight on using mounts with Container-Optimized OS.

If you’ve read any of my recent posts, I follow a consistent path:

  • go run...
  • go build...
  • docker run ...
  • kubectl apply ...

But there’s an alternative step that may be useful and, while I’m using Golang here, it’s applicable to any code you can containerize:

  • gcloud compute instances create-with-container ...

Google Cloud Platform (GCP) provides a Container-Optimized OS (aka “COS”) that may be used on Google Compute Engine and is the default image used by Kubernetes Engine (nodes). COS is based on Chromium OS. Chromium OS is a minimal Linux OS focused on security. Whereas Chromium OS is focused on web-browsing, COS is focused on running containers.

One reason I’ve not used COS is that it’s been more useful for me to focus on building competence in Kubernetes Engine to help my customers, but I was always challenged to map some of my more complex container environments to COS.

This post aims to help you consider COS as a way to deploy your containers to GCP when you don’t require the more powerful features of Kubernetes.

Setup

One of my more popular posts is “App Engine Flex || Kubernetes Engine — ??”. In it, we use a Google Golang sample app that talks to Cloud Datastore and we containerize the app and deploy it to Flex and Kubernetes. In this post, we’ll deploy that app to COS.

This is interesting because it demonstrates how you may translate volume|bind mounting to COS and how you may use Application Default Credentials with COS.

Please follow the setup from the post referenced above to download the sample app. You need *not* enable Kubernetes. You *must* (!?) create an App Engine app because of a limitation in GCP when using Cloud Datastore:

gcloud app create --region=us-central --project=${PROJECT}

Local-local

If you can get to a point where:

GCLOUD_DATASET_ID=${PROJECT} \
go run main.go

runs, and hitting its server’s endpoint results in something like this:

curl localhost:8080
Previous visits:
[2018-04-12 12:25:30.587515 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:25:29.504421 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:25:06.727592 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:25:04.619761 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:25:03.801524 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:25:00.902566 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:24:59.831409 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:24:56.524133 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:24:55.978291 -0700 PDT] 127.0.0.1:80
[2018-04-12 12:24:53.472835 -0700 PDT] 127.0.0.1:80
Successfully stored an entry of the current request.

Or:

[Screenshot: Cloud Datastore Entities]

We’re good ;-)

NB The code, as published, requires an environment variable GCLOUD_DATASET_ID to be set to our GCP project ID. This setting will be reflected in every invocation of the code below.

Application Default Credentials

A better way to authenticate our code is to use Application Default Credentials (ADCs). ADCs enable us to authenticate using a service account and, while this is generally good practice anyway, using a service account will be a requirement for the app once it’s containerized.

The following (boilerplate) creates a service account, downloads a key for it, and authorizes (!) the account to use Cloud Datastore:

export ROBOT="datastore"
export EMAIL=${ROBOT}@${PROJECT}.iam.gserviceaccount.com
gcloud iam service-accounts create $ROBOT \
--display-name=$ROBOT \
--project=$PROJECT
gcloud iam service-accounts keys create ./${ROBOT}.key.json \
--iam-account=${ROBOT}@${PROJECT}.iam.gserviceaccount.com \
--project=$PROJECT
gcloud projects add-iam-policy-binding $PROJECT \
--member=serviceAccount:${EMAIL} \
--role=roles/datastore.user

These steps were used previously as part of the recommended way to create service account keys to be used by Kubernetes. In that case, we were able to upload the key to Kubernetes as a secret and then securely reference this key from the container using a volume mount.

All being well, you should be able to rerun the app authenticating as the service account and it should continue to work. If it doesn’t immediately work, wait for a few seconds to permit the account and its permissions to propagate before trying again:

GCLOUD_DATASET_ID=${PROJECT} \
GOOGLE_APPLICATION_CREDENTIALS=${ROBOT}.key.json \
go run main.go

NB The *only* change is that we’ve added another environment variable GOOGLE_APPLICATION_CREDENTIALS that is used by the Google client libraries in our code for Application Default Credentials. We did not need to change the code to change the credentials.

Please ensure you terminate that before you attempt to run the Docker container. If you don’t, you’ll be told that the port is already in use.

Docker

So, this step may be uninteresting but it’s a pre-req.

We’re going to build a static binary and containerize it. Because the binary is now confined to — in this case, Docker’s — container runtime, it is unable to access our gcloud credentials and so from now on, we must use the service account to authenticate.

Build your binary:

CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o datastore

Test again:

GCLOUD_DATASET_ID=${PROJECT} \
GOOGLE_APPLICATION_CREDENTIALS=${ROBOT}.key.json \
./datastore

NB The only difference here is we’ve swapped our go run main.go with the binary.

Dockerfile:

FROM scratch
LABEL maintainer="Your Name <your@email.com>"
ADD ca-certificates.crt /etc/ssl/certs/
ADD dumb-init /
ADD datastore /
ENTRYPOINT ["/dumb-init","--"]
CMD ["/datastore"]

NB Please review my previous posts for guidance on dumb-init and why “ca-certificates.crt” is present.

Build your docker image:

docker build --tag=gcr.io/${PROJECT}/datastore .

If you’ve not used it recently (or have been ignoring its warnings), gcloud docker -- push now recommends a return to pure Docker CLI commands, so you may need to:

gcloud auth configure-docker

Before:

IMAGE=datastore
docker push gcr.io/${PROJECT}/${IMAGE}

When the push completes successfully, it will culminate in a digest for the image; please capture that digest, inclusive of the sha256: prefix:

...
latest: digest: sha256:f70dd40d260d6c79fdf79c678e468302c0a18084b830ea2680cbb14c433c1749 size: 947
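
If you’d rather not copy it by hand, here’s a sketch of capturing the most recent digest with gcloud (this assumes the image you just pushed is the newest in the repository):

DIGEST=$(gcloud container images list-tags gcr.io/${PROJECT}/${IMAGE} \
--sort-by=~timestamp \
--limit=1 \
--format="get(digest)")
echo ${DIGEST}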

If — like me — you forgot a step, you can:

gcloud services enable containerregistry.googleapis.com \
--project=${PROJECT}

and try the push again ;-)

OK… ensure the go run main.go command is *not* running and:

docker run \
--interactive \
--tty \
--publish=127.0.0.1:8080:8080 \
--env=GCLOUD_DATASET_ID=${PROJECT} \
--env=GOOGLE_APPLICATION_CREDENTIALS=/tmp/${ROBOT}.key.json \
--volume=$PWD/${ROBOT}.key.json:/tmp/${ROBOT}.key.json \
gcr.io/${PROJECT}/datastore

NB The two --env flags set the environment variables as before. Additionally, we must provide the container with a way to access the key stored on the host. This is effected with the --volume mapping.

You should continue to be able to curl the endpoint successfully.

One Container-Optimized OS VM

Let’s now deploy this image to COS and run it there. We’ll need to (1) create a COS VM configured to pull the image from GCR and (2) copy the key to the VM. If you’ve not enabled Compute Engine in the project:

gcloud services enable compute.googleapis.com --project=${PROJECT}

Then:

DIGEST=[[ The digest from your push ]]
INSTANCE=datastore-cos
ZONE=us-west1-a
gcloud beta compute instances create-with-container ${INSTANCE} \
--zone=${ZONE} \
--image-family=cos-stable \
--image-project=cos-cloud \
--container-image=gcr.io/${PROJECT}/${IMAGE}@${DIGEST} \
--container-restart-policy=always \
--container-env=\
GCLOUD_DATASET_ID=${PROJECT},\
GOOGLE_APPLICATION_CREDENTIALS=/tmp/${ROBOT}.key.json \
--container-mount-host-path=\
mount-path=/tmp,\
host-path=/tmp,\
mode=rw \
--project=${PROJECT}

NB This is a gcloud beta command. We’re mounting /tmp on the COS VM into our container instance. You may be prompted to enable API [compute.googleapis.com] if you’ve not used it previously; please accept (Y) and be aware that you will be charged while the COS VM instance is running.

NB There are more flags to this command because we’re creating a COS VM and running our container image on it. But, you can still see our original environment variables GCLOUD_DATASET_ID and GOOGLE_APPLICATION_CREDENTIALS reflected in this command. As before, we must map our service account key into the (remote) container. This is enabled with --container-mount-host-path but this does *not* copy the key to the VM. See below for that.

NB We do not explicitly expose port 8080 when using COS. Containers on COS are effectively run with --net=host. This presents the containers’ (multiple) ports as if they were directly exposed by the host. In our case, when we port-forward to the host’s (!) port 8080, our container is accessible through it. A downside to this is that your COS containers must not conflict in their port use.

All being well, we’ll get a confirmation from GCP that the instance is created and running. Hopefully our container is running too. However, before it will run correctly, we must copy the service account key to the VM so that the container may access it through the volume mount. From your working directory:

gcloud compute scp \
${ROBOT}.key.json \
${INSTANCE}:/tmp \
--project=${PROJECT}

How to tell? Let’s ssh onto the instance and check. While we ssh, let’s also port-forward port 8080 so that we may access the service:

gcloud compute ssh ${INSTANCE} \
--ssh-flag="-L 8080:localhost:8080" \
--project=${PROJECT}

Once this command succeeds, from another terminal session (!), try curling the endpoint using localhost:8080 (localhost thanks to the port forward).
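
For example, while that ssh session is running:

curl http://localhost:8080

You should see the familiar list of previous visits.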

A useful way to observe the state of a (COS) VM is by grabbing the serial console output:

gcloud compute instances get-serial-port-output ${INSTANCE} \
--zone=${ZONE} \
--project=${PROJECT} \
| grep konlet-startup

You should see success messages including pull complete, Create a container..., Starting a container... etc.

From the instance’s (!) ssh session, you may use the Docker CLI:

docker container ls

and it should have an Up.. status:

CONTAINER ID    IMAGE                                   STATUS
d06da5bee1f0 gcr.io/${PROJECT}/datastore@${DIGEST} Up 1 minutes

You won’t be able to use docker logs... but you can use journalctl to achieve similar (and broader) purposes:

sudo journalctl --unit=konlet-startup \
| grep ${INSTANCE}

NB You’ll need to recreate ${INSTANCE} (the variable isn’t set on the VM) if you’d like to filter the journalctl logs by it *but*, unless you’re running multiple containers on the COS VM, this filtering will be redundant.

Instance Groups: Many Container-Optimized OS VMs

One interesting side-effect of using COS VMs is that these can be templated then used to create Instance Groups. As the name suggests, these are groups of (cloned) instances. So, instead of creating a COS VM directly, we create a template instead and then create an instance group to stamp out X (let’s do 3) clones for us:

REGION=us-west1
gcloud beta compute instance-templates create-with-container ${INSTANCE}-template \
--image-family=cos-stable \
--image-project=cos-cloud \
--container-image=gcr.io/${PROJECT}/${IMAGE}@${DIGEST} \
--container-restart-policy=always \
--container-env=\
GCLOUD_DATASET_ID=${PROJECT},\
GOOGLE_APPLICATION_CREDENTIALS=/tmp/${ROBOT}.key.json \
--container-mount-host-path=\
mount-path=/tmp,\
host-path=/tmp,\
mode=rw \
--region=${REGION} \
--project=${PROJECT}

NB The *only* difference here is that I’ve decided to use a regional instance group and so we replace --zone=${ZONE} with --region=${REGION} and add a setting for REGION.

Bring On The Clones:

CLONES=3
gcloud compute instance-groups managed create ${INSTANCE}-group \
--base-instance-name=${INSTANCE} \
--template=${INSTANCE}-template \
--size=${CLONES} \
--region=${REGION} \
--project=${PROJECT}

And — very — quickly:

gcloud compute instance-groups managed list-instances ${INSTANCE}-group \
--region=$REGION \
--project=$PROJECT
NAME                ZONE        STATUS   ACTION  LAST_ERROR
datastore-cos-qnkg  us-west1-a  RUNNING  NONE
datastore-cos-q5wd  us-west1-b  RUNNING  NONE
datastore-cos-3t60  us-west1-c  RUNNING  NONE

NB Not only do I get 3 clones of my template but they’re peanut-buttered across the zones in us-west1. Sweet!

But, I’ve created a problem for myself. I now have x (where x could be large) instances of my container running and *each* of them expects the service account key to be present in its host’s /tmp (mapped into the container’s /tmp). We could cheat:

CLONES=$(gcloud compute instance-groups managed list-instances ${INSTANCE}-group \
--region=$REGION \
--project=$PROJECT \
--format="value(instance)")
for CLONE in ${CLONES}
do
gcloud compute scp \
${ROBOT}.key.json \
${CLONE}:/tmp \
--project=${PROJECT}
done

But, one benefit of instance groups is that they may be auto-scaled and that unhealthy clones will be whacked and replaced. In either case, the new clone won’t have access to the service account key.

What we really want is a singular source of the key that each container is able to proactively reference. Thoughts? Persistent Disk is too over-wrought. Here’s a solution using Google Cloud Storage (GCS).

Let’s create a (regional) bucket and plonk our key in it. The key is accessible to any of this project’s users:

BUCKET=[[YOUR-BUCKET-NAME]]
gsutil mb -c regional -l ${REGION} -p ${PROJECT} gs://${BUCKET}
gsutil cp ./${ROBOT}.key.json gs://${BUCKET}/

Then, we can revise the instance template to run a startup-script that acquires the key for the containers.

startup.sh:
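
The script isn’t reproduced here, so here’s a minimal sketch of what it might contain, assuming curl and sed are available on the VM and that the VM’s default service account can read the bucket (the default access scopes include read-only access to Storage):

#! /bin/bash
# Sketch only: fetch an access token for the VM's service account from the
# metadata server, then use it to download the service account key from GCS
# into /tmp, where the container expects to find it.
BUCKET="[[YOUR-BUCKET]]"
KEY_FILE="[[YOUR-KEY-FILE]]"
ACCESS_TOKEN=$(curl --silent --header "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token" \
| sed -E 's/.*"access_token": *"([^"]+)".*/\1/')
curl --silent \
--header "Authorization: Bearer ${ACCESS_TOKEN}" \
--output /tmp/${KEY_FILE} \
"https://storage.googleapis.com/${BUCKET}/${KEY_FILE}"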

You’ll need to delete the existing template (and instance group) or revise the names:

gcloud beta compute instance-templates create-with-container ${INSTANCE}-template \
--region=${REGION} \
--image-family=cos-stable \
--image-project=cos-cloud \
--container-image=gcr.io/${PROJECT}/${IMAGE}@${DIGEST} \
--container-env=\
GCLOUD_DATASET_ID=${PROJECT},\
GOOGLE_APPLICATION_CREDENTIALS=/tmp/${ROBOT}.key.json \
--container-mount-host-path=\
mount-path=/tmp,\
host-path=/tmp,\
mode=rw \
--project=$PROJECT \
--metadata-from-file=startup-script=./startup.sh

NB Ugh. This startup-script is more complex than it ought to be. We must go around-the-houses a little to acquire an access token for the COS VM’s service account and then use it to pull the service account key from GCS and store this, as expected by the container, in /tmp/${ROBOT}.key.json.

NB Please replace [[YOUR-BUCKET]] and [[YOUR-KEY-FILE]] with the actual values.

And then create an instance group using the same command as before:

gcloud compute instance-groups managed create ${INSTANCE}-group \
--base-instance-name=${INSTANCE} \
--template=${INSTANCE}-template \
--size=${CLONES} \
--region=${REGION} \
--project=${PROJECT}

And — this time — the containers should be good to go. Let’s pick one at random:

NODES=$(\
gcloud compute instance-groups managed \
list-instances ${INSTANCE}-group \
--region=$REGION \
--project=$PROJECT \
--format="value(instance)")
RANDOM_NODE=$(shuf -n1 -e ${NODES})
gcloud compute ssh ${RANDOM_NODE} \
--project=${PROJECT} \
--ssh-flag="-L 8080:localhost:8080"

And, hopefully, you can curl http://localhost:8080 successfully :-)

If you experience problems, I recommend you ensure that you’ve copied the startup script correctly. You may check an instance’s serial console although this doesn’t report the startup-script. From a COS VM, you can check:

sudo journalctl | grep startup-script

And check for the existence of the service account key in the VM’s /tmp directory.
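
For example, from the VM’s ssh session (assuming datastore for ${ROBOT}):

ls -l /tmp/datastore.key.json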

It’s straightforward (yay!) and I’ll leave it to you to create an HTTP Load-Balancer for our Instance Group.
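
If you’d like a starting point, here’s a sketch of the main steps; the resource names (datastore-hc, datastore-bes and so on) are placeholders of my choosing, and the source ranges are GCP’s published health-check ranges:

gcloud compute firewall-rules create datastore-allow-hc \
--allow=tcp:8080 \
--source-ranges=130.211.0.0/22,35.191.0.0/16 \
--project=${PROJECT}
gcloud compute instance-groups set-named-ports ${INSTANCE}-group \
--named-ports=http:8080 \
--region=${REGION} \
--project=${PROJECT}
gcloud compute health-checks create http datastore-hc \
--port=8080 \
--project=${PROJECT}
gcloud compute backend-services create datastore-bes \
--protocol=HTTP \
--port-name=http \
--health-checks=datastore-hc \
--global \
--project=${PROJECT}
gcloud compute backend-services add-backend datastore-bes \
--instance-group=${INSTANCE}-group \
--instance-group-region=${REGION} \
--global \
--project=${PROJECT}
gcloud compute url-maps create datastore-lb \
--default-service=datastore-bes \
--project=${PROJECT}
gcloud compute target-http-proxies create datastore-proxy \
--url-map=datastore-lb \
--project=${PROJECT}
gcloud compute forwarding-rules create datastore-fwd \
--global \
--target-http-proxy=datastore-proxy \
--ports=80 \
--project=${PROJECT}

Once the forwarding rule’s external IP is provisioned (this may take a few minutes), curling it on port 80 should return the same list of visits.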

Kubernetes

Oh, alright then ;-)

This is not the focus of this post but I include it here for completeness and because I hope, by doing so, it sheds light on the coherence of all these approaches.

Assuming you have “a Kubernetes” running, we will create a namespace, upload the service account key as a secret, and run the container, surfacing the key from the secret as a volume mount. It would be better to do this entirely with YAML but, to avoid complexity with the keys:

NAMESPACE=datastore
kubectl create namespace ${NAMESPACE}
kubectl create secret generic ${ROBOT} \
--from-file=${ROBOT}.key.json=${PWD}/${ROBOT}.key.json \
--namespace=${NAMESPACE}

NB With Kubernetes, secrets are a more elegant way of hosting secure data (in a namespace) in a cluster.

Here’s the deployment:
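
The original deployment.yaml isn’t reproduced here; the following is a minimal sketch of what it might contain, given the image, namespace and secret created above (the deployment is exposed with a NodePort service so that we may port-forward to it below):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: datastore
  namespace: datastore
spec:
  replicas: 1
  selector:
    matchLabels:
      app: datastore
  template:
    metadata:
      labels:
        app: datastore
    spec:
      containers:
      - name: datastore
        image: gcr.io/${PROJECT}/datastore@${DIGEST}
        ports:
        - containerPort: 8080
        env:
        - name: GCLOUD_DATASET_ID
          value: ${PROJECT}
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/secrets/google/datastore.key.json
        volumeMounts:
        - name: google-cloud-key
          mountPath: /var/secrets/google
      volumes:
      - name: google-cloud-key
        secret:
          secretName: datastore
---
apiVersion: v1
kind: Service
metadata:
  name: datastore
  namespace: datastore
spec:
  type: NodePort
  selector:
    app: datastore
  ports:
  - port: 8080
    targetPort: 8080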

NB Don’t forget to replace occurrences of ${PROJECT} with the value of ${PROJECT} before applying this. I’m assuming you used datastore for the value of ${ROBOT} but, if you did not, please change datastore.key.json to reflect the correct value.

NB In the deployment’s env section you will see our environment variables GCLOUD_DATASET_ID and GOOGLE_APPLICATION_CREDENTIALS reflected. We’re using the same service account and key as before although, by convention, we’re mounting it at /var/secrets/google here. As with Docker locally and *unlike* with COS, we need to explicitly expose port 8080. Otherwise, this should look very consistent with the other approaches.

Then:

kubectl apply --filename=deployment.yaml

Instead of punching a hole through a firewall, let’s grab one of the cluster’s underlying node names, identify the node port that the service has been created on and port-forward to it. We’ll preserve the mapping of the node port to the local port so you may continue to use 8080 elsewhere:

NODE_HOST=$(\
kubectl get nodes \
--output=jsonpath="{.items[0].metadata.name}")
NODE_PORT=$(\
kubectl get services/datastore \
--namespace=datastore \
--output=jsonpath="{.spec.ports[0].nodePort}")
echo ${NODE_PORT}
gcloud compute ssh ${NODE_HOST} \
--ssh-flag="-L ${NODE_PORT}:localhost:${NODE_PORT}" \
--project=${PROJECT}

Then, while that ssh session is running, from your local machine:

curl http://localhost:${NODE_PORT}

NB You’ll need to recreate the variable or type the value of ${NODE_PORT} as it won’t be set in the separate ssh session.

And, you should continue to see output of the last 10 “visits” — corresponding to Datastore entities created — returned to you.

Conclusion

Containers three ways: primarily showing how to take a common need, containers that require shared volumes and/or service account keys, and deploy a single image locally, to a VM running Google’s Container-Optimized OS, and to Kubernetes. Along the way, I tried to reflect the consistency of all these approaches.

Feedback always sought!

Tidy Up!

When you’re finished with the instance, please delete it:

gcloud compute instances delete ${INSTANCE} \
--zone=${ZONE} \
--project=${PROJECT} \
--quiet

When you’re finished with the template and instance group, please delete them:

gcloud compute instance-groups managed delete ${INSTANCE}-group \
--region=${REGION} \
--project=${PROJECT}
gcloud compute instance-templates delete ${INSTANCE}-template \
--region=${REGION} \
--project=${PROJECT}

If you’re done with the project entirely, you may delete the VM and the Datastore data with the irrevocable:

gcloud projects delete ${PROJECT} --quiet

That’s it!!
