Connecting Cloud SQL - Kubernetes Sidecar

Gabe Weiss
Google Cloud - Community
13 min read · Feb 7, 2020

Hi friends!

This blog is a bit longer than my usual how-to’s, because we’re dealing with a lot of moving parts. We’ve got Cloud SQL, the Cloud SQL Proxy, Docker and containers, Google Container Registry, Kubernetes, and managing secrets like your db credentials in the Cloud so you don’t have to have them exposed in any configuration files or environment variables.

Don’t be intimidated though! I am coming at this from the perspective of a beginner so I’m explaining things in a lot of detail. If you’re experienced in a lot of this, no worries, I’ve broken out a lot into separate blog posts I link along the way so you should be able to breeze through to the core of the post quickly.

If you’re not experienced at all, welcome to the wide world of Kubernetes and Cloud SQL! While you won’t come out of this an expert (I am really far from a Kubernetes expert), hopefully you’ll have a handle on the basics, and how to get started.

So, what’s a sidecar? We’re not talking about a motorcycle sidecar, but that’s where the pattern gets its name.

The Kubernetes sidecar pattern is where you attach a supporting container to your application’s container in order to make your life easier in some way. In our case, this refers to attaching a container running the Cloud SQL Proxy along-side your application. There are a lot of benefits to using the proxy, and it’s our recommended best practice for connecting to Google Cloud SQL.

Yes, there are other ways to do this than the sidecar pattern. In a followup blog, I’ll go deeper into the pros and cons for different patterns.

For this blog post, I wrote a snazzy (to me) app that fakes database data for MySQL. You can find my script on GitHub, and a blog around it and its usage here.

If you want to know a bit more background and context around connectivity to Cloud SQL in general, check out my intro to connectivity blog post. That post also has links to more step-by-step posts around different use-cases and methods, as well as why you might want to pick one method over another.

Prerequisites: I’m assuming that you’ve already got your own Google Cloud Platform (GCP) project with billing set up. If you don’t, head here to get started with a project, or here to set up billing for the project. If you intend to run any of this locally, you’ll need Docker installed and available on your machine (the codelab runs things in Cloud Shell on GCP, which is Linux and has Docker installed by default).

The App

The codelab walks through standing up an app that allows you to create and store memes. It has a web UI served up by Flask and SQLAlchemy.

The app I wrote is significantly simpler. I just wanted an app that showed off one thing: Connecting to Cloud SQL and dumping some fake people data into the database.

Everything of course applies if you’ve written your own application as well.

Containerizing

One thing I had trouble with in writing this blog post was finding a nice simple path from “I have an application” to “it’s running in Kubernetes.” Well, now we have an application to run.

To run in Kubernetes, we need a container (or containers, in our case). If you already know how to build a container, yay! If not, I wrote a separate blog on what I did to create a container out of my application script, and broke down, side by side, the container creation that’s done in the codelab, to help you navigate what you’d need to do to containerize whatever you want to run in Kubernetes along with the proxy sidecar.

Either way, go through that blog post to get a container, or use one of your own. Now, the core of the story: Deploying the app with the proxy sidecar!

Scaling the app

In this section:

  1. Install and authorize gcloud (the Google Cloud SDK command-line tool) with Google Container Registry
  2. Prepare and upload the container to Google Container Registry
  3. Create a Google Kubernetes Engine (GKE) cluster
  4. Replace ENV vars in the Dockerfile with Kubernetes Secrets
  5. Break down the Kubernetes yaml deployment file
  6. Deploy the application to the Kubernetes cluster
  7. Scale Kubernetes up and down
  8. Troubleshoot

Step one, let’s get our app container into the Cloud. GCR (Google Container Registry) is a secure way to store containers and make them accessible from the Cloud, and we’ll be using GKE (Google Kubernetes Engine) to scale up. First we need to get Docker configured to use GCR, and I’m going to do this with gcloud. If you’re not familiar with the Google Cloud SDK, you should be! It’s the command-line interface for all things Cloud Platform. The installation docs can be found here, and I did a detailed walk-through of installing and configuring it in a previous blog post here. Go do that and come on back.
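If you’ve just installed the SDK and haven’t pointed it at your project yet, the basic setup looks roughly like this (substitute your own project ID):

# Log in with your Google account and point gcloud at your project
gcloud auth login
gcloud config set project $PROJECT_ID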

Next up, run the command to auth gcloud with GCR: gcloud auth configure-docker. You’ll see output that looks like this:

The following settings will be added to your Docker config file
located at [/Users/<user>/.docker/config.json]:
{
  "credHelpers": {
    "gcr.io": "gcloud",
    "us.gcr.io": "gcloud",
    "eu.gcr.io": "gcloud",
    "asia.gcr.io": "gcloud",
    "staging-k8s.gcr.io": "gcloud",
    "marketplace.gcr.io": "gcloud"
  }
}
Do you want to continue (Y/n)?

Note the path to your config.json could be different based on where/how you installed Docker, but the rest should be the same. Go ahead and hit “Y”.

Before we upload our container, we need to edit and rebuild it. Remember, we don’t want the password written anywhere in the Cloud, so before we upload the container, we want to get rid of any trace of our password in plaintext. Delete the ENV lines in the Dockerfile.
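As a sketch of what that means (the ENV names here match what my script expects; yours may differ, and I’m assuming you originally built the image with docker build -t randomizer . from the directory with the Dockerfile):

# Remove any credential lines like these from the Dockerfile:
#   ENV DB_USER=...
#   ENV DB_PASS=...
#   ENV DB_NAME=...
# Then rebuild the image so no plaintext credentials are baked in
docker build -t randomizer .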

Run the following command to tag your container with a GCR URI. If you named your container something other than randomizer earlier, substitute that for randomizer in the commands below, and substitute whatever project you’re using with gcloud for $PROJECT_ID:

docker tag randomizer gcr.io/$PROJECT_ID/randomizer

If you end up fiddling around with rebuilding the container multiple times, don’t forget the tag step. If you skip it, the push won’t upload the newer version of the container to GCR.

Push your container to your GCR repo with the following:

docker push gcr.io/$PROJECT_ID/randomizer

Once that completes, your container is all set and ready to be pulled by Kubernetes. Of course, GCR isn’t the only way to get your container into the Cloud so it can be pulled in. You could use Docker Hub, or set up your own registry. GCR just makes it easy from GKE.
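If you want to double-check that the push landed, you can list what’s in your project’s registry:

gcloud container images list --repository=gcr.io/$PROJECT_ID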

Next up, let’s create our Kubernetes cluster in GKE:

gcloud container clusters create randomizer-cluster --zone us-central1-f --machine-type=n1-standard-1 --max-nodes=10 --min-nodes=1

Be sure to pick a zone close to you and/or where you want the application running. You can see a list of the zones here. The machine type determines how powerful the machines that spin up in your cluster are; you can go here to see more details about the different machine types. n1-standard-1 is a single vCPU with 3.75GB of RAM, plenty for our purposes. Finally, the min/max nodes set some limits on how far the cluster can scale.
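If you’d rather look these up from the terminal instead of the docs pages, gcloud can list both (the zone filter here is just an example):

# List available zones
gcloud compute zones list
# List machine types available in a specific zone
gcloud compute machine-types list --filter="zone:us-central1-f"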

Go ahead and run the command, then verify that the cluster was created ok by running gcloud container clusters list.

We need to be sure that kubectl (the command-line tool we’ll use to interact with our k8s cluster) is authenticated with GKE. Run this to authenticate:

gcloud container clusters get-credentials randomizer-cluster --zone us-central1-f

Be sure to alter the name of the cluster and the zone to match what you used when you set up the cluster. You should see:

kubeconfig entry generated for randomizer-cluster
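A quick sanity check that kubectl is now pointed at the new cluster:

# Show which cluster kubectl is currently talking to
kubectl config current-context
# List the nodes in the cluster
kubectl get nodes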

Now to obfuscate our database credentials. There are several ways to do this; I’m not going to try to cover them all here. My old teammate wrote a great two-part blog post a couple of years ago covering some background and basics around this. Since then, even more ways to do it have appeared, and I’ll probably write or find a follow-up post covering them. For now, we’re going to use Kubernetes secrets.

There are two sets of credentials we need for our container: the database user credentials, and the service account credentials. The Cloud SQL Proxy uses the service account to connect to our Cloud SQL instance.

We can turn the service account json file into a Kubernetes secret, which can be passed into the container and used by the proxy.

kubectl create secret generic cloudsql-instance-credentials --from-file=sql_credentials.json=<service_account_json_file>

And create our db credentials:

kubectl create secret generic cloudsql-db-credentials --from-literal=username=[DB_USER] --from-literal=password=[DB_PASS] --from-literal=dbname=[DB_NAME]
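If you want to confirm the secrets made it in (kubectl shows the keys, but not the values), something like this works:

kubectl get secrets
kubectl describe secret cloudsql-db-credentials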

We can use those secrets to set environment variables as part of our deployment that then get picked up by our script, and away we go!

The deployment yaml for my app looks very similar to the codelab one, so I’ll just walk through mine and break down what’s going on (note that the full file is in the repo; I’m just pulling out the pertinent parts).

The containers section defines which containers run in each pod. For ours, there are two: the one we created with our app in it, and the Cloud SQL Proxy container that’s already set up for us in GCR.

In the randomizer container section you can see how we’re using the image URL from GCR that we uploaded earlier. Remember to change the [PROJECT_ID] value to your project ID. The env section defines the environment variables that will be set on the container when it gets deployed. In our case, that’s the database credentials from the secret we created. Remember, we aren’t using the SQL_HOST variable because we’re using the proxy, so the script code will default to using localhost.

      containers:
      - name: randomizer
        image: gcr.io/[PROJECT_ID]/randomizer
        # Set env variables used for database connection
        env:
        - name: DB_USER
          valueFrom:
            secretKeyRef:
              name: cloudsql-db-credentials
              key: username
        - name: DB_PASS
          valueFrom:
            secretKeyRef:
              name: cloudsql-db-credentials
              key: password
        - name: DB_NAME
          valueFrom:
            secretKeyRef:
              name: cloudsql-db-credentials
              key: dbname

In the cloudsql-proxy container, the image tag tells Kubernetes where to grab it. As of the writing of this post, I’ve set the latest version of the image (1.16). Check here for the list of containers to see what the latest version is, and update the line image: gcr.io/cloudsql-docker/gce-proxy:1.16, replacing 1.16 with that version. For the command to run, be sure to change <INSTANCE_CONNECTION_NAME> to the connection name for your Cloud SQL instance. If you’re looking at the Overview page of your Cloud SQL instance, you’ll see it under the “Connect to this instance” header. It’ll look like <project id:region:instance id>. Because the proxy wants a JSON key file for the service account, we set up a volume mount with the service account secret we created earlier, and pass that path to the command that starts up the proxy.

      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.16
        command: ["/cloud_sql_proxy",
                  "-instances=<INSTANCE_CONNECTION_NAME>=tcp:3306",
                  "-credential_file=/secrets/cloudsql/sql_credentials.json"]
        volumeMounts:
        - name: my-secrets-volume
          mountPath: /secrets/cloudsql
          readOnly: true
      volumes:
      - name: my-secrets-volume
        secret:
          secretName: cloudsql-instance-credentials

Before we deploy our application, I want to describe what’s about to happen as it runs, which means I need to explain a little about how Kubernetes runs the container. As Kubernetes starts up a pod, your container is going to run the application, which will generate 1,000 employee records and then quit. Kubernetes monitors the health of the pods to know if it needs to restart them: because our container exits, the pod’s restart policy kicks in and restarts it (there are also liveness probes, which catch containers that are still running but have stopped responding). This is an oversimplification; it’s more complicated than that, and if you want to know more, the docs are here.

What will happen with us then is: the application will run, Kubernetes will see that it has quit, and so Kubernetes will restart the container.

This means the container will restart, and the script will run again! So it’ll then add another 1,000 employees, rinse and repeat.

So, long story short (too late), let’s get this up on its feet! The command to deploy is:

kubectl create -f randomizer_deployment.yaml

This launches a single replica of our application into GKE. Replicas, nodes, pods, etc. are all terms you’ll want to learn if you dive deeper into Kubernetes. The short version: a node represents a machine, which runs pods, and a pod wraps the container(s). The pods are controlled by the deployment we just created, which defines how they behave.
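If you want to see the deployment object itself (the thing controlling the pods), you can list it along with how many replicas it’s asking for:

kubectl get deployments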

Almost immediately, you should be able to run:

kubectl get pods

And you’ll see some output that looks like:

$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
randomizer-7b7845c7d-8vjgq   2/2     Running   0          3s

It’ll likely start off saying ContainerCreating for a bit first as it grabs the containers from GCR. Once it says Running, we’re up and going! You should be able to connect to your database (I just use mysql in the shell), switch into the database you defined in your secret we created for DB_NAME, and run SELECT * FROM employee;. As I mentioned, because of how Kubernetes works, you’ll see the number of rows keep going up over time.
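If you’d rather watch a single number tick up than scroll through full result sets, a quick count works well. This is just a sketch; fill in however you normally reach your instance (public IP with your network authorized, or the proxy running locally) and the database name you put in the secret:

# Count the generated rows (placeholders are whatever you used above)
mysql --host=<INSTANCE_IP> --user=<DB_USER> -p <DB_NAME> \
      -e "SELECT COUNT(*) FROM employee;"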

Now, that’s all well and good, but we’re not really any faster at this point than just running the script locally. That’s because we’re only using a single replica. So what happens now? Let’s turn up the power a bit. To stop the cluster from running the script, the easiest way I’ve found (there are probably better ways to do this; again, I’m not a Kubernetes master) is to scale the deployment down to 0 replicas. You can do this by running:

kubectl scale --replicas=0 -f randomizer_deployment.yaml

This tells the deployment to run zero replicas, which terminates the existing running pod. Running kubectl get pods now, you’ll either see your pod with the status Terminating, or No resources found. You probably already see where this is going. To scale our application up, we can run:

kubectl scale --replicas=20 -f randomizer_deployment.yaml

Now you’ll see something like:

$ kubectl get pods
NAME                         READY   STATUS              RESTARTS   AGE
randomizer-7b7845c7d-2w6vx   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-2xnmp   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-5g6cc   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-8rc94   2/2     Running             0          4s
randomizer-7b7845c7d-cmv9m   2/2     Running             0          4s
randomizer-7b7845c7d-cqzl8   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-ddskl   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-fbxxc   2/2     Running             0          4s
randomizer-7b7845c7d-hhfm4   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-hxnnd   2/2     Running             0          4s
randomizer-7b7845c7d-jbm2c   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-jxfhk   2/2     Running             0          4s
randomizer-7b7845c7d-m6zwz   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-mhgxn   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-sgvd5   2/2     Running             0          4s
randomizer-7b7845c7d-ss5fm   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-tllg8   0/2     ContainerCreating   0          4s
randomizer-7b7845c7d-xnm96   2/2     Running             0          4s
randomizer-7b7845c7d-xtwd6   2/2     Running             0          4s
randomizer-7b7845c7d-z2528   0/2     ContainerCreating   0          4s

Here we are, running 20 instances of our application all at once, happily doing their thing, exiting, getting restarted by Kubernetes, and continuing to pump data into our database. Now if you connect to your db and run that SELECT statement a few times, you’ll see the number of rows going up quite a bit faster than before.

One bit of troubleshooting… if the status of the pod says Error, it generally means something crashed or failed in the container. Debugging things in Kubernetes can be… difficult. The easiest way I’ve found so far is to know that anything printed to stdout in a container gets logged, and you can retrieve that log with kubectl. So let’s say one of my pods above, randomizer-7b7845c7d-ss5fm, has the status Error. I can find out what’s going on with this command:

kubectl logs randomizer-7b7845c7d-ss5fm randomizer

Note: if you’ve already scaled back down to zero replicas, this won’t work; it has to be a running pod. I have to specify randomizer again in that command because we’re running two containers in each pod, so when I ask for a pod’s logs, I have to specify which container’s logs I mean, since they’re kept separate. Anything that went to stdout/stderr while the container was running gets dumped when I run logs. While prepping this blog, I had a “Couldn’t connect to the MySQL instance” error that baffled me for a long time. I finally realized it was because the Cloud SQL Proxy hadn’t started yet in its container before my script was firing. That’s why (and you’ll see it in my script) I added exponential backoff retries to the database connection.
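It’s also worth checking the proxy side of the pod and the pod’s events; the container name here matches the one in the deployment yaml above:

# Logs from the Cloud SQL Proxy container in the same pod
kubectl logs randomizer-7b7845c7d-ss5fm cloudsql-proxy
# Events and status details for the pod (image pull errors, restarts, etc.)
kubectl describe pod randomizer-7b7845c7d-ss5fm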

Wrap-up

So there you have it! Using the Kubernetes sidecar pattern to run the Cloud SQL Proxy as the connection point for your horizontally scaled application. Hopefully now, between the two examples of the gmemegen codelab and my data faker script you have a solid foundation to get your own application up and running!

To clean up, you need to delete the cluster:

gcloud container clusters delete randomizer-cluster

And then, if you really want to clean house completely, delete the container we uploaded from GCR:

gcloud container images delete gcr.io/$PROJECT_ID/randomizer

But hang on: if you uploaded the container multiple times, you’ve probably got some orphaned images. Two options: 1) go into the console here (don’t worry, it’ll say it can’t find the URL; just click on Images on the left) and delete them by hand, or 2) run:

gcloud container images list-tags gcr.io/$PROJECT_ID/randomizer --filter='-tags:*' --format='get(digest)' --limit=100

And then for each sha value, run the images delete command like so:

gcloud container images delete gcr.io/$PROJECT_ID/randomizer@sha256:cd53812d71ed6e264788cc60d3413774f8553a877ef78b3eacb5c3334c653fc8
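If there are a lot of digests, here’s a sketch that pipes the list straight into the delete command (the same two commands as above, just glued together):

gcloud container images list-tags gcr.io/$PROJECT_ID/randomizer \
    --filter='-tags:*' --format='get(digest)' --limit=100 | \
  while read digest; do
    gcloud container images delete "gcr.io/$PROJECT_ID/randomizer@${digest}" --quiet
  done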

That should be it, and your project is now reset and clean of what we did in this blog!

As a final note here, if you wanted to run something like this as a k8s Job instead of as a continuously running deployment, note that there’s a problem: the proxy sidecar won’t exit because it’s happily running, so the Job won’t ever complete. There are a couple of ways to solve this; this Stack Overflow question in particular has some good discussion and a solution using the SYS_PTRACE security context to kill the proxy when the rest of the application finishes its work.
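As a very rough sketch of that idea (not a drop-in config): in the Job’s pod spec you’d set shareProcessNamespace: true, add the SYS_PTRACE capability to the app container, and wrap the app’s command so it kills the proxy once the real work is done. The script name below is just a stand-in for whatever your app’s entry point is:

# Hypothetical container command for the app in such a Job pod
# (assumes shareProcessNamespace: true, the SYS_PTRACE capability,
# and pkill available in the image)
sh -c "python3 your_script.py; pkill cloud_sql_proxy"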

Run into any problems? Please let me know! Respond in comments below, or reach out to me on Twitter. My DMs are open!
