Increasing PostgreSQL Cloud SQL Connections w/ PgBouncer & Kubernetes Engine
I recently read a fantastic three-part article from FutureTech Industries about using Helm, Kubernetes, PgBouncer, and Cloud SQL to drastically increase the number of connections a PostgreSQL database can handle. Although it was very informative, as someone who was not deeply versed in Helm and Kubernetes, I needed a more thorough tutorial for installing Helm and Tiller and setting up the environment. I thought a step-by-step tutorial based on what I learned would be helpful for the next “me” to stumble along.
PostgreSQL is an awesome database that provides huge value to startups and large corporations alike. Unfortunately, it does not cope well with a large number of connections, which can often happen when deploying a large REST API with multi-threading enabled. Google Cloud SQL now offers highly available PostgreSQL databases that perform quite well, but their connection limits can sometimes be too low to let you get the most out of your instance. Without a connection pooler, the cheaper Cloud SQL instances can quickly max out their connections and halt your scaling unless you spend a pretty penny on the larger instances.
That is where PgBouncer comes in: it is a lightweight PostgreSQL connection pooler that can accept several thousand client connections at a time while holding only a small pool of connections to the database. Using Kubernetes Engine to run a Helm chart w/ PgBouncer based on the great article from futuretech-industries, we were able to set up an easily deployable system to get the most out of our Cloud SQL DBs without breaking the bank.
Setup Cloud SQL
You will need a Cloud SQL instance with private IP enabled. This is well documented and can be done entirely in the Cloud Console without the CLI. Follow the links below for detailed instructions, or see the CLI sketch after this list!
- Creating Cloud SQL Instance (Be sure to place it in the same region as your Kubernetes Cluster!)
- Enabling Private IP
- Create a DB User for PgBouncer
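If you prefer the command line, here is a minimal, hedged sketch of the same steps. The instance name, user name, region, tier, and network are all placeholders; adjust the PostgreSQL version and machine size to your needs:
$ # Create a Cloud SQL instance with a private IP only (beta flags at time of writing)
$ gcloud beta sql instances create my-postgres \
    --database-version=POSTGRES_11 \
    --tier=db-custom-1-3840 \
    --region=us-central1 \
    --network=default \
    --no-assign-ip
$ # Create the database user PgBouncer will authenticate as
$ gcloud sql users create pgbouncer-user \
    --instance=my-postgres \
    --password=<strong-password>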
Create Kubernetes Cluster
This step is also straightforward and can also be done without the CLI. Under Node Pools, turn OFF auto-scaling if it is enabled, add the label role: pgbouncer, and create your cluster.
- Creating a Cluster
- Enable VPC-native for Private IPs:
This will allow your cluster to communicate with your Cloud SQL database using private IPs, without traffic going over the public internet (more secure!).
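For reference, a hedged CLI equivalent might look like the following. The cluster name and region are placeholders, node-pool autoscaling is off by default unless you enable it, and --enable-ip-alias is what makes the cluster VPC-native:
$ gcloud container clusters create pgbouncer-cluster \
    --region us-central1 \
    --enable-ip-alias \
    --node-labels=role=pgbouncer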
Setup and Authenticate GCloud SDK Command Line
You will need this to retrieve the credentials used to deploy to your Kubernetes cluster.
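Once the SDK is installed, authenticating is just a couple of commands (the project ID is a placeholder):
$ gcloud auth login
$ gcloud config set project <project-id>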
Install Helm Locally
Helm is a package manager for Kubernetes whose packages, called “charts,” contain templates for deploying workloads onto the Kubernetes platform. Helm uses a service called “Tiller,” which is normally installed on the cluster. If you do any research, you are bound to read about Tiller’s security issues when it is not installed securely in production. I am going to show you how to run Tiller on your local computer to avoid this security issue altogether. The following is based on this great tutorial by Rimas Mocevicius about how to install “Tillerless Helm.” This way you get all the benefits of Tiller without the security issues.
The installation of Helm is pretty straightforward and there are many ways to do it. I prefer the scripted install, like this:
$ curl https://raw.githubusercontent.com/helm/helm/master/scripts/get > get_helm.sh
$ chmod 700 get_helm.sh
$ ./get_helm.sh
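You can verify that the client installed correctly (the exact version printed will vary):
$ helm version --client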
Helm Tiller Plugin Local Install
You will need to initialize Helm locally after installation. The --client-only flag ensures that Tiller is not installed on your cluster:
helm init --client-only
Then you will need to snag your Kubernetes cluster credentials from Google Cloud Platform. This tells Tiller which cluster to connect to and authenticates it:
gcloud beta container clusters get-credentials <cluster-name> --region <cluster-region> --project <project-id>
… And install the Tiller Plugin to run Tiller locally:
$ helm plugin install https://github.com/rimusz/helm-tiller
Installed plugin: tiller
Start Tiller server locally:
You can include any namespace you want, or leave it empty and it will default to kube-system:
$ helm tiller start <namespace>
If it is not working, try this command to tell Helm which port Tiller is running on:
export HELM_HOST=localhost:44134
Stop Tiller when you are finished:
helm tiller stop
Setup your chart for installation
You are now ready to install a chart on the cluster, but first you need a chart! I have modified a chart from futuretech-industries so that it can expose your cluster internally using Google’s internal load balancer, which lets you share PgBouncer with other resources in your VPC over a private IP without ever exposing it with a public IP. Other than that modification, it is the same as the original chart here.
$ helm repo add pgb 'https://raw.githubusercontent.com/fopark/charts/master/'
$ helm repo update
Chart Configuration
Next you will need to configure the values your chart uses to connect and authenticate to your database. Create a new directory on your local machine:
mkdir helm_chart_values
Create a file called pgbouncer-values.yaml
and save it in the above directory to start your chart configuration, so you can tweak and redeploy later on. Paste the following into the YAML file and edit it for your setup:
# The database credentials
username: ""
password: ""
host: x.x.x.x
# The number of instances to run.
replicasCount: 2
# The maximum number of incoming client requests per instance.
maxClientConnections: 5000
# The maximum number of upstream requests per instance.
poolSize: 20
# Resources for the pods. For PgBouncer it's critical that these match to provide guaranteed QoS.
# The chart therefore doesn't allow setting requests and limits separately.
cpu: 150m
memory: 200Mi
# The custom node selector. Allows you to pick a specific Node Pool to deploy your workload onto.
# By default this is off; see below.
nodeSelector:
  role: pgbouncer
# Set this to LoadBalancer to make PgBouncer available outside the cluster, or ClusterIP for cluster-local access only.
serviceType: LoadBalancer
# If you want access to your cluster only from within your internal VPC network, leave the setting below. Otherwise just delete it.
loadBalancerType: "Internal"
# If you use RazorSQL or another DB inspection tool, you might have to inject this config into pgbouncer.ini with the parameter below.
injectConfig: { ignore_startup_parameters: extra_float_digits }
There are many more values that the great folks at Future Tech Industries set up, plus a few that I added. To see all available options, run the following command:
helm inspect values pgb/pgbouncer
Save the file and deploy the chart to your cluster:
$ helm install --name my-pgbouncer --namespace my-pgbouncer-namespace pgb/pgbouncer -f /path/to/pgbouncer-values.yaml
The -f flag overrides the default values with the values inside your pgbouncer-values.yaml file.
Check that your values updated on the Cloud Console, or verify from the command line.
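A quick check with kubectl (the namespace matches the one used in the install command above; for an internal load balancer, the service’s EXTERNAL-IP column will show a private VPC address):
$ kubectl get pods -n my-pgbouncer-namespace
$ kubectl get service -n my-pgbouncer-namespace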
If you update the pgbouncer-values.yaml and want to redeploy, use the following:
helm upgrade <workload-name> pgb/pgbouncer -f helm_chart_values/pgbouncer-values.yaml
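To confirm everything is wired up, you can try a quick connection test through PgBouncer from another machine in your VPC. This is a hedged sketch: <load-balancer-ip> is the internal address from the service above, and the port may differ depending on how the chart exposes the service (PgBouncer’s default is 6432, but charts commonly map it to 5432):
$ psql "host=<load-balancer-ip> port=5432 user=<your-db-user> dbname=<your-db>"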
There you have it! You should have a PgBouncer cluster running w/ Cloud SQL! Stay tuned for my next article, where I will deploy a Python Flask API on Google App Engine flexible environment using PgBouncer and Cloud SQL.
I want to note that I am mainly putting this together from several sources and not claiming to be the content creator for most of the topics discussed above. I just went through a very long night setting this up just the way I wanted it and thought I would contribute, like the awesome folks who wrote the articles I am listing below. Please take a look at those articles, as they go into much more detail on their respective topics!
Sources: