GCP — Deploying highly available databases with GKE

GKE for Databases

Shashank Agarwal
Google Cloud - Community
2 min read · Apr 7, 2021


Introduction

Containerization has transformed how easily software applications can be packaged. On top of that, Google Kubernetes Engine (GKE) has made managing those containers far easier. In this post we will look at how stateful applications such as databases can take advantage of the same technology.

Architecture Diagram

Databases in GKE

Identify database binary
The diagram above depicts a highly available deployment for any* relational database (such as MySQL, PostgreSQL, or MSSQL on Linux) on GKE. You can find ready-to-use container images for all of these on Docker Hub.
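To make the database pod concrete, here is a minimal Python sketch that builds a single-replica Deployment manifest and prints it as JSON, which `kubectl apply -f` accepts. The names `postgres` and `db-data-pvc`, the image tag `postgres:13`, and the `Recreate` strategy are illustrative assumptions, not values taken from this setup.

```python
import json


def database_deployment(name, image, pvc_name, port, data_path):
    """Build a single-replica Deployment manifest for a database container."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,
            # Assumption: Recreate stops the old pod before starting a new
            # one, so two pods never write to the same data volume.
            "strategy": {"type": "Recreate"},
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        "volumeMounts": [
                            {"name": "data", "mountPath": data_path},
                        ],
                    }],
                    # The data volume comes from an existing PVC
                    # (hypothetical name passed in by the caller).
                    "volumes": [{
                        "name": "data",
                        "persistentVolumeClaim": {"claimName": pvc_name},
                    }],
                },
            },
        },
    }


manifest = database_deployment(
    name="postgres",
    image="postgres:13",
    pvc_name="db-data-pvc",
    port=5432,
    data_path="/var/lib/postgresql/data",
)
print(json.dumps(manifest, indent=2))
```

This sketch leaves out database-specific settings. The official `postgres` image, for example, expects a `POSTGRES_PASSWORD` environment variable, which would normally come from a Kubernetes Secret.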

Creating GKE Cluster
The GKE cluster hosting the database should have a bi-zonal node pool in a single region. A Regional Persistent Disk, on which the HA setup is based, is replicated across two zones of one region, so the cluster needs nodes in both of those zones.
The cluster should also have nodes with spare capacity so that, upon failure, all the pods can move to the HA zone safely.
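The two requirements above can be sketched in Python. The first function assembles a `gcloud container clusters create` command for a regional cluster restricted to two zones. The second is a rough check that one zone alone could hold every pod's CPU request. The cluster name, region, zones, machine type, and CPU figures are placeholders, and the capacity check is a simplification of my own, not part of GKE.

```python
import shlex


def cluster_create_command(name, region, zones, nodes_per_zone, machine_type):
    """Return the gcloud argv for a regional cluster with a bi-zonal node pool."""
    if len(zones) != 2:
        raise ValueError("a bi-zonal node pool needs exactly two zones")
    return [
        "gcloud", "container", "clusters", "create", name,
        "--region", region,
        # Restrict nodes to the two zones that will hold the disk replicas.
        "--node-locations", ",".join(zones),
        # For a regional cluster this is the node count per zone.
        "--num-nodes", str(nodes_per_zone),
        "--machine-type", machine_type,
    ]


def survives_zone_loss(pod_cpu_requests, nodes_per_zone, allocatable_cpu_per_node):
    """Check that one zone's nodes can hold the total CPU requested by all pods.

    This compares totals only. It is a necessary condition, not a guarantee,
    because it ignores memory and how pods pack onto individual nodes.
    """
    return sum(pod_cpu_requests) <= nodes_per_zone * allocatable_cpu_per_node


cmd = cluster_create_command(
    name="db-cluster",
    region="us-central1",
    zones=["us-central1-a", "us-central1-b"],
    nodes_per_zone=2,
    machine_type="e2-standard-4",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
print(survives_zone_loss([2.0, 1.5], nodes_per_zone=2, allocatable_cpu_per_node=3.5))
```

Returning the command as a list rather than a string keeps it safe to pass to `subprocess.run` without shell quoting issues.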

Provision storage for database
In GKE, a database stores its data through a Persistent Volume Claim (PVC). A PVC can be backed by either a standard Persistent Disk (PD) or an SSD Persistent Disk (PD-SSD), and the disk can be either zonal or regional (bi-zonal). For this setup, use a Regional PD or PD-SSD for HA purposes. A Regional PD/PD-SSD automatically maintains two copies of the data, one in each zone (disk 01 and disk 02 in the diagram).
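As a sketch of the storage piece, the Python below builds a StorageClass that requests a regional PD and a PVC that uses it, then prints both as one JSON list for `kubectl apply -f`. It assumes the Compute Engine persistent disk CSI driver (`pd.csi.storage.gke.io`) is enabled on the cluster. The object names, zones, and the `200Gi` size are placeholders.

```python
import json


def regional_pd_storage_class(name, disk_type, zones):
    """StorageClass that provisions a PD replicated across the given zones."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "pd.csi.storage.gke.io",
        "parameters": {
            "type": disk_type,  # e.g. "pd-standard" or "pd-ssd"
            "replication-type": "regional-pd",
        },
        # Delay provisioning until a pod is scheduled, so the disk is
        # created in zones where the pod can actually run.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowedTopologies": [{
            "matchLabelExpressions": [{
                "key": "topology.gke.io/zone",
                "values": list(zones),
            }],
        }],
    }


def database_pvc(name, storage_class, size):
    """PVC requesting a single-writer volume from the given StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


storage = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        regional_pd_storage_class(
            "regional-pd-ssd", "pd-ssd", ["us-central1-a", "us-central1-b"]
        ),
        database_pvc("db-data-pvc", "regional-pd-ssd", "200Gi"),
    ],
}
print(json.dumps(storage, indent=2))
```

The two zones listed in the StorageClass should match the two zones of the cluster's node pool, so that the pod can attach the disk from either zone.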

GKE Service
This is a facade that lets client applications communicate with the database. While there can be various low-level implementations, commonly an Internal Load Balancer (ILB) is created behind the scenes for a GKE Service. Upon failover, the GKE Service automatically reconfigures itself to send traffic to the new GKE pod.
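A Service of this kind might look like the following Python sketch, which prints a manifest that asks GKE for an internal load balancer. It assumes the database pod carries an `app` label and listens on port 5432. The Service name and label value are hypothetical.

```python
import json


def internal_db_service(name, app_label, port):
    """Service manifest that exposes the database through an internal LB."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # GKE annotation requesting an internal (VPC-only) load balancer.
            "annotations": {"networking.gke.io/load-balancer-type": "Internal"},
        },
        "spec": {
            "type": "LoadBalancer",
            # Traffic goes to whichever ready pod matches this label, so a
            # replacement pod in the other zone is picked up automatically.
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


service = internal_db_service("postgres-ilb", "postgres", 5432)
print(json.dumps(service, indent=2))
```

Because the Service selects pods by label rather than by name or IP, clients keep using the same address after a failover.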

Reference Implementation

Deploying highly available PostgreSQL with GKE
Read the article above (written by me) for detailed instructions on this setup using PostgreSQL. The same approach can be adapted to any* common relational database.
