Kubernetes: Customized Cron Job with GCP Pub/Sub, Cloud SQL, and Batch Jobs.

Avinash Shrivastava
6 min read · Sep 19, 2018


This POC demonstrates a flow in which messages are published to a message queue (Cloud Pub/Sub) every 15 minutes, based on data fetched from a database (Cloud SQL). The messages are then consumed by a subscriber (a Kubernetes batch Job).

High-level Architecture

There are mainly 3 components to the flow.

Scheduler: Custom Cron

What is it?: This is a crontab running inside a Kubernetes POD under a Deployment, which triggers the Invoker and the Subscriber components of the flow every 15 minutes.

How?: The POD uses the Kubernetes Python client to first delete the Subscriber Jobs and the Invoker Deployment if they exist, and then create a new set.

Authentication?: The POD is authorized to create and delete objects in the cluster via a Kube-config file injected by a ConfigMap.
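The Scheduler's delete-then-create cycle could be sketched roughly as below. This is a minimal sketch, not the POC's actual code: the namespace, object names, and mounted YAML paths are assumptions, and it assumes the official `kubernetes` Python client is available in the image.

```python
# Rough sketch of the Scheduler's 15-minute cycle. The names below are
# hypothetical; the real Deployment/Job names live in the POC's YAML files.
NAMESPACE = "default"
INVOKER_DEPLOYMENT = "invoker"
SUBSCRIBER_JOB = "subscriber"

def recreate_all():
    # Imported lazily so the sketch can be read without the client installed.
    from kubernetes import client, config, utils

    config.load_kube_config()  # reads the ConfigMap-injected kube config
    apps = client.AppsV1Api()
    batch = client.BatchV1Api()

    # Delete the previous cycle's objects; a 404 simply means a clean slate.
    for delete in (
        lambda: apps.delete_namespaced_deployment(INVOKER_DEPLOYMENT, NAMESPACE),
        lambda: batch.delete_namespaced_job(
            SUBSCRIBER_JOB, NAMESPACE, propagation_policy="Background"),
    ):
        try:
            delete()
        except client.rest.ApiException as exc:
            if exc.status != 404:
                raise

    # Re-create from the ConfigMap-mounted YAML (paths are assumptions).
    api = client.ApiClient()
    utils.create_from_yaml(api, "/config/invoker.yaml")
    utils.create_from_yaml(api, "/config/subscriber.yaml")
```

The crontab entry in the POD would then simply invoke this script every 15 minutes.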

Invoker: Message Publisher

What is it?: This is a Kubernetes Deployment which publishes messages to a Cloud Pub/Sub topic.

How?: Uses the GCP Pub/Sub Python client. It first fetches some strings from Cloud SQL via the MySQL Python client; the POD connects to the database through the Cloud SQL Proxy, which runs as a sidecar container. The messages are then published to Cloud Pub/Sub.

Authentication?: Authentication is done with a Service Account key injected by a Kubernetes Secret. The Python client uses the GOOGLE_APPLICATION_CREDENTIALS environment variable, which contains the path to the key file.
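The Invoker's fetch-and-publish step might look like the sketch below. It is an illustration under stated assumptions, not the POC's code: it assumes `pymysql` and `google-cloud-pubsub` as the client libraries, a Cloud SQL Proxy sidecar on localhost, and `DB_USER`/`DB_PASSWORD`/`PROJECT` environment variables (the Secret names created later in this article map onto these).

```python
import json
import os

def row_to_message(row):
    """Serialize one (code, name, city) employee row to Pub/Sub message bytes."""
    code, name, city = row
    return json.dumps({"code": code, "name": name, "city": city}).encode("utf-8")

def publish_rows():
    # Lazy imports: pymysql and google-cloud-pubsub are assumptions about
    # which client libraries the POC uses.
    import pymysql
    from google.cloud import pubsub_v1

    # The Cloud SQL Proxy sidecar listens on localhost inside the POD.
    conn = pymysql.connect(host="127.0.0.1", user=os.environ["DB_USER"],
                           password=os.environ["DB_PASSWORD"], db="cybdb")
    with conn.cursor() as cur:
        cur.execute("SELECT code, name, city FROM employee")
        rows = cur.fetchall()
    conn.close()

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(os.environ["PROJECT"], "cyb-topic")
    for row in rows:
        # .result() blocks until the publish is confirmed.
        publisher.publish(topic_path, row_to_message(row)).result()
```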

Subscriber: Message Consumer

What is it?: This is a Kubernetes Job (batch Job) which consumes and acknowledges messages from a Cloud Pub/Sub subscription. The Job, when triggered, runs 3 identical PODs in parallel, which start consuming messages from Pub/Sub.

How?: Uses a Python client to consume messages from Cloud Pub/Sub. The listener is set to listen for 5 minutes. The Job is marked as completed when the listener exits, and it is terminated after 15 minutes.

Authentication?: The authentication is done with the help of Service Account key injected by Kubernetes Secret as an environment variable. The Python client makes use of the variable GOOGLE_APPLICATION_CREDENTIALS, which contains the key path.
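A 5-minute bounded listener could be sketched as follows, assuming the `google-cloud-pubsub` Python client and the `cyb-sub` subscription created later in this article; the function names are illustrative only.

```python
from concurrent import futures

def handle(message):
    # Print and acknowledge one Pub/Sub message.
    print("Received:", message.data.decode("utf-8"))
    message.ack()

def listen(project_id, timeout_secs=300):
    # Lazy import: assumes the google-cloud-pubsub client library.
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    sub_path = subscriber.subscription_path(project_id, "cyb-sub")
    streaming = subscriber.subscribe(sub_path, callback=handle)
    try:
        # Block for 5 minutes, then stop pulling so the Job can complete.
        streaming.result(timeout=timeout_secs)
    except futures.TimeoutError:
        streaming.cancel()
        streaming.result()  # wait for the shutdown to finish
```

When `listen` returns, the POD's process exits with status 0, which is what lets Kubernetes mark the Job as Completed.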

GCP Environment setup

  • GCP Project created with a user having owner rights.

After creating a project, list it from Cloud Shell and store the project ID in a variable.

export PROJECT_ID=`gcloud config get-value project`

Run the command below to create a Kubernetes cluster.

gcloud container clusters create cyb-cluster --zone us-central1-a \
  --num-nodes 1 --machine-type n1-standard-2 --cluster-version 1.10.7-gke.6

Run the commands below from Cloud Shell to create a topic and a subscription.

gcloud pubsub topics create cyb-topic
gcloud pubsub subscriptions create --topic cyb-topic cyb-sub \
  --ack-deadline=600

Run the command below from Cloud Shell to create a Cloud SQL instance.

gcloud sql instances create cyb-mysql --tier=db-n1-standard-2 \
  --region=asia-northeast1

Create a Service Account

  • Create a new service account.
gcloud iam service-accounts create cyb-sa --display-name "cyb-sa"

List the newly created Service Account to get the SA email id and store it in a variable.

export SA_EMAIL=`gcloud iam service-accounts list | grep cyb-sa | awk '{print$2}'`
  • Assign Roles to Service Account.
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/cloudsql.admin"
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/compute.admin"
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/container.admin"
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/container.clusterAdmin"
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/container.developer"
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/container.serviceAgent"
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/pubsub.admin"
gcloud projects add-iam-policy-binding $PROJECT_ID --member "serviceAccount:$SA_EMAIL" --role "roles/storage.admin"
  • Furnish a new key for the Service Account.
gcloud iam service-accounts keys create key.json --iam-account $SA_EMAIL

NOTE: Save this key, it will be used later on for creating a Secret.

Database setup

After creating Cloud SQL instance following the steps provided in the GCP Environment setup section, we now need to create some sample data. This data will then be referred by our Invoker component to publish messages on Pub/Sub.

  • Create Database.
gcloud sql databases create cybdb --instance=cyb-mysql
  • Create a database user.
gcloud sql users create cybuser --host=% --instance=cyb-mysql \
  --password=cybpass

Run the command below from Cloud Shell. Enter the password "cybpass" when prompted.

gcloud sql connect cyb-mysql --user=cybuser
  • Create sample data.

1. Switch to the cyb database.

use cybdb;

2. Create a sample table.

CREATE TABLE `cybdb`.`employee` (
  `code` INT NOT NULL,
  `name` VARCHAR(45) NOT NULL,
  `city` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`code`)
);

3. Insert data into the table.

INSERT INTO employee (code,name,city) VALUES (1,'good','Pune');
INSERT INTO employee (code,name,city) VALUES (2,'bad','Mumbai');
INSERT INTO employee (code,name,city) VALUES (3,'ugly','Delhi');
quit;

Kubernetes Setup

  • Connect to Kubernetes cluster.
gcloud container clusters get-credentials cyb-cluster \
  --zone us-central1-a --project $PROJECT_ID
git clone https://github.com/avish1990/GCP-GKE-CustomCronPOD-SQL-PubSub-K8Jobs.git

Get your Cloud SQL instance connection name and update your Invoker component YAML file with it.

Use the commands below to replace the "<CLOUD_SQL_INSTANCE_NAME>" placeholder with the SQL connection name in invoker.yaml.

export CONNECTION_NAME=`gcloud sql instances describe cyb-mysql | grep connectionName | awk '{print$2}'`
sed -i "s/<CLOUD_SQL_INSTANCE_NAME>/$CONNECTION_NAME/g" ~/GCP-GKE-CustomCronPOD-SQL-PubSub-K8Jobs/invoker/invoker.yaml
  • Create Secrets

You need two Secrets to enable your Kubernetes Engine application to access the data in your Cloud SQL instance:

The cloudsql-instance-credentials Secret contains the service account key.

The cloudsql-db-credentials Secret provides the database user account and password.

To create these Secrets:

1. Create the cloudsql-instance-credentials Secret, using the key file you downloaded previously:

kubectl create secret generic cloudsql-instance-credentials \
  --from-file=credentials.json=key.json

NOTE: The key.json was created in the service account section.

2. Create the cloudsql-db-credentials Secret, using the name and password for the database user you created previously.

kubectl create secret generic cloudsql-db-credentials \
  --from-literal=username=cybuser \
  --from-literal=password=cybpass

3. Create a secret with your Project ID.

The Python libraries in both the Subscriber and Invoker PODs need the GCP Project ID. Use the command below to create a Secret, which will be mounted as an environment variable.

kubectl create secret generic gcp-project-id \
  --from-literal=PROJECT=$PROJECT_ID
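Inside the PODs, the Python code can then read the Project ID from the environment. A minimal sketch, assuming the Deployment/Job YAML maps the Secret's PROJECT key to an environment variable of the same name:

```python
import os

def get_project_id():
    # PROJECT comes from the gcp-project-id Secret, exposed as an env variable.
    project = os.environ.get("PROJECT")
    if not project:
        raise RuntimeError("PROJECT is not set; check the Secret and the env mapping")
    return project
```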
  • Create ConfigMap

1. Kube-config

Create a ConfigMap to inject your Kube config file into your Scheduler POD. The Kube config file contains your Kubernetes cluster info, context, and public certificate. The Scheduler POD uses this file to authenticate to the Kubernetes cluster and then create and delete the Deployment and Jobs.

This file is generated automatically in your Cloud Shell when you connect to your Kubernetes cluster and is stored at the default location "~/.kube/config".

cd ~/.kube
kubectl create configmap kube-config --from-file=config

NOTE: You can either mount this Kube-Config as a ConfigMap or embed directly into your Scheduler Docker image.

2. Invoker-config

Create a ConfigMap to inject your Invoker component's Deployment YAML into your Scheduler POD. The Scheduler POD uses this YAML to create and delete the Invoker Deployment. Run the commands below.

cd ~/GCP-GKE-CustomCronPOD-SQL-PubSub-K8Jobs/invoker
kubectl create configmap invoker-config --from-file=invoker.yaml

3. Subscriber-config

Create a ConfigMap to inject your Subscriber component's Job YAML into your Scheduler POD. This will be used by the Scheduler POD to create and delete Jobs. Run the commands below.

cd ~/GCP-GKE-CustomCronPOD-SQL-PubSub-K8Jobs/subscriber
kubectl create configmap subscriber-config --from-file=subscriber.yaml

Deployment

  • Create Scheduler Deployment

Run the command below to create the Scheduler Deployment.

kubectl apply -f ~/GCP-GKE-CustomCronPOD-SQL-PubSub-K8Jobs/scheduler/scheduler.yaml        
  • Check PODs status.

As soon as the Scheduler Deployment is applied, the Scheduler POD is created, which then triggers the Invoker and Subscriber after 15 minutes.

Use the command below to check the PODs' status.

kubectl get pods
Desired output.

The Job finishes after 5 minutes and is marked as Completed.

Job Completed
  • Validation

You can see your published messages in the Subscriber PODs' logs. Use the command below to check the logs of every POD.

for i in `kubectl get pods -l app=cyb | grep -v NAME | awk '{print$1}'`; do
  echo "--- POD-$i ---"
  kubectl logs $i | grep Received
done
