Kubernetes: Cron Jobs
Sometimes your work isn’t transactional. Instead of waiting for a user to click a button and have systems light up, we sometimes need to act at a specific time. Cron Jobs are old hat for most middle tier and backend developers who need to run a job on a schedule. However, if you are using a Kubernetes Cluster you might wonder how hard it is to set up an equivalent of a cron job to handle time based actions.
In this article we are going to look at how to create cron jobs within a Kubernetes Cluster. We will then build an example to see one in action and look at the options that control how Cron Jobs are created and run. Finally we will look at some best practices so you don’t get tripped up.
If you haven’t gone through or even read the first part of this series you might be lost, or have questions about where the code is or what was done previously. Remember, this assumes you’re using GCP and GKE. I will always provide the code and show how to test that it is working as intended.
Creating a Kubernetes Cluster Cron Job
Creating a Cron Job in Kubernetes is very similar to creating a Deployment. If you’ve been reading through the series you know that a Deployment includes some metadata about the Deployment along with metadata around the Pod that you will generate and finally some information on how to scale and manage the Pod. Cron Jobs are similar in that they include basic metadata about the resource, metadata around the Pod the Cron Job will spin up, and some information on how to manage the spinning up of the Pod.
You can see the structure of a CronJob yaml file below. The listing is abridged: the container image and the values of the environment variables are not shown.
kind: CronJob # it is a Cron Job
metadata:
  name: endpoints-cronjob # name of the CronJob
spec:
  schedule: "* * * * *" # run every minute
  startingDeadlineSeconds: 10 # if a job hasn't started in this many seconds, skip
  concurrencyPolicy: Allow # either Allow|Forbid|Replace
  successfulJobsHistoryLimit: 3 # how many completed jobs should be kept
  failedJobsHistoryLimit: 1 # how many failed jobs should be kept
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: cron-container-cronjob
              # environment variables for the Pod
              env:
                - name: GCLOUD_PROJECT
                # endpoint to hit by cron job
                - name: FOREIGN_SERVICE
                - name: NODE_ENV
              ports:
                - containerPort: 80
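The schedule value in the file above uses the usual five-field cron notation: minute, hour, day of month, month, and day of week. As a rough illustration of how such a string maps onto points in time, here is a minimal Python sketch. It is not the parser Kubernetes itself uses, the function names are made up for this example, and it is simplified on purpose: it handles only `*`, numbers, lists, ranges, and steps, with no month or weekday names, and it requires every field to match.

```python
from datetime import datetime


def field_matches(field: str, value: int, minimum: int) -> bool:
    """Return True if one cron field ("*", "*/5", "1,15", "9-17") matches value."""
    for part in field.split(","):
        # Split off an optional step, such as the "/5" in "*/5".
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            # A step on "*" counts from the smallest value the field allows.
            if (value - minimum) % step == 0:
                return True
        elif "-" in base:
            low, high = (int(number) for number in base.split("-"))
            if low <= value <= high and (value - low) % step == 0:
                return True
        elif int(base) == value:
            return True
    return False


def schedule_matches(schedule: str, when: datetime) -> bool:
    """Check a five-field cron schedule against a datetime, to the minute."""
    minute, hour, day_of_month, month, day_of_week = schedule.split()
    # Cron counts weekdays from Sunday = 0; isoweekday() counts from Monday = 1.
    cron_weekday = when.isoweekday() % 7
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(day_of_month, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(day_of_week, cron_weekday, 0)
    )


# "* * * * *" from the CronJob above fires on every minute.
print(schedule_matches("* * * * *", datetime(2024, 1, 1, 12, 34)))    # True
# "*/5 * * * *" fires only on minutes divisible by five.
print(schedule_matches("*/5 * * * *", datetime(2024, 1, 1, 12, 34)))  # False
```

Changing the first field from `*` to `*/5` is all it takes to move the sample from every minute to every five minutes.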
I want to quickly get to seeing a Kubernetes Cluster Cron Job in action. If you are looking for more details and best practices around the Cron Job specific yaml parameters, you can find that information further below.
Kubernetes Cluster Cron Job In Action
Seeing this Cron Job in action is fairly simple. As always I’ve created a working sample that you can run in your Google Cloud Platform project. This example creates a Kubernetes Cluster, creates two containers, deploys one of the containers into the Kubernetes Cluster, and then creates a cron job that will start up and destroy the second container. The cron job container will just hit the main container every minute, adding to a counter within the container. I even set up an endpoint for you to see the counter increment as the cron job is run. To get started go to your Google Cloud Shell Console and run the following commands to set up your Kubernetes Cluster and deploy the first container.
$ git clone https://github.com/jonbcampos/kubernetes-series.git
$ cd ~/kubernetes-series/cron/scripts
$ sh startup.sh
$ sh deploy.sh
$ sh check-endpoint.sh endpoints
Once the check-endpoint.sh script is complete you will have an IP Address for your service that you can hit from your browser.
Now we need to actually deploy the cron job to your Kubernetes Cluster. This is really simple and uses the same command that you would use to deploy any other resource. You can see the script I’ve set up below and follow the link for the actual code.
Now be quick, as the magic has started. Every minute your Kubernetes Cluster is starting up a new Pod with the cron job container. You can see this in your GCP Kubernetes Workloads view.
If you drill into endpoints-cronjob and its events you will see the endpoints-cronjob being created.
After the first minute has passed, the endpoints-cronjob events show the cron job being created and run successfully. When it has completed you’ll see the events change to include the completed event.
If you want to see the data counter increment then you can go to your service’s IP Address and view http://[Your Service IP Address]/data.
If we let the cron job keep running for a while we will just see the events and data continue to grow, with the service’s data view showing an ever larger count.
We could let this keep running forever but at this point I think we can agree that we’ve proved the point. We have now shown a Node.js application that is run on a specific interval by our Kubernetes Cluster.
Best Practices For Kubernetes Cluster Cron Jobs
Thinking back to the yaml file provided, there are some points of interest to take note of. Below are some properties that are specific to the CronJob Kubernetes kind along with some explanations of how to utilize these features to their fullest.
spec.schedule: This is where you set the cron schedule for your Pod creation using common cron notation.
spec.startingDeadlineSeconds: How many seconds after its scheduled time a job is still allowed to start. A run that misses this window is skipped and counted as missed. If Kubernetes misses too many job starts (more than 100) then it logs an error and doesn’t start the job.
spec.concurrencyPolicy: When specifying the concurrency policy you have three options to choose from: Allow, Forbid, and Replace. If you select Allow then you are allowing multiple jobs to run at the same time. Forbid stops multiple jobs from running at the same time: if the previous run hasn’t finished, the new run is skipped. And Replace will replace a currently running job with the new job.
spec.suspend: This optional feature is really good for suspending a cron job without having to delete it. To suspend the cron job just set the value to true.
spec.successfulJobsHistoryLimit: How many successful jobs should be kept. By default only 3 are kept.
spec.failedJobsHistoryLimit: How many failed jobs should be kept. By default only 1 is kept.
spec.jobTemplate: This is the job template for the Job to run when the schedule is hit. I will be following up soon with how to write batch-jobs. One post at a time. :)
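To make the three concurrencyPolicy options more concrete, here is a small Python simulation. It is only a toy model under stated assumptions, not how the Kubernetes controller is implemented: time moves in whole minutes, the schedule fires every minute, every job takes the same fixed time, and the `simulate` function and its log format are invented for this example.

```python
from typing import List


def simulate(policy: str, job_minutes: int, total_minutes: int) -> List[str]:
    """Return a log of what happens at each one-minute tick for a policy."""
    running: List[int] = []  # start times of jobs that are still running
    log: List[str] = []
    for now in range(total_minutes):
        # Drop jobs that have finished by this tick.
        running = [start for start in running if now - start < job_minutes]
        if running and policy == "Forbid":
            # Forbid: the previous job is still busy, so this run is skipped.
            log.append(f"t={now}: skipped (job from t={running[0]} still running)")
            continue
        if running and policy == "Replace":
            # Replace: the running job is cancelled in favour of the new one.
            log.append(f"t={now}: replaced job from t={running[0]}")
            running = []
        else:
            # Allow, or nothing is running: simply start another job.
            log.append(f"t={now}: started")
        running.append(now)
    return log


# Each job takes two minutes, so consecutive runs always overlap.
for policy in ("Allow", "Forbid", "Replace"):
    print(policy, simulate(policy, job_minutes=2, total_minutes=4))
```

With jobs that take longer than the schedule interval, Allow starts a job on every tick, Forbid skips every other tick, and Replace never lets a job run to completion. That last case is worth checking for before you choose Replace for a slow job.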
The best practice to remember is that Kubernetes Cron Jobs aren’t perfect. Sometimes a job doesn’t launch at exactly the moment you want it to, sometimes a scheduled run is started twice, and sometimes two jobs run over one another. To avoid getting bitten by this, make your jobs idempotent: running a job a second time should leave things in the same state as running it once. It also helps to write your jobs so that the order in which they finish doesn’t matter. If you’ve worked through map/reduce problems then that second idea isn’t new. If you haven’t, here is a quick example of order mattering (or not) using math.
// No matter the order you always end up with the same value
1 + 4 + 5 + 9 + 3 + 2 + 1 = 25 // order-independent
// Depending on the order (due to parentheses) you end up with different values
(1 + 4) * 5 + (9 - 3) / 2 + 1 = 29 // order-dependent
1 + (4 * 5) + 9 - (3 / 2) + 1 = 29.5 // order-dependent
1 + 4 * (5 + 9 - 3) / (2 + 1) = 15.667 // order-dependent
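In code, idempotency usually comes down to making a repeated run a no-op. The Python sketch below contrasts two ways a service could record cron runs. Both classes are hypothetical and written for this illustration only; they are not the counter used in the sample project. The first simply increments on every call, while the second keys each run by its scheduled time so that a duplicate or out-of-order run doesn’t change the result.

```python
class Counter:
    """Not idempotent: every recorded run changes the total, duplicates included."""

    def __init__(self) -> None:
        self.count = 0

    def record_run(self, scheduled_time: str) -> None:
        self.count += 1

    def total(self) -> int:
        return self.count


class RunLedger:
    """Idempotent: recording the same scheduled run again has no further effect."""

    def __init__(self) -> None:
        self.seen = set()

    def record_run(self, scheduled_time: str) -> None:
        self.seen.add(scheduled_time)

    def total(self) -> int:
        return len(self.seen)


def replay(store, scheduled_times) -> int:
    """Feed a sequence of runs to a store and return its final total."""
    for scheduled_time in scheduled_times:
        store.record_run(scheduled_time)
    return store.total()


# The 12:01 run was started twice, and the runs arrive out of order.
runs = ["12:01", "12:00", "12:01", "12:02"]
print(replay(Counter(), runs))    # 4: the duplicate is counted
print(replay(RunLedger(), runs))  # 3: the duplicate is ignored
```

The ledger gives the same answer however the runs are ordered or repeated, which is the property you want from anything a Cron Job touches.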
Before you leave, make sure to clean up your project so you aren’t charged for the VMs that you’re using to run your cluster. Return to the Cloud Shell and run the teardown script to clean up your project. This will delete your cluster and the containers that we’ve built.
$ cd ~/kubernetes-series/cron/scripts
$ sh teardown.sh
I was excited to see Kubernetes support cron jobs. With this addition you can build a lot more features into your Kubernetes Cluster without having to bake timers into your containers.
Jonathan Campos is an avid developer and fan of learning new things. I believe that we should always keep learning and growing and failing. I am always a supporter of the development community and always willing to help. So if you have questions or comments on this story please add them below. Connect with me on LinkedIn or Twitter and mention this story.