Batch Workloads in Kubernetes

80% preparing data, 20% complaining about preparing data!

Mukund Krishnan
Analytics Vidhya

--

It is no secret that the digital revolution has been all about collecting data and using it to grow the client base. Big data and analytics are at the core of making intelligent decisions, and clean data is the key to deriving valuable insight. The first step towards gaining insight is gathering and processing data, and batch jobs make that happen.

Everything that runs on Kubernetes is a workload. A workload can be a single component or several. There are several built-in workload resources available, and Kubernetes provides two of them for batch work: the Job object and the CronJob object. A Job creates one or more Pods and keeps retrying execution until a specified number of them terminate successfully. A CronJob, like a crontab entry, runs periodically on a given cron schedule. In simpler terms, CronJobs recur according to a schedule, and Jobs are one-off tasks. Since everything in Kubernetes runs in a Pod, Jobs and CronJobs are executed within Pods as well. There are three main types of jobs in the K8s ecosystem.
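
To make the schedule-versus-one-off distinction concrete, here is a minimal sketch of a CronJob manifest. It assumes a cluster recent enough to serve CronJob under batch/v1 (Kubernetes 1.21 or later); the name nightly-batch, the busybox image, the schedule, and the echo command are illustrative placeholders, not taken from the original article.

```yaml
# Hypothetical example: every name and value below is an illustrative assumption.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-batch
spec:
  # Standard cron syntax: minute hour day-of-month month day-of-week.
  # This runs once a day at 02:00.
  schedule: "0 2 * * *"
  # Do not start a new run while the previous one is still active.
  concurrencyPolicy: Forbid
  # Each scheduled run creates a Job from this template.
  jobTemplate:
    spec:
      template:
        spec:
          # Pods created for Jobs must use Never or OnFailure.
          restartPolicy: OnFailure
          containers:
            - name: batch
              image: busybox:1.36
              command: ["sh", "-c", "echo 'nightly batch run'"]
```

Note how the CronJob wraps a Job template, which in turn wraps a Pod template: the schedule decides when a Job is created, and the Job decides how its Pods run to completion.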

Simple Job

Use this pattern when a script needs to run only once. The Job does its work using Pods.

# save this as simple-job.yaml
apiVersion
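
As a hedged illustration of the run-once pattern, a minimal Job manifest might look like the sketch below. The name hello-once, the busybox image, the backoffLimit value, and the command are assumptions chosen for illustration; they are not the author's original manifest.

```yaml
# Hypothetical example: every name and value below is an illustrative assumption.
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-once
spec:
  # Number of retries before the Job is marked as failed.
  backoffLimit: 4
  template:
    spec:
      # Pods created for Jobs must use Never or OnFailure.
      restartPolicy: Never
      containers:
        - name: hello
          image: busybox:1.36
          # Stand-in for the real one-off script.
          command: ["sh", "-c", "echo 'processing batch' && sleep 5"]
```

Assuming the sketch is saved to a file, it can be submitted with kubectl apply -f <file>, its status checked with kubectl get jobs, and its output read with kubectl logs job/hello-once.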

--

Mukund Krishnan
Analytics Vidhya

Husband, father, engineer, reader, dreamer, tinkerer.