How Many Ways are there to Schedule Code Execution?

Michal Yanko
Dec 28, 2019

In our day-to-day work, we often need to schedule and run periodic tasks. Running a scheduled task may seem trivial with built-in tools such as crond, but sometimes these solutions are not enough. A task that runs in a large-scale production system needs a stable, scalable solution that is easy to monitor.

In this post, we will review different job scheduling solutions. We will focus on the use cases and the pros and cons of each, and on how to choose the one most suitable for our needs.

TL;DR — Need to run a scheduled task? Review the decision-making chart at the end of the article.

Unix Cron Jobs

The first and most basic solution is crond, the time-based job scheduling daemon in Unix-like operating systems. It is built in; to use it, you need to learn how to write a cron expression (which most of the other solutions also require) and add an entry to the crontab file. This solution is very easy to use, but it is typically hard to monitor the executed tasks and their outcomes. Therefore, crond is most suitable for local tasks running on your own computer, for example a daily script that cleans up your personal computer.
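
To make the five fields of a cron expression (minute, hour, day of month, month, day of week) concrete, here is a minimal sketch of a matcher in Python. It is not how crond is implemented; the names `expand_field` and `matches` are made up for this illustration, and it only handles `*`, single values, lists, ranges, and `/` steps. It also simply requires every field to match, whereas real cron treats the two day fields more leniently when both are restricted.

```python
from datetime import datetime

# Allowed value range for each of the five cron fields, in order:
# minute, hour, day of month, month, day of week (0 = Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, low: int, high: int) -> set:
    """Expand one cron field (e.g. '*/15', '1-5', '0,30') into a set of ints."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start_s, end_s = rng.split("-")
            start, end = int(start_s), int(end_s)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, int(step) if step else 1))
    return values


def matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` satisfies all five fields of `expression`."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    minute, hour, dom, month, dow = (
        expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    # cron counts Sunday as 0; isoweekday() counts Monday=1 ... Sunday=7.
    weekday = moment.isoweekday() % 7
    return (
        moment.minute in minute
        and moment.hour in hour
        and moment.day in dom
        and moment.month in month
        and weekday in dow
    )
```

For example, `matches("0 3 * * *", datetime(2019, 12, 28, 3, 0))` is true because the expression means "every day at 03:00", while `"*/15 9-17 * * 1-5"` means "every 15 minutes during working hours, Monday to Friday".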

Pros

  1. Built into most Unix-like operating systems.
  2. Easy to configure.
  3. Can trigger compute-intensive tasks (as long as the machine it runs on can handle them).

Cons

  1. Hard to monitor executions and failures, hence not recommended for production environments.
  2. Uses standard cron expressions only, with no extended syntax. For example, configuring a job to run on the last day of every month is possible but not trivial.
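
One common workaround for the last-day-of-month case is to schedule the job on days 28 to 31 (for example with `0 23 28-31 * *`) and let the script itself decide whether today is really the last day. The sketch below shows that check in Python; `run_monthly_report` is a hypothetical placeholder for the actual task.

```python
import calendar
from datetime import date


def is_last_day_of_month(day: date) -> bool:
    """Return True when `day` is the final calendar day of its month."""
    # monthrange() returns (weekday of the 1st, number of days in the month).
    _, days_in_month = calendar.monthrange(day.year, day.month)
    return day.day == days_in_month


def run_monthly_report() -> None:
    # Hypothetical task, for illustration only.
    print("running end-of-month job")


if __name__ == "__main__":
    # cron starts this script on days 28-31; only the last day does real work.
    if is_last_day_of_month(date.today()):
        run_monthly_report()
```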

Kubernetes “CronJob”

Kubernetes offers a time-based job scheduler that runs scheduled tasks in Docker containers. To run a Kubernetes “CronJob” you need a Kubernetes cluster and a Docker image to run. You define the task’s schedule with a regular cron expression, and Kubernetes takes care of the execution. A Kubernetes “CronJob” can be a good fit for periodic training of machine learning models, since training a model is usually a resource-intensive operation.
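
As a rough illustration of what such a definition contains, the sketch below builds a CronJob manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The job name, image, and schedule are made-up values, and `build_cronjob` is a helper invented for this example. The `batch/v1` API version is an assumption about the cluster: older clusters served CronJob under `batch/v1beta1`.

```python
import json


def build_cronjob(name: str, image: str, schedule: str) -> dict:
    """Return a minimal Kubernetes CronJob manifest as a plain dict."""
    return {
        "apiVersion": "batch/v1",  # assumption: older clusters use batch/v1beta1
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,  # regular five-field cron expression
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "containers": [{"name": name, "image": image}],
                            "restartPolicy": "OnFailure",
                        }
                    }
                }
            },
        },
    }


# Hypothetical weekly training job: every Sunday at 02:00.
manifest = build_cronjob(
    name="train-model",
    image="registry.example.com/train-model:latest",
    schedule="0 2 * * 0",
)
print(json.dumps(manifest, indent=2))
```

Saving the printed output to a file and applying it with `kubectl apply -f` would register the schedule; resource requests and limits, which matter for compute-intensive jobs, are left out to keep the sketch short.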

Pros

  1. Uses the computational resources of the Kubernetes cluster. With proper auto-scaling, you can run compute-intensive tasks once in a while without keeping a dedicated server just for this purpose.
  2. Monitoring: if proper monitoring is configured for Kubernetes, it can be used to monitor the Kubernetes “CronJob”s as well.

Cons

  1. Requires Kubernetes infrastructure and knowledge, which are not always available and are sometimes overkill.
  2. Requires basic knowledge of Docker.
  3. Uses regular cron expressions without support for non-standard scheduling expressions.

Jenkins

Jenkins allows you to schedule complex pipelines in addition to stand-alone tasks. Although Jenkins is capable of running any task, it is not recommended for “business logic” functionality, as it is intended for CI/CD pipelines. A classic scheduled Jenkins CI job is an end-to-end test suite that runs once a day.

Pros

  1. Easy to monitor: the Jenkins UI displays the status of every executed job. Additionally, you can set up notifications for job failures using various plugins.
  2. Easy to configure in an existing Jenkins server.

Cons

  1. Requires an existing Jenkins server.
  2. Should usually be reserved for CI/CD pipelines, not business logic.
  3. Uses regular cron expressions without support for non-standard scheduling expressions.

Serverless

When you want to run small business logic tasks, consider using serverless services such as AWS Lambda or Google Cloud Functions. These services let you run a scheduled task without managing servers. Additionally, AWS’s non-standard cron syntax supports extra wildcards, so schedules such as “the last day of the month” or “weekdays only” are easy to express.
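
The sketch below shows roughly what a scheduled AWS Lambda function looks like in Python, together with two schedule expressions in the AWS six-field format (minute, hour, day of month, month, day of week, year). The expressions belong on the scheduling rule rather than in the function code; they appear here as constants only for reference. The business logic is a placeholder, and the sample event is hand-written to resemble a scheduled event.

```python
import json
from datetime import datetime

# Example AWS schedule expressions, shown for reference only.
LAST_DAY_OF_MONTH = "cron(0 8 L * ? *)"    # 08:00 UTC on the last day of each month
WEEKDAYS_ONLY = "cron(0 8 ? * MON-FRI *)"  # 08:00 UTC, Monday to Friday


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls on each scheduled invocation."""
    # Scheduled events carry the trigger time as an ISO 8601 UTC string.
    triggered_at = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ")
    # Hypothetical business logic would go here.
    message = f"scheduled run for {triggered_at:%Y-%m-%d %H:%M} UTC"
    print(message)
    return {"statusCode": 200, "body": json.dumps({"message": message})}


if __name__ == "__main__":
    # Local smoke test with a hand-written event shaped like a scheduled event.
    sample_event = {
        "source": "aws.events",
        "detail-type": "Scheduled Event",
        "time": "2019-12-31T08:00:00Z",
    }
    print(lambda_handler(sample_event, None))
```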

Pros

  1. Easy to use, with no servers or resources to manage.
  2. Good visibility and easy monitoring of executed tasks. Supports configuring failure notifications.

Cons

  1. Requires cloud services such as AWS or GCP.
  2. Function run time is limited (AWS: 15 minutes, GCP: 9 minutes, at the time of writing), so this solution is not suitable for long-running tasks such as machine learning model training.
  3. Supports a limited set of programming languages (at the time of writing, AWS: Java, Go, PowerShell, Node.js, C#, Python, and Ruby; GCP: Python, Node.js, and Go).

Airflow

“Airflow is a platform created by the community to programmatically author, schedule and monitor workflows.” (Airflow website)

Airflow and similar solutions such as Kubeflow and Luigi allow you to schedule tasks as part of a more complex workflow. You can build pipelines with multiple tasks that depend on each other. These platforms are mostly used for complex ETL (Extract, Transform, Load) jobs and are worth adopting when you have more than a single scheduled task. You can use them to schedule daily pipelines that implement your business logic, for instance a pipeline that moves records to a data warehouse, analyzes them, and sends a periodic email with relevant statistics.
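
To show what "tasks that depend on each other" means in practice, here is a small sketch that uses only the Python standard library (`graphlib`, Python 3.9+). It is deliberately not Airflow's API: a real Airflow pipeline would declare the same steps as operators inside a DAG and would add scheduling, retries, and monitoring on top. The three task functions are placeholders modeled on the warehouse example above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Placeholder tasks; real ones would talk to a database and a mail server.
def move_records() -> str:
    return "records moved to the warehouse"


def analyze() -> str:
    return "statistics computed"


def send_email() -> str:
    return "report emailed"


TASKS = {"move_records": move_records, "analyze": analyze, "send_email": send_email}

# Each task maps to the set of tasks that must finish before it starts.
DEPENDS_ON = {
    "move_records": set(),
    "analyze": {"move_records"},
    "send_email": {"analyze"},
}


def run_pipeline() -> list:
    """Run every task after its dependencies and return the execution order."""
    executed = []
    for name in TopologicalSorter(DEPENDS_ON).static_order():
        print(f"{name}: {TASKS[name]()}")
        executed.append(name)
    return executed


if __name__ == "__main__":
    run_pipeline()
```

The point of the dependency map is that the order is derived, not hard-coded: adding a fourth task only requires stating what it depends on.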

Pros

  1. Allows the creation of complex workflows and the scheduling of tasks that depend on each other.
  2. Good visibility and easy monitoring of executed tasks. Supports configuring failure notifications.
  3. With Kubernetes integration, can support compute-intensive tasks.

Cons

  1. Requires setting up and maintaining Airflow.
  2. Airflow specifically requires Python skills.

Conclusion

There are many ways to schedule a task. Examine your needs and choose the solution that fits them best. A proper solution will schedule your tasks using adequate resources while providing proper monitoring.
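
As a rough summary of the recommendations in this post, the choice can be sketched as a small Python function. This is not the author's decision chart: the parameter names and the order of the checks are assumptions made for illustration, and real decisions will weigh more factors than these four flags.

```python
def suggest_scheduler(
    *,
    local_only: bool = False,
    ci_cd: bool = False,
    has_dependencies: bool = False,
    long_running: bool = False,
) -> str:
    """Map a few yes/no questions to one of the solutions reviewed above."""
    if local_only:
        return "cron"  # personal machine, no monitoring needed
    if ci_cd:
        return "Jenkins"  # scheduled CI/CD jobs such as nightly tests
    if has_dependencies:
        return "Airflow"  # multi-step workflows with dependent tasks
    if long_running:
        return "Kubernetes CronJob"  # compute-intensive or long-running work
    return "Serverless function"  # short business-logic tasks


if __name__ == "__main__":
    print(suggest_scheduler(long_running=True))
```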

Try out the different solutions — you’ll probably find yourself using more than one, based on different use cases.

Got another useful solution for task scheduling? Share with us in the comments!

Thanks to Dana Blanc

Written by Michal Yanko, Software Developer @ Via
