Crontabs in Celery

Published in The Andela Way, Jul 26, 2018

In this article, we pick up where we left off with periodic tasks and introduce crontabs, an advanced scheduling solution for periodic tasks.

Crontabs…periodic tasks on steroids

Timed periodic tasks just won’t cut it in complex use cases. It is in these kinds of scenarios that crontab solutions excel.

The Celery crontab is a time based job scheduler. It schedules tasks to run at fixed times, dates or even intervals in an elegant, flexible manner. The Celery implementation of crontab heavily borrows from the Unix cron which is extremely efficient at all matters scheduling.

The Crontab Syntax

In order to use crontab in Celery, we need to first import it.

from celery.schedules import crontab

crontab has several keyword arguments including minute, hour, day_of_week, day_of_month, month_of_year among others.
Each of the keyword arguments above can accept integer values. The range of these values is restricted, e.g. 0–59 for minute, 0–23 for hour, 1–31 for day_of_month, etc.
Each of the keyword arguments above can also accept crontab patterns, usually written as strings. This is where the power of crontab truly lies, which is why we dedicate a mini-section to them below.
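
As a rough illustration of the range restriction, the sketch below checks an integer against the valid range of its field. It is plain Python rather than Celery's own validation; `FIELD_RANGES` and `check_field` are hypothetical names, and the ranges for day_of_week (0 to 6, with 0 as Sunday) and month_of_year (1 to 12) are assumed from Celery's conventions rather than stated in the text above.

```python
# Valid integer ranges per crontab field (inclusive on both ends).
# Assumption: day_of_week and month_of_year follow Celery's conventions.
FIELD_RANGES = {
    "minute": (0, 59),
    "hour": (0, 23),
    "day_of_week": (0, 6),       # 0 is Sunday
    "day_of_month": (1, 31),
    "month_of_year": (1, 12),
}


def check_field(name, value):
    """Return value if it is valid for the named field, else raise ValueError."""
    low, high = FIELD_RANGES[name]
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


print(check_field("hour", 7))   # 7
```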

crontab patterns

As we mentioned above, crontab patterns are string values. They can be used in the following ways:

To represent every possible valid value in a keyword argument using an asterisk. This is the default value for most of the keyword arguments.

# a job is scheduled to run for every minute of every day
crontab(minute="*")

# a job is scheduled to run for every minute between 1am and 2am
crontab(hour=1, minute="*")

# a job is scheduled to run at one minute past every hour
crontab(hour="*", minute=1)

To represent a range of valid values.

# a job is scheduled to run for every minute in the first quarter of
# each hour
crontab(minute="0-15")

# a job is scheduled to run at 1am on weekdays only
crontab(day_of_week="1-5", hour=1, minute=0)

# a job is scheduled to run on the first five days of every month at
# 7:30 am
crontab(day_of_month="1-5", hour=7, minute=30)

To represent a subset of valid options using a comma delimited string.

# a job is scheduled to run on the first day of each yearly quarter
# at 8:30am
crontab(month_of_year="1,4,7,10", day_of_month=1, hour=8, minute=30)

# a job is scheduled to run on weekends at 12:15 and 00:15
crontab(day_of_week="0,6", hour="0,12", minute=15)

To modify scheduling intervals. This is done using the forward slash character.

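# a job is scheduled to run at the top of every fifth hour
# (00:00, 05:00, 10:00, 15:00 and 20:00); without minute=0 it would
# run every minute during those hours, since minute defaults to "*"
crontab(hour="*/5", minute=0)

# a job is scheduled to run every seventeen minutes, counted from the
# top of each hour (minutes 0, 17, 34 and 51)
crontab(minute="*/17")

# a job is scheduled to run every two minutes during the second
# half of each hour
crontab(minute="30-59/2")

# a job is scheduled to run at the top of every hour from 6am to 6pm
# and at the top of every third hour outside that window
crontab(hour="*/3,6-18", minute=0)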

You may have noticed that crontab patterns allow us to form numerous permutations. This kind of flexibility is what allows us to schedule jobs to run at precisely the moments we need.

Additional crontab information and crontab examples can be found in the official documentation.
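
To make the four pattern forms above more concrete, here is a minimal sketch that expands a single field pattern into the set of integers it matches. It is an illustration written for this article, not Celery's implementation; `expand_field` is a hypothetical helper, and it takes the inclusive lower and upper bounds of the field as arguments.

```python
def expand_field(pattern, low, high):
    """Expand one crontab-style field pattern into the set of matching integers.

    Handles "*", ranges ("a-b"), comma-separated lists and a "/step" suffix.
    """
    values = set()
    for part in str(pattern).split(","):
        base, _, step = part.partition("/")
        if base == "*":
            start, stop = low, high
        elif "-" in base:
            first, last = base.split("-")
            start, stop = int(first), int(last)
        else:
            start = stop = int(base)
        if not (low <= start <= stop <= high):
            raise ValueError(f"{part!r} is outside {low}-{high}")
        values.update(range(start, stop + 1, int(step) if step else 1))
    return values


print(sorted(expand_field("*/17", 0, 59)))       # [0, 17, 34, 51]
print(sorted(expand_field("1,4,7,10", 1, 12)))   # [1, 4, 7, 10]
```

Note that a step such as "*/17" restarts at the lower bound of the field, which is why the gap between minute 51 and the next minute 0 is only nine minutes.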

The crontab parser

Celery has a handy crontab utility called crontab_parser. It takes the number of values a crontab field can hold (60 for minutes, 24 for hours) and, optionally, the field's minimum value (0 by default). Its parse method can then be called with a crontab pattern to return the set of values that the pattern matches.

>>> from celery.schedules import crontab_parser
>>> # minute="*/17"
>>> crontab_parser(60).parse("*/17")
{0, 17, 34, 51}
>>> # hour="*/3,6-18"
>>> crontab_parser(24, 0).parse("*/3,6-18")
{0, 3, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 21}

UTC

Now that we have done a walkthrough of Celery crontab, we need to talk about time. Local time. It’s a component we cannot do without in any job scheduling implementation. When Celery beat schedules jobs to run in the future, it needs to know when the present time becomes that future. It needs to know when to trigger the job.

TL;DR: use Celery 4.2 or later

Before we go any further, you should be aware that Celery, like any other software, is not 100% bug free. Celery version 4.1 has a documented issue; it always uses UTC time regardless of the changes made to the timezone and/or enable_utc settings. A quick perusal of this issue on their GitHub repository suggests that this problem has been addressed in version 4.2.
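
To see why this matters, the sketch below works out when a job scheduled for 7:00 would next fire, first if the scheduler reads the clock in UTC and then if it reads it in local time. It uses only the standard library; `next_daily_run` is a hypothetical helper and the UTC+3 offset is an assumed local zone chosen for illustration.

```python
from datetime import datetime, time, timedelta, timezone


def next_daily_run(now, hour, minute):
    """Return the next hour:minute after `now`, in the timezone of `now`."""
    candidate = datetime.combine(now.date(), time(hour, minute), tzinfo=now.tzinfo)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


# Assumed local zone: a fixed UTC+3 offset with no daylight saving.
LOCAL = timezone(timedelta(hours=3))

now_utc = datetime(2018, 7, 6, 3, 30, tzinfo=timezone.utc)  # 06:30 local time
now_local = now_utc.astimezone(LOCAL)

run_if_utc = next_daily_run(now_utc, 7, 0)      # scheduler thinks in UTC
run_if_local = next_daily_run(now_local, 7, 0)  # scheduler thinks in local time

print(run_if_utc.astimezone(LOCAL))   # 2018-07-06 10:00:00+03:00
print(run_if_local)                   # 2018-07-06 07:00:00+03:00
```

Under these assumptions, a scheduler stuck on UTC would deliver the 7:00 job three hours late by the local clock.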

Now that we’re done with introductions, let’s go at it with an example.

Happy Birthday!

In this example, we are going to create a task that sends birthday messages to employees at 7am on their birth date. First, we need to set up a database. We are going to use MySQL because it has:

a community server that is provided free of charge.
a sample employees database that has the kind of data we need for this example. This database is also provided free of charge.
download and installation instructions that are pretty straightforward.

MySQL Installation and setup

Download and install instructions for the MySQL community server can be found here.

After installation, we need to load up the employees database from archived data. This page documents how to do this.

The app

As with the two previous articles, we will be housing our code in the simple-celery folder.

Navigate into the folder and activate the virtual environment.

# navigate into the folder
cd simple-celery

# activate the virtual environment
source simple-env/bin/activate

Now that the virtual environment is active, we need to install a MySQL connector. This will be the interface between Python and MySQL. We install it as follows:

pip install mysql-connector-python

Next, create a file called birthdays.py and add the following contents.

# birthdays.py

import datetime

import mysql.connector
from celery import Celery
from celery.schedules import crontab

app = Celery('birthdays', broker="pyamqp://guest@localhost//")

# disable UTC so that Celery can use local time
app.conf.enable_utc = False


@app.task
def birthdays_today():
    conn = mysql.connector.connect(
        user='root', database='employees', password='')
    curs = conn.cursor()
    today = datetime.datetime.now()

    # Values are passed as query parameters (the %s placeholders) and
    # not formatted into the SQL string. Building queries with string
    # formatting can introduce an SQL injection vulnerability,
    # especially in a production environment.
    # Please read https://bobby-tables.com/python for more info.
    query = """SELECT first_name, last_name FROM employees
    WHERE month(birth_date)=%s and day(birth_date)=%s;"""
    curs.execute(query, (today.month, today.day))

    for (first_name, last_name) in curs:
        print(
            """
            Hi {0} {1},

            We would like to wish you a great birthday and a
            memorable year.

            From all of us at company ABC.
            """.format(first_name, last_name)
        )

    curs.close()
    conn.close()


# add "birthdays_today" task to the beat schedule
app.conf.beat_schedule = {
    "birthday-task": {
        "task": "birthdays.birthdays_today",
        "schedule": crontab(hour=7, minute=0)
    }
}
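
The SQL injection note in the task above points toward parameterized queries. The sketch below shows the same month-and-day lookup with placeholders, using the standard library's sqlite3 as a stand-in so it runs without a MySQL server. The table contents are made up, `birthdays_on` is a hypothetical helper, and the date functions are SQLite's; with mysql-connector-python the placeholders are written `%s` and the functions are `month()` and `day()`.

```python
import datetime
import sqlite3


def birthdays_on(conn, day):
    """Return (first_name, last_name) rows whose birthday matches day's month and day."""
    query = """SELECT first_name, last_name FROM employees
               WHERE CAST(strftime('%m', birth_date) AS INTEGER) = ?
                 AND CAST(strftime('%d', birth_date) AS INTEGER) = ?
               ORDER BY last_name"""
    # The values travel separately from the SQL text, so they are never
    # interpreted as SQL.
    return conn.execute(query, (day.month, day.day)).fetchall()


# In-memory stand-in for the employees table, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (first_name TEXT, last_name TEXT, birth_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ada", "Example", "1960-07-06"), ("Bob", "Sample", "1975-01-15")],
)

print(birthdays_on(conn, datetime.date(2018, 7, 6)))   # [('Ada', 'Example')]
```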

The command below starts the beat scheduler for the app above, which schedules the task to run every day at 7:00am:

celery -A birthdays beat --loglevel=info

While the beat scheduler is running in the main process, we also need to run the worker processes in a separate terminal window.

# navigate into the folder
cd simple-celery

# activate the virtual environment
source simple-env/bin/activate

# run the celery worker processes
celery -A birthdays worker --loglevel=info

When the time is right, the beat scheduler will place the task in a message broker queue and the first available Celery worker will process this task.
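
The hand-off described above can be modelled in a few lines of standard-library Python. This is only a toy model of the beat, broker and worker roles, not how Celery is implemented: a `queue.Queue` stands in for the broker, a thread stands in for the worker, and the `beat` and `worker` functions are hypothetical.

```python
import queue
import threading

broker = queue.Queue()   # stand-in for the message broker queue
results = []


def worker():
    """Pull task names off the queue until a None sentinel arrives."""
    while True:
        task = broker.get()
        if task is None:
            break
        results.append(f"ran {task}")


def beat(task_name):
    """Stand-in for beat: when a schedule is due, it only enqueues the task."""
    broker.put(task_name)


thread = threading.Thread(target=worker)
thread.start()
beat("birthdays.birthdays_today")
broker.put(None)         # tell the worker to stop
thread.join()

print(results)           # ['ran birthdays.birthdays_today']
```

The model also shows why both commands are needed: without a running worker, the task would simply sit in the queue.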

A successful execution of the task will produce the happy birthday output similar to the one shown below.

...
...
[2018-07-06 07:00:00,632: WARNING/ForkPoolWorker-2] Hi Shirish Stellhorn,We would like to wish you a great birthday and a memorable year.From all of us at company ABC.
[2018-07-06 07:00:00,633: INFO/ForkPoolWorker-2] Task birthdays.birthdays_today[84c61685-ea72-4a00-9a3f-0454f99acf4a] succeeded in 0.5842522100065253s: None
...
...

This message will be displayed for every individual in the employees table whose birthday falls on the current date.

In the real world, this example would probably send emails to the respective individuals instead of just displaying the message on the console.
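
As a rough sketch of that variation, the function below builds the birthday message as an email using the standard library. It only constructs the message; the sender and recipient addresses are hypothetical, and `build_birthday_email` is an illustrative name that is not part of the article's code.

```python
from email.message import EmailMessage


def build_birthday_email(first_name, last_name, to_addr):
    """Build (but do not send) a birthday email for one employee."""
    msg = EmailMessage()
    msg["Subject"] = f"Happy birthday, {first_name}!"
    msg["From"] = "hr@example.com"   # hypothetical sender address
    msg["To"] = to_addr
    msg.set_content(
        f"Hi {first_name} {last_name},\n\n"
        "We would like to wish you a great birthday and a memorable year.\n\n"
        "From all of us at company ABC.\n"
    )
    return msg


msg = build_birthday_email("Shirish", "Stellhorn", "shirish@example.com")
print(msg["Subject"])   # Happy birthday, Shirish!
```

Inside the task, each message could then be handed to an SMTP client such as `smtplib.SMTP(...).send_message(msg)`, with the mail server details depending on your environment.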

Conclusion

Our Introduction to Celery series has come to an end. This was just a high level overview of an interesting package for an equally interesting language. There are dozens of other topics that we couldn’t cover in an introductory series, e.g. monitoring, testing and daemonization, among others. The user guide lists all these topics for those who wish to dig deeper.

As you peruse the official Celery documentation, you may come across content that needs updating, ranging from simple spelling mistakes to misleading information in the examples. Don’t be shy: let the maintainers know by raising an issue on their official GitHub repository. Better still, fix the problem and raise a pull request. This is how open source products survive and thrive.

As usual, the code from this article is available on GitHub.

Happy coding!
