Django, Scheduled Tasks & Queues (Part 1)

Ben Cleary
9 min read · Dec 15, 2017


This initially started out as one long write-up, but I figured it would be easier and more manageable to split it into a few pieces. I hope you enjoy it, or find it useful.

Preface

So let's imagine we work for a company that hosts online auctions, much like eBay. These auctions run for a set amount of time, and when they finish the system emails the customer and the winning bidder. Sounds simple, right?

Well, let's add some complexity and say we want this to happen automatically in the background, at any time of day or night, 365 days a year, indefinitely. We also want a report of emails sent, and a log of jobs run.

Our CTO has tasked us with coming up with a solution that is not only stable and secure but also scalable.

We need the following actions to be completed:

  • Close auctions when the specified end date is reached
  • Send email to auction owner stating last bidding price
  • Update auction owner when a bid has taken place (for the example we will focus only on this task)

and all of these should also adhere to the following:

  • Run unattended
  • Logging and Monitoring
  • Extensibility
  • Easy to maintain

Note

The application we will build is a basic system that allows a user to bid on an auction. It will have no front-end functionality; we are not concerned with custom users, views, or anything else. We are looking purely at the architecture of scheduled tasks and how they integrate with Django.

Our Test Application

Before we dive in, let's set up our simple application.

This code is intended for learning purposes and would not be sufficient for production without a lot of extra work.

We need to setup a few models, signals and configure our admin file. We will start with the following models:

  • Auction (with custom manager)
  • Bids (with custom manager)
  • User (Default Model)

This will give us the core functionality that we want to simulate. We also need to set up signals to trigger events that update the values on the auction model.

It's a shameless plug, but you can get an idea of how signals work from my article on user profiles.

You can use the code in these gists to get you started.
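If you are not following along with the gists, here is a minimal sketch of what the models might look like; the field and manager names are my assumptions, not necessarily what the gists use:

# models.py
from django.conf import settings
from django.db import models
from django.utils import timezone


class AuctionManager(models.Manager):
    def active(self):
        # Auctions whose end date has not yet passed
        return self.filter(end_date__gt=timezone.now())


class Auction(models.Model):
    title = models.CharField(max_length=255)
    owner = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    current_price = models.DecimalField(max_digits=10, decimal_places=2, default=0)
    end_date = models.DateTimeField()

    objects = AuctionManager()

    def __str__(self):
        return self.title


class BidManager(models.Manager):
    def highest_for(self, auction):
        # Highest bid first for a given auction
        return self.filter(auction=auction).order_by('-amount')


class Bid(models.Model):
    auction = models.ForeignKey(Auction, on_delete=models.CASCADE)
    bidder = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    amount = models.DecimalField(max_digits=10, decimal_places=2)
    created = models.DateTimeField(auto_now_add=True)

    objects = BidManager()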

The files above break down the functionality we want to simulate, so let's get this migrated, create a superuser, and put some data in using the Django admin.

So we have three auctions, all of which are currently valid and have no bids. We can quickly test that everything is wired up by putting a bid on one of the items.

Demo application working thus far

Now we have our application set up, we can begin looking at a solution for our task.

Let’s check our toolbox

Your developer toolkit, always good to have a sort out every now and then

Let's see what comes with Django natively before we decide to bring in another tool (it's good practice to stop and look at what you already have before bringing a third-party tool into your stack).

Management Commands

Management commands and cron jobs can be seen as cogs in the big machine; we just need to oil them

Management commands are code executed through the manage.py file; examples include runserver, makemigrations, and migrate. One of their advantages is that they have access to your Django environment, meaning we can tap into any aspect of our application.

Enter BaseCommand


Let's initially plot out how we want it to work and what we can do out of the box to meet the CTO's specifications.

We want it to work like the following:

bid -> update_auction -> add_email_to_queue -> task_picks_up_and_sends

Based on this, we need the following:

  • table for bids
  • flag on bid records to show email has been sent
  • column on bid records to record when it was sent
  • table to act as a queue

So let's amend our existing bid model and create a model for our email queue:

We've added an email_sent boolean field and an email_sent_time field to the bid model, and we also created a simple queue table; this does not need much information, as rows will be populated and then deleted.
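A sketch of those changes, with the same caveat that the exact field names are my assumptions:

# models.py — fields to add to the Bid model
email_sent = models.BooleanField(default=False)
email_sent_time = models.DateTimeField(null=True, blank=True)


# and a new, deliberately thin, queue model
class EmailQueue(models.Model):
    # Rows are created by a signal and deleted once processed
    bid = models.ForeignKey(Bid, on_delete=models.CASCADE)
    recipient = models.EmailField()
    created = models.DateTimeField(auto_now_add=True)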

We want our queue to be populated automatically, so we need to amend our signals.py file. To keep signals.py neat and tidy, we will create a wrapper class containing the logic we require.

I went ahead and created a package called crunchy_classes, and inside it a file called bidding_actions.py.
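Here is a sketch of the wrapper and the signal handler that drives it; the names BiddingActions and bid_placed are mine, as is the app name in the imports:

# crunchy_classes/bidding_actions.py
from crunchy_waffles.models import EmailQueue  # assumed app name


class BiddingActions:
    """Holds the logic we want to run whenever a bid is placed."""

    def __init__(self, bid):
        self.bid = bid

    def update_auction(self):
        # Push the new bid amount onto the parent auction
        auction = self.bid.auction
        auction.current_price = self.bid.amount
        auction.save()

    def add_email_to_queue(self):
        # Queue a notification for the auction owner
        EmailQueue.objects.create(
            bid=self.bid,
            recipient=self.bid.auction.owner.email,
        )


# signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver

from crunchy_waffles.models import Bid
from crunchy_classes.bidding_actions import BiddingActions


@receiver(post_save, sender=Bid)
def bid_placed(sender, instance, created, **kwargs):
    if created:
        actions = BiddingActions(instance)
        actions.update_auction()
        actions.add_email_to_queue()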

With everything in place, we can now test that all our signals are firing correctly and the email queue is filling.

So let's put together our command. To build our own, we need a specific folder structure, as follows:

myappname/
    __init__.py
    management/
        __init__.py
        commands/
            __init__.py

Inside our application folder we need a management folder, and inside that a commands folder; both of these should have an __init__.py file in them for Python to recognise them as packages.

We make a new file in the commands folder named after the command we want; in this case I have called mine process_emails.py.

Side note: Django will automatically register a management command for any .py file in this folder that isn't prefixed with an underscore.

Let's get the code out of the way first:
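The gist isn't reproduced here, so below is a sketch that matches the description that follows; the EmailQueue import, the app name, and the output strings are my assumptions:

# crunchy_waffles/management/commands/process_emails.py
from django.core.management.base import BaseCommand, CommandError
from django.utils import timezone

from crunchy_waffles.models import EmailQueue  # assumed app name


class Command(BaseCommand):
    help = 'Processes the email queue and sends pending notifications'

    def add_arguments(self, parser):
        # Optional --id so a single queue entry can be processed for testing
        parser.add_argument('--id', type=int, required=False)

    def handle(self, *args, **options):
        if options.get('id'):
            emails = EmailQueue.objects.filter(pk=options['id'])
        else:
            emails = EmailQueue.objects.all()

        for email in emails:
            self.send_email(email)
            self.update_bid_row(email.bid)
            email.delete()  # queue rows are removed once processed

    def send_email(self, email):
        try:
            # Real delivery (e.g. django.core.mail.send_mail) would go here
            self.stdout.write('Email sent to — %s' % email.recipient)
        except Exception as exc:
            raise CommandError('Could not send email: %s' % exc)

    def update_bid_row(self, bid):
        # Record whether and when the notification went out
        bid.email_sent = True
        bid.email_sent_time = timezone.now()
        bid.save()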

The class Command extends the BaseCommand class provided by Django in its management package; from there we override:

  • add_arguments
  • handle
  • help

add_arguments(self, parser):

This is the entry point for subclassed commands to add custom arguments, allowing us to add parameters to our management command. It does not have to be overridden.

handle(self, *args, **options):

The handle method is what is executed when the command runs; your logic sits in here. It takes *args and **options as parameters and must be implemented in any subclassed command.

Our handle currently checks whether an id was passed to the command; if so, it runs send_email on that one entry, which is ideal for testing. If no arguments are passed, it processes every email in our queue.

help

This is a property we override to provide a help string describing what the command does.

Custom Methods

We have added two methods: send_email and update_bid_row.

send_email(email):

This is where an email is sent from. We use a try statement to catch any failure and raise a CommandError presenting the reason the email could not be sent.

update_bid_row(bid):

This updates the fields on the bid item to record whether and when an email was sent.

We need to add some data to these tables; I have added a few entries to the email queue. We will first test the command on one of our emails:

python manage.py process_emails --id=2

We should get “Email sent to — admin@admin.com” in the terminal. We can now run the command without arguments to process the rest:

python manage.py process_emails

OK, great, that's working, but we don't want to have to run this by hand every day, so we need to put it into a cron job to run automatically.

The following only applies to Linux based systems:

We need to edit a file called the crontab; this is the central store of jobs that are scheduled to run.

crontab -e

We will set this to run every minute; for a detailed breakdown of cron schedules, a quick Google search will get you where you want to be.

*/1 * * * * /var/path/to/venv/bin/python /var/path/to/my/app/manage.py process_emails

Save your crontab file, and that's it: the server will manage running the job.

At this point, let's review our CTO's requirements against this solution:

  • Update auction owner when a bid has taken place [YES]
  • Run unattended [YES]
  • Logging and Monitoring [MAYBE]
  • Extensibility [YES]
  • Easy to maintain [YES]

An area we can improve is logging and monitoring; this can be enhanced into a more effective tool.

Currently we can use the system cron log on the server; this shows that the job is running and what command was executed, but not the outcome. Sticking with the idea of using Django, let's make a management app inside our project which will simply act as a log for our custom commands.

We will start by putting together a new app, in my case called crunchy_waffles_ops:

python manage.py startapp crunchy_waffles_ops

Let's build out our models.py and admin.py and get them migrated, registered, and visible in the admin panel.

We need a simple model to hold a few details: the title of the task, the count of objects processed, when it started and ended, and the state of the task.
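A sketch of such a model; the state choices and field names are my assumptions:

# crunchy_waffles_ops/models.py
from django.db import models


class ManagementLog(models.Model):
    STATES = (
        ('RUNNING', 'Running'),
        ('COMPLETE', 'Complete'),
        ('FAILED', 'Failed'),
    )

    title = models.CharField(max_length=255)
    processed_count = models.PositiveIntegerField(default=0)
    started = models.DateTimeField(null=True, blank=True)
    ended = models.DateTimeField(null=True, blank=True)
    state = models.CharField(max_length=20, choices=STATES, default='RUNNING')

    def __str__(self):
        return '%s (%s)' % (self.title, self.state)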

Along with this we will make another wrapper class we can reuse; let's make a package called crunchy_classes_ops and a file called management_log.py.
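Something along these lines; ManagementLogger is my name for it, matching the ManagementLog model sketched above:

# crunchy_classes_ops/management_log.py
from django.utils import timezone

from crunchy_waffles_ops.models import ManagementLog


class ManagementLogger:
    """Reusable wrapper that records a command run in the log table."""

    def __init__(self, title):
        # One row per run, stamped as started straight away
        self.record = ManagementLog.objects.create(title=title, started=timezone.now())

    def update_count(self, count):
        self.record.processed_count = count
        self.record.save()

    def finish(self, state='COMPLETE'):
        self.record.ended = timezone.now()
        self.record.state = state
        self.record.save()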

This saves us from writing the same code in every command we make. We need to make a few changes to our initial application to wire it in; these are:

  • Create an instance of our management wrapper to be used every time the management command is run
  • Add a property to our management command called title which will be used by our wrapper class
  • Call the methods from the management wrapper

I've also added a sleep call to simulate network activity, so we can visibly see the record updating; the revised command below shows all of this together.
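Pulling it together, the revised command might look like this, reusing the assumed names from the sketches above:

# crunchy_waffles/management/commands/process_emails.py (revised)
import time

from django.core.management.base import BaseCommand
from django.utils import timezone

from crunchy_waffles.models import EmailQueue  # assumed app name
from crunchy_classes_ops.management_log import ManagementLogger


class Command(BaseCommand):
    help = 'Processes the email queue and sends pending notifications'
    title = 'Process Emails'  # picked up by the log wrapper

    def handle(self, *args, **options):
        log = ManagementLogger(self.title)
        try:
            for index, email in enumerate(EmailQueue.objects.all(), start=1):
                self.send_email(email)
                self.update_bid_row(email.bid)
                email.delete()
                log.update_count(index)
                time.sleep(1)  # simulate network activity so we can watch the record change
            log.finish()
        except Exception:
            log.finish(state='FAILED')
            raise

    def send_email(self, email):
        self.stdout.write('Email sent to — %s' % email.recipient)

    def update_bid_row(self, bid):
        bid.email_sent = True
        bid.email_sent_time = timezone.now()
        bid.save()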

Everything is now in place. To test it again, add a number of bids to generate emails, then run process_emails and look at the management model in the admin; refresh it a few times to see the record change.

Recap

We built our custom management command, implemented a custom log and queue, and tested it. Now we can present the idea to the CTO.

The CTO is pleased with it; after some tweaks and unit and integration tests, he puts it into production, and it's running fine.

Extending the Concept

We should always look at ways to improve, and our code is no exception. I have two ideas:

  • An API endpoint for monitoring the management log (I like to build little desktop clients or mobile apps to see what's happening)
  • Don't call manage.py directly; wrap it in a script that uses flock to take a lock and stop duplicate processes running (a minimal sketch follows)
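For the second idea, a minimal sketch: rather than a separate bash script, the flock binary that ships with util-linux can wrap the command directly in the crontab, skipping a run if the previous one still holds the lock (paths as before):

*/1 * * * * /usr/bin/flock -n /tmp/process_emails.lock /var/path/to/venv/bin/python /var/path/to/my/app/manage.py process_emails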

CTO Update

So the CTO has had this in place for a while now and he likes it, as the whole team can make changes since it's all built on Django and nothing else. But there is now a request for distributing tasks across our infrastructure, real-time processing, and more in-depth monitoring.

It's at this point that we need to assess what is best and whether we need to bring another tool into our stack.

Enter Celery

Celery describes itself as follows, on celeryproject.org:

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

So, off the bat, this tool appears to cover all of the extra requirements. But what does implementing it mean for us, and how does it help?

In part 2 we break down Celery, Django, and RabbitMQ.
