How To Write Cron Jobs That Work
A typical process for writing a cron job:
10 Write Cron Job
20 Hit API endpoint with a couple hundred requests as I wrestle with urllib and JSON decoding and passing through the right variables
30 Don’t write tests because ‘it’s only a cronjob’
40 Push to server (something that’s never automated)
50 Add to crontab, cross fingers that no-one changes their API and that no errors sneak in some other way
60 Inevitably have something break anyway. Don’t get notified until customers start complaining about loss of data
70 GOTO 10
This is a broken process. These days, most tech companies have solid processes for writing and shipping code. But when it comes to cron jobs we throw all caution to the wind.
And it’s not just writing cron jobs that’s broken. Maintenance for the jobs is usually an ad-hoc process. After writing the jobs, you are now also the sole maintainer of a poorly documented and tested piece of code that you wrote in two hours on a Friday afternoon.
After years of doing this and being frustrated, I found a couple of ways to improve this process. Some of these are a bit unusual and possibly not very obvious, but they’ll result in cron jobs that actually work for you and your company.
Don’t write it
That’s right. From a business perspective, a lot of cron jobs aren’t actually worth the trouble. Sure, in the beginning it sounds like it’ll only take a couple of hours to write, if that. And after that, imagine how we never have to do this routine task by hand again!
This is a trap a lot of developers, including myself, fall into often. All. The. Time. We think we can automate a mundane task, but in the process spend a lot more time automating it than we would ever save.
Writing, debugging, testing and setting up the job all take time, and every extra line of code has to be maintained. Often this process isn’t worth it. Especially when you convert time spent into dollars, a company is usually better off NOT writing the cron job, and instead doing the task manually or not at all.
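To make that trade-off concrete, here is a rough back-of-the-envelope sketch in Python. The function name and every number in it are hypothetical, picked only for illustration; plug in your own estimates, and multiply the result by an hourly rate if you want dollars instead of hours.

```python
def net_hours_saved(build_hours, upkeep_hours_per_year,
                    manual_minutes_per_run, runs_per_year, years=1):
    """Hours saved by automating over `years`; negative means automating costs more."""
    manual_hours = manual_minutes_per_run / 60 * runs_per_year * years
    automation_hours = build_hours + upkeep_hours_per_year * years
    return manual_hours - automation_hours


# Hypothetical numbers: a 5-minute weekly task versus a job that takes
# 8 hours to write and ship, plus 4 hours a year of fixes.
saved = net_hours_saved(build_hours=8, upkeep_hours_per_year=4,
                        manual_minutes_per_run=5, runs_per_year=52)
print(f"{saved:.1f} hours")  # prints "-7.7 hours"
```

With these made-up figures the job costs more than seven hours in its first year compared with just doing the task by hand.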
Most cron jobs don’t actually add that much value. Does your sales team really need a revenue email every morning, or can they just log in to the admin area whenever they need that figure?
Having said that, there are many situations when writing a cron job is a good idea. Some jobs need to run with great frequency or at annoying times (anything that’s sub-daily or over weekends/holidays). Some jobs are a large part of the infrastructure and need to be run, no matter what. If any of this is true, keep reading.
Write tests
I know what you’re thinking.
But hold on for a second. Everyone knows the immense value of tests and TDD in customer-facing production software. But it always feels like a waste of time writing them for internal things like cron jobs.
Even a TDD master who wouldn’t change the colour of a button without writing a test first tends to forget tests for “this one quick job I have to write.”
The big problem is that these quick, one-off jobs usually end up being core pieces of infrastructure: the knots (or worse, duct tape) that tie everything together. Treating these bits of code as second-class by not writing tests for them makes your app that much more unstable.
But what if the job you’re about to write really isn’t that important? What if this job only emails a really simple statistic or makes some trivial change in the system?
Even then tests can be valuable for two reasons: ease of development and safety. Ease of development is an obvious one if you’re familiar with TDD, but it bears repeating, especially because cron jobs usually tie multiple different APIs together. Get a bit of information from a SaaS provider, compare that to a database, mail the info. That’s at least three.
To make sure you don’t get stuck in that annoying loop where you’re making hundreds of API requests just because you keep messing around with the urllib library, mock out these services in your tests. Then, when you plug in your API keys, it will only take a few tries to get your cron job running, without making many unnecessary requests.
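As a minimal sketch of what that mocking can look like in Python, the test below swaps out urllib.request.urlopen using the standard library's unittest.mock. The endpoint URL, the fetch_revenue function and the fake_urlopen helper are all hypothetical names made up for this example; your job will look different.

```python
import json
import urllib.request
from unittest import mock

API_URL = "https://api.example.com/revenue"  # hypothetical endpoint


def fetch_revenue(url=API_URL):
    """Fetch and decode the JSON payload the job depends on."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def fake_urlopen(payload):
    """Build a stand-in for urlopen that returns `payload` as JSON bytes."""
    response = mock.MagicMock()
    response.read.return_value = json.dumps(payload).encode("utf-8")
    # urlopen is used as a context manager, so __enter__ must return the response.
    response.__enter__.return_value = response
    return mock.Mock(return_value=response)


def test_fetch_revenue_decodes_json():
    stub = fake_urlopen({"revenue": 1234})
    with mock.patch("urllib.request.urlopen", stub):
        assert fetch_revenue() == {"revenue": 1234}
    # The job asked for the URL exactly once, and the stub answered
    # instead of the real network.
    stub.assert_called_once_with(API_URL, timeout=10)
```

You can run this as many times as you like while you fight with the JSON decoding, and the real API never sees a single request.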
The other reason to write tests for trivial jobs is that most jobs aren’t actually that trivial. I’ve seen many untested jobs that make big, unsanitised requests to a production database. Most, if not all, cron jobs don’t have a sophisticated QA process, and a lot of the time it’s a single developer pushing to master who’s making changes to these jobs. Everyone makes mistakes, but changing untested code that has unrestricted access to a production database is taunting the Tech Duty Gods.
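One way to get that safety cheaply is to run the job's query against a throwaway database before it ever sees production. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The accounts table, its columns and the expire_trials job are assumptions invented for this example, and your production database is probably not SQLite, so treat it as an illustration of the idea rather than a drop-in test.

```python
import sqlite3


def expire_trials(conn, cutoff_date):
    """Hypothetical job: mark trial accounts started before cutoff_date as expired."""
    # A parameterised query keeps the value out of the SQL string itself.
    cursor = conn.execute(
        "UPDATE accounts SET status = 'expired' "
        "WHERE status = 'trial' AND started_on < ?",
        (cutoff_date,),
    )
    conn.commit()
    return cursor.rowcount  # number of rows the job changed


def make_test_db():
    """Build a throwaway in-memory database with the (assumed) schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, status TEXT, started_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO accounts (status, started_on) VALUES (?, ?)",
        [("trial", "2020-01-01"), ("trial", "2020-03-01"), ("paid", "2020-01-01")],
    )
    conn.commit()
    return conn


def test_expire_trials_only_touches_old_trials():
    conn = make_test_db()
    # Only the one trial that started before the cutoff should change.
    assert expire_trials(conn, "2020-02-01") == 1
    statuses = [row[0] for row in conn.execute("SELECT status FROM accounts ORDER BY id")]
    assert statuses == ["expired", "trial", "paid"]
```

A test like this catches the classic mistake of a missing or wrong WHERE clause while the only thing at risk is three fake rows.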
Don’t use Cron Tabs
As founder of Cronner I am obviously biased. But there is a good reason why I built it. Cron tabs lack many obvious features required to build stable, reliable periodic jobs. They are severely lacking in monitoring, fallback, scheduling and other important areas.
Our production software runs on multiple servers with fallback, backups and solid monitoring. Then why do we tend to install our cron tabs as an afterthought, with minimal oversight and on unstable infrastructure? It’s time that we start treating our cron jobs as first-class citizens.