How not to spam your clients with thousands of emails

The background

This is the story of the time I accidentally sent a few thousand emails to a small subset of clients. Each of them received multiple copies of the same email, anywhere from 6 to 72 per person…

I was tasked with creating an email to be sent on the 15th day of every month at 2:00 PM. Simple enough, right? The implementation of the business logic is irrelevant here; what matters is the scheduling. In cron syntax, that schedule can be written as follows:

0 14 15 * *
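For readers who do not read cron every day, here is a minimal sketch that labels the five fields of that expression. The `CRON_FIELDS` constant and the `label_cron` helper are made up for illustration; they are not part of any scheduling gem.

```ruby
# Hypothetical helper: names each of the five standard cron fields.
CRON_FIELDS = %i[minute hour day_of_month month day_of_week].freeze

def label_cron(expression)
  parts = expression.split
  unless parts.size == CRON_FIELDS.size
    raise ArgumentError, "expected #{CRON_FIELDS.size} fields, got #{parts.size}"
  end

  # Pair each field name with the value found at the same position.
  CRON_FIELDS.zip(parts).to_h
end

schedule = label_cron('0 14 15 * *')
# schedule[:minute] is "0", schedule[:hour] is "14", schedule[:day_of_month] is "15";
# month and day_of_week are "*", meaning "any".
```

Read left to right: minute 0 of hour 14 on day 15, in any month, on any day of the week.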

The system is a Rails application, so I was able to rely on the plethora of gems out there to handle scheduling for me. In fact, the system already included a scheduling gem, so I quickly went to the Whenever documentation. As it turns out, Whenever supports cron syntax, so I popped the following line into the scheduler.rb file.

every '0 14 15 * *' do
  # Called my email scheduling class here
end

The code was reviewed, we even checked the documentation to verify everything was correct, and we shipped.

This is where things started going wrong.

Forty minutes after the deploy, a colleague pointed out that he was seeing a spike in emails sent for a template we had never used before. The template was obviously the newly deployed email. After confirming this was the case, we rolled back.

The investigation

Before we start looking at the details of what went wrong, I want to pre-empt what I have been thinking the whole time since the event. “You’re dealing with emails, make sure you check the metrics after you deploy. Make sure you haven’t done something stupid and aren’t sending spam.” Yes, do that. Always do that!

So, what happened?

Turns out that we were using a different scheduling gem, Clockwork. I was so used to using Whenever in previous projects that I didn’t even think to check whether we were using it in this one. I must have also been very convincing in my belief, as I managed to convince my code reviewer that we were using it too!
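A check along these lines, run from a console inside the application, is one way to settle the question before writing any schedule. It is only a sketch: `KNOWN_SCHEDULERS` and `scheduler_gems` are hypothetical names, and it assumes the gems of interest have already been activated so that they show up in `Gem.loaded_specs`.

```ruby
# Hypothetical helper: which well-known scheduling gems are present?
KNOWN_SCHEDULERS = %w[whenever clockwork].freeze

# By default this inspects the gems activated in the current process.
# A list of names can be passed in instead, which keeps it easy to test.
def scheduler_gems(gem_names = Gem.loaded_specs.keys)
  KNOWN_SCHEDULERS & gem_names
end

found = scheduler_gems(%w[rails clockwork pg])
# found contains only "clockwork" for this made-up gem list.
```

Reading the Gemfile does the same job; the point is to look rather than assume.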

There are slight differences in the syntax the two gems support. In this project, Clockwork’s rules lived in a scheduler.rb file, while Whenever’s default file is schedule.rb. Side by side, the two files are ever so slightly different but look very much the same. For example, both gems start a job schedule with the keyword “every”. The most telling sign of which gem you’re using is the opening block in a Clockwork file, which names the gem; Whenever has no equivalent. In a big file it’s easy to overlook. The biggest difference is that Clockwork does not support cron syntax.
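Without cron syntax, the monthly rule has to be phrased differently. One way to think about it is "every day at 14:00, but only if today is the 15th". The sketch below is an assumption about how that could look, not the code we shipped: it writes the date condition as a plain Ruby lambda, the shape that Clockwork's documented `:if` option accepts. The constant name and job name are invented, and `1.day` in the commented call relies on ActiveSupport.

```ruby
# Hypothetical predicate: true only when the given time falls on the 15th.
ON_THE_15TH = ->(time) { time.day == 15 }

# In a Clockwork schedule file the predicate would be wired up roughly like
# this (left as a comment so the sketch runs without the gem installed):
#
#   every(1.day, 'monthly_client_email', at: '14:00', if: ON_THE_15TH) do
#     # call the email scheduling class here
#   end

# Example dates chosen purely for illustration.
due     = ON_THE_15TH.call(Time.new(2024, 1, 15, 14, 0))
not_due = ON_THE_15TH.call(Time.new(2024, 1, 16, 14, 0))
# due is true, not_due is false.
```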

It’s also important to note that Clockwork was used because the whole application runs on Heroku, which does not properly support crontabs.

I initially thought that the scheduling code was not throwing an error. I was mistaken. What was happening instead was the dreaded silent failure:

  1. Clockwork process started up on Heroku
  2. Clockwork process ran my job
  3. Clockwork process failed after running my job because of the invalid schedule syntax
  4. Clockwork process restarted (back to step 1)
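The four steps above can be sketched as a small simulation. Everything in it is a stand-in: `run_scheduler_once` plays the role of one lifetime of the scheduler process, `supervise` plays the role of the platform restarting it, and the restart count is an arbitrary number chosen for the example.

```ruby
# Hypothetical stand-in for one lifetime of the scheduler process.
def run_scheduler_once(outbox)
  outbox << :monthly_email             # step 2: the job runs
  raise ArgumentError, 'bad schedule'  # step 3: the process then fails
end

# Hypothetical stand-in for the platform restarting a crashed process.
def supervise(outbox, restarts:)
  restarts.times do
    begin
      run_scheduler_once(outbox)       # step 1: the process starts up
    rescue ArgumentError
      # step 4: nobody is told about the crash; the loop simply starts over
    end
  end
  outbox.size
end

emails_sent = supervise([], restarts: 5)
# emails_sent equals the number of restarts: one email per crash.
```

Each trip around the loop costs one more copy of the email, and nothing in the loop ever tells anyone.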

This kept happening for the 45 minutes before we rolled back. It was also silent as the bug tracking system used by the application wasn’t connected to the separate process. I only found the failures after searching through the logs for the error I was getting locally:

comparison of Fixnum with String failed (ArgumentError)

I guess Clockwork is not fully to blame for being handed erroneous params and not complaining up front. I do worry, though, that it ran my job first and only then complained. It also makes sense that it does not support cron syntax: it does not actually create a crontab, it is just a process spinning off jobs.
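Here is one plausible way a scheduler could end up running a job first and failing afterwards. This is a toy model built on an assumption, not Clockwork's actual source: `ToyEvent` is a hypothetical class that always fires on its first tick and only compares elapsed time against the period on later ticks.

```ruby
# Toy model only: ToyEvent is hypothetical and is not taken from Clockwork.
class ToyEvent
  def initialize(period, &job)
    @period = period  # expected to be a number of seconds
    @job = job
    @last = nil       # time of the previous run; nil until the first run
  end

  # Fires unconditionally on the first tick. On later ticks it compares the
  # elapsed seconds with the period, which is where a String period blows up.
  def tick(now)
    return unless @last.nil? || (now - @last) >= @period

    @job.call
    @last = now
  end
end

sent = 0
event = ToyEvent.new('0 14 15 * *') { sent += 1 }

event.tick(0)      # first tick: the job runs and one email goes out
begin
  event.tick(1)    # second tick: comparing an Integer with a String raises
rescue ArgumentError => e
  error_message = e.message
end
```

In this model the bad period is never inspected before the first run, so the email leaves before anything has a chance to fail. On a recent Ruby the message names Integer rather than Fixnum.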

Lessons learned

I like to think that it’s OK to make a mistake once as long as you learn from it and do not let it happen again.

I have certainly learned my lesson about checking whether emails are being sent unexpectedly when working on emails. I will also double-check that the gems I am using are the ones I think I am using. I have also spotted a black hole around our Clockwork bug reporting, which I intend to fix so that we are immediately aware should a scheduled task go haywire.
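As a rough idea of what closing that reporting gap could look like, here is a minimal sketch. `with_error_reporting` is a hypothetical helper and `reporter` stands in for whatever bug tracker client the application uses. Clockwork's README also describes an `error_handler` hook, which may be the more natural place for this; the sketch avoids depending on it.

```ruby
# Hypothetical wrapper: report any failure, then let it propagate as before.
def with_error_reporting(reporter)
  yield
rescue StandardError => e
  reporter.call(e)  # tell the bug tracker first
  raise             # then re-raise so the process still fails visibly
end

reported = []
reporter = ->(error) { reported << error.message }

begin
  with_error_reporting(reporter) do
    raise ArgumentError, 'comparison of Integer with String failed'
  end
rescue ArgumentError
  # The error still surfaces here, but it has now been reported as well.
end
```

Re-raising keeps the existing crash behaviour; the only change is that somebody hears about it.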

In the end all was well. We all learned something. The clients weren’t exactly happy, but after a sincere apology I hope they forgave us.

Be vigilant when scheduling and emailing. Happy coding!
