Event-driven design is great, but sometimes you just want a good old scheduler to execute tasks on a regular basis. How do you then prevent multiple instances of a service from doing the same thing?
At some point, a couple of years back, when I was coding a lot more than I do these days, our team needed a simple solution for executing tasks on a scheduled basis. We did not want to use any outside scheduling tools and decided to go for FluentScheduler. The solution described below should work for other schedulers as well.
FluentScheduler works with tasks. A task is basically a piece of code that you want to have executed when the scheduler decides it’s time. In our case that was checking the status of a transaction in an external system and updating our database accordingly. No rocket science, so quickly built, tested and deployed. As this service was important for our business and needed to be always-on, we had several instances running. If one fails, another one takes over. It worked like a charm.
But so did the schedulers.
Because each instance contained the scheduler logic, all of them were checking the external system and updating the database at roughly the same time. That caused extra traffic, plus errors due to locking issues when two instances tried to update the same record. We needed a quick fix and came up with a simple but effective one: use a database to check whether an instance is allowed to run the scheduled tasks.
The idea is based on a simple table with two columns: a process name and a timestamp that stores the time of the last run. The name can technically be omitted if the table is used for only one process. When a scheduled task starts, it checks whether it is allowed to run by calling a stored procedure with the process name and interval as arguments. If the time span between now and the last run is less than the interval, the stored procedure returns 0 (not allowed).
If the interval has expired, the procedure sets the timestamp to the current time and signals that the run is allowed. It does this within a transaction, thus causing the row to be locked. The lock prevents other clients from accessing the record at the same time, so even if multiple instances call the stored procedure at the same time, only one of them will be allowed to execute the scheduled task.
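To make the check concrete, here is a minimal sketch of the same idea in Python, with SQLite standing in for SQL Server. It is not our original stored procedure: the table name `process_lock`, the column names, the function names and the return value 1 for "allowed" are all assumptions made for illustration. The sketch also folds the check and the update into a single conditional UPDATE statement, which is a small variation on the check-then-update transaction described above.

```python
import sqlite3
import time


def create_schema(conn):
    # Hypothetical table: one row per process, holding the time of its last run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS process_lock ("
        "process_name TEXT PRIMARY KEY, "
        "last_run REAL NOT NULL)"
    )
    conn.commit()


def execution_allowed(conn, process_name, interval_seconds, now=None):
    """Return 1 if the caller may run the task, 0 otherwise.

    The value 1 for "allowed" is an assumption; the post only states
    that 0 means "not allowed".
    """
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on an exception
        # Make sure a row exists; last_run = 0 means "never ran".
        conn.execute(
            "INSERT OR IGNORE INTO process_lock (process_name, last_run) "
            "VALUES (?, 0)",
            (process_name,),
        )
        # Claim the run only if the interval has expired. The statement is
        # atomic, so at most one caller can move last_run forward per interval.
        cursor = conn.execute(
            "UPDATE process_lock SET last_run = ? "
            "WHERE process_name = ? AND ? - last_run >= ?",
            (now, process_name, now, interval_seconds),
        )
        return 1 if cursor.rowcount == 1 else 0
```

An instance that gets 0 back simply skips this round and tries again the next time its scheduler fires. The optional `now` argument is only there to make the function easy to test with fixed timestamps.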
We implemented this in Microsoft SQL Server, but any database that supports this kind of locking mechanism will do the trick.
Keeping the code clean
As I wrote earlier, our scheduler works with tasks. To keep things neatly separated, we introduced a new type of task for the scheduler: the MutExTask, which can be used as a decorator for any other kind of task. It contains the logic to check whether the task is allowed to be executed and if so, executes the inner task. The following fragment illustrates this.
The executionAllowedQuery executes the stored procedure mentioned before. It takes the interval and the name (the type name of the inner task) as parameters and returns the result of the call as an object. When execution is allowed, the inner task gets executed.
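As a rough illustration of the decorator idea, here is a minimal sketch in Python rather than in the C# we actually used. The class `MutexTask`, the `Task` protocol with its `execute` method and the shape of the injected query function are hypothetical stand-ins, not FluentScheduler's API. The sketch also assumes the query returns 1 when execution is allowed.

```python
from typing import Callable, Protocol


class Task(Protocol):
    """Hypothetical task interface: anything with an execute() method."""

    def execute(self) -> None: ...


class MutexTask:
    """Decorator task: runs the inner task only if the query allows it."""

    def __init__(
        self,
        inner: Task,
        interval_seconds: float,
        execution_allowed_query: Callable[[str, float], int],
    ) -> None:
        self._inner = inner
        self._interval_seconds = interval_seconds
        # Injected so that the database call stays outside this class.
        self._execution_allowed_query = execution_allowed_query

    def execute(self) -> None:
        # The type name of the inner task doubles as the process name.
        name = type(self._inner).__name__
        result = self._execution_allowed_query(name, self._interval_seconds)
        if result == 1:  # assumption: 1 means "allowed"
            self._inner.execute()
```

Because the check is passed in as a function, the inner task knows nothing about the database, and the decorator can be tested with a fake query that returns 0 or 1.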
Although this is probably not the best solution for all problems, it is a simple one and since many services use a database anyway, it is relatively easy to implement.