This is an article about the OutSystems Platform, and specifically, about Timers (and when they run).
What is this Timers you speak of?
Timers are the OutSystems Platform’s batch jobs¹. They run asynchronously, are allowed to run for a long time, and are typically used to churn away at data. They are invaluable to any serious business using the OutSystems Platform. Lately I’ve seen much confusion about how Timers are handled by the Platform, when they are scheduled, and how often they are restarted in case of failure, so I decided to write an article that seeks to clarify all that.
One of the central components of the OutSystems Platform is the Scheduler. The Scheduler is, ultimately, a Windows Service called OutSystems Scheduler Service. In Service Center, the Environment Health page of the Monitoring menu shows its status: there should be a happy green check mark below it.
In order to describe what the Scheduler does though, we first need to take a look at Timers themselves, and how they can be configured.
Timers, timers everywhere
Timers are created in Service Studio like any other logic. To create a Timer, select the Processes Tab, right-click the “Timers” folder and select “Add Timer”.
The first thing to notice after creating a new Timer is that it has several properties:
Worth taking a closer look at are the following:
- Name: the Timer’s name. It is used as part of the Timer’s Wake Action;
- Action: this is the Server Action that is called when the Timer is started. It contains the logic of the Timer. If you select New Server Action, Service Studio creates a new Server Action, with a name equal to the Timer’s name. If you subsequently change the Action’s name, the Timer’s name is also updated. If you then change the Timer’s name, luckily the Action’s name stays put;
- Schedule: this is the Timer’s schedule, which defines how often the timer will run and when. Note that any time specified here is the local time of the server, so it’s a good thing to know in which time zone it runs;
- Timeout in Minutes: this is the maximum time allotted to the Timer. If it runs longer, it will be killed off;
- Priority: the Timer’s priority. When there are more Timers ready to run than the maximum number of Timers that are allowed to run concurrently (of which more later on), the Timers with the highest priority get to run first.
The initial schedule is set via the schedule editor pop-up, which makes it easy to create a schedule. Note however that the schedule can be overridden later on in Service Center.
The “When Published” option allows the Timer to run after its Module has been published, which is ideal for stuff like bootstrapping.
Once the Timer logic has been written, tested and deployed, it’s ready for the Timer to run. If a schedule was specified during development time, a newly created Timer will get that schedule. If no schedule was specified, the Timer won’t run at all, until you manually specify a schedule in Service Center. To do so, navigate to the Module’s page (via Factory / eSpaces), click the Timer tab, and click the specific Timer.
On the Timer’s details page, you are greeted with a myriad of options.
At the top, there’s the Timer’s Status: it shows whether the Timer is activated, i.e. ready to run, or deactivated, i.e. it won’t run, no matter what the schedule says. Next, there’s the Default Schedule. This is the schedule that was set at development time as explained above. The Effective Schedule is, as the name implies, the schedule that’s currently active. It shows the same parameters as the Service Studio pop-up (but with slightly different names to add to the confusion, e.g. the pop-up’s Occurs is Service Center’s Days), and like the pop-up, the available information changes with the Days setting.
A neat option that Service Center offers but Service Studio does not, is to generate time intervals at which the Timer runs, e.g. every ten minutes. The minimum interval is five minutes, which makes sense, as Timers are meant for batch jobs that take some time.
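To get a feel for what such a generated interval schedule amounts to, here is a minimal Python sketch. It is not how Service Center does it internally; the function name, the HH:MM format and the start/end parameters are my own assumptions, and only the five-minute minimum comes from the text above.

```python
from datetime import datetime, timedelta

# Minimum interval mentioned in the article; everything else is hypothetical.
MIN_INTERVAL_MINUTES = 5


def interval_times(start, end, every_minutes):
    """Return HH:MM run times from start to end (inclusive), every_minutes apart."""
    if every_minutes < MIN_INTERVAL_MINUTES:
        raise ValueError("interval must be at least %d minutes" % MIN_INTERVAL_MINUTES)
    current = datetime.strptime(start, "%H:%M")
    last = datetime.strptime(end, "%H:%M")
    times = []
    while current <= last:
        times.append(current.strftime("%H:%M"))
        current += timedelta(minutes=every_minutes)
    return times
```

Calling `interval_times("08:00", "09:00", 10)` yields seven run times, from 08:00 up to and including 09:00.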
The page also shows when the Timer ran last, and how long it took, and also (if it isn’t deactivated) when the next run is scheduled. If it is actually running right now, that’s also shown (Running By is the date/time when it started, not the person who started it).
Below all the information there are five buttons. These are:
- Apply: save any changes you made to the schedule;
- Cancel: cancel any changes you made;
- Refresh: refresh the page. Useful for refreshing the “Last Run”, “Next Run” and “Running Since” information;
- Run Now: directly start this Timer, regardless of schedule or status;
- Activate/Deactivate: as the name implies, activates a deactivated Timer, or deactivates an active Timer.
Apart from via the schedule and the “Run Now” button on the Timer’s details page, it’s also possible to run a Timer programmatically. This is done via the Timer’s Wake Action, which is shown below it in the Tree.
You can drag the Action onto the canvas, or select it via the Select Action pop-up, where it is available below the Timers folder.
This comes in handy when you want a user to be able to start a Timer, e.g. after they have uploaded a large Excel file that needs to be processed, or inside a Timer Action itself, when the Timer is about to hit its time-out, wakes itself and then ends. The way these manually started Timers interact with the scheduled ones is described later on.
Configuring like a boss
Now that we know there’s a Scheduler, and how to create Timers and schedule them, let’s take a look at how the Scheduler schedules Timers.
First, there are a few configuration options that need to be taken into account, the first of which is the number of Timers that can run concurrently. The Platform imposes a limit on the number of different Timers that can run at the same time, for performance reasons. You wouldn’t want twenty badly scheduled Timers to run all at once, causing your server to come to a grinding halt. By default, three Timers may run in parallel. Note that these are three different Timers. The same Timer can never run more than once at the same time. If you need concurrent data processing, BPT is the way to go. If you want to configure a different number, you need access to the Platform Configuration Tool. On the Scheduler tab, there’s a Max. Concurrent Timers setting².
Secondly, there’s the number of execution attempts. This number specifies how often the Timer is run in case it fails while running. “Failing” here means either that an uncaught Exception occurs, aborting the Timer prematurely, or that the Timer is terminated by the Scheduler because it didn’t finish before the configured time-out. In either case, if the number of execution attempts is set to a value larger than 1 (default is 3), the Scheduler schedules the Timer again, until it either succeeds or the number of attempts is up.
The number of execution attempts is configured in Service Center, on the Environment Configuration page (Administration menu), Timer Execution Attempts setting.
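The retry rule boils down to something like the Python sketch below. Keep in mind this is only an illustration: the real Scheduler re-queues the Timer rather than looping in place, and `run_with_attempts` and `timer_action` are names I made up.

```python
def run_with_attempts(timer_action, max_attempts=3):
    """Call timer_action until it succeeds or max_attempts is used up.

    Returns a tuple (attempts_made, succeeded).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            timer_action()
            return attempt, True
        except Exception:
            # An uncaught exception counts as a failed attempt; in the
            # Platform, a time-out would count as a failure as well.
            continue
    return max_attempts, False
```

With the default of three attempts, a Timer that fails twice and then succeeds uses up all three attempts; a Timer that keeps failing is abandoned after the third.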
Schedulin’ like there’s no tomorrow
As discussed above, every Timer that’s enabled and has a schedule is up for scheduling by the Scheduler. As soon as a Timer becomes Active (or is published for the first time), the Next Run date/time is calculated and stored. The Scheduler frequently looks (several times a minute) at all Active Timers’ schedules, and determines the ones that are eligible for running by comparing the Next Run date/time with the current date/time. Any Timer whose Next Run is equal to, or older than, the current date/time is put in the Scheduler’s internal run queue.
Next, the Scheduler looks at the Timers in the queue, sorted by Priority (highest first) and Next Run date/time (earliest first). If there are fewer Timers currently running than the configured maximum number of concurrent Timers, the first eligible Timer is scheduled for running immediately, and the Scheduler instructs the Platform to run it³. After the Timer has been started, the next eligible Timer is considered (and started if possible), and so on.
Note that, given the above rules, the scheduled time is only a “nice to have”: there’s no guarantee the Timer will actually run at the Next Run date/time, as the Scheduler schedules Timers on a best-effort basis. The actual date/time might deviate significantly (especially if more Timers are eligible than the configured maximum number of concurrent Timers).
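The selection step can be pictured with a small Python sketch. This is my reading of the behaviour, not the Scheduler’s actual code: the `Timer` class, the `timers_to_start` function and the numeric ranking of the priorities are all assumptions, with a more important Timer getting a lower rank number so that it sorts first.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed ranking: a lower number means the Timer is considered earlier.
PRIORITY_RANK = {"High": 0, "Normal": 1, "Low": 2}


@dataclass
class Timer:
    name: str
    priority: str
    next_run: datetime
    active: bool = True
    running: bool = False


def timers_to_start(timers, now, max_concurrent=3):
    """Pick the eligible Timers that fit in the free concurrency slots."""
    running = sum(1 for t in timers if t.running)
    free_slots = max(0, max_concurrent - running)
    # Eligible: active, not already running, and Next Run is now or in the past.
    eligible = [t for t in timers
                if t.active and not t.running and t.next_run <= now]
    eligible.sort(key=lambda t: (PRIORITY_RANK[t.priority], t.next_run))
    return eligible[:free_slots]
```

Any eligible Timer that doesn’t fit in a free slot simply waits for a later check, which is exactly why the Next Run is best effort only.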
When a Timer successfully finishes execution (i.e. on time and without throwing an uncaught Exception), the Scheduler calculates the Timer’s next run date/time (if the Timer has a schedule that’s not “When Published”) and updates its Next Run accordingly (after which it again partakes in the eligibility checks as outlined above), unless there’s already a Next Run set that’s not the one used for starting the current run. The latter is important in case of manual Timer runs (see below).
When a Timer does not finish successfully, the Scheduler checks how many attempts have already been made to run the Timer. If the maximum number of execution attempts has not yet been reached, the Timer is scheduled for another attempt. If it has been reached, the Scheduler gives up and, like for successful executions, calculates and updates the Next Run.
Mixing it up
The above is all rather straightforward, but there’s one situation that’s slightly more complex: the earlier described “manual” Timer runs. The way these are handled is pretty trivial though: when a WakeTimer Action is executed, the Next Run is overridden with the current date/time, making the Timer immediately eligible for running. After the Timer is finished, the Scheduler calculates the Next Run just like after any other Timer run (unless there was another manual run scheduled, as explained above). This results in the following behaviour:
- When a Timer is running because of its schedule, and a manual run is requested, the Next Run is set to the current date/time. When the Timer has finished, the Scheduler detects the Next Run has been altered since the Timer was started, and therefore doesn’t recalculate the Next Run. This makes the Timer directly eligible for running (so it will start again soon after finishing). When the “manual” run has finished, the Next Run will be set according to the Timer’s schedule (unless, during its run, another manual run is requested).
- When a Timer is running, starting it manually multiple times just updates the Next Run multiple times. As soon as the Timer has finished running, it becomes eligible again, and will be run one more time (unless, again, during its run, another manual run is requested).
- When a Timer is scheduled, but before the date/time of the Next Run a manual run is scheduled, the Next Run will be overwritten. When the Timer finishes, if the “old” Next Run is still in the future, the Scheduler will again calculate and store that date/time.
- However, if the old Next Run is in the past (which means the manual run was still running when normally the scheduled run should have taken place), that run is skipped, as the Scheduler will calculate the Next Run based on the schedule, and therefore always in the future.
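The cases above all follow from two simple rules: waking a Timer overwrites the Next Run, and finishing a run only recalculates the Next Run if nobody touched it in the meantime. A minimal Python sketch, assuming a hypothetical daily 02:00 schedule (`next_daily_run`) and a made-up `Timer` class that only tracks the Next Run:

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour=2):
    """Next occurrence of hour:00 strictly after now (stand-in schedule)."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


class Timer:
    def __init__(self, next_run):
        self.next_run = next_run

    def wake(self, now):
        # Manual run: override Next Run with the current date/time.
        self.next_run = now

    def start(self):
        # Remember which Next Run value triggered this run.
        return self.next_run

    def finish(self, started_with, now):
        # Recalculate only if Next Run was not altered during the run.
        if self.next_run == started_with:
            self.next_run = next_daily_run(now)
```

For example, if a run started by the 02:00 schedule is woken at 02:10 and finishes at 02:30, the Next Run stays at 02:10, so the Timer is immediately eligible again. Only after that second run finishes does the Next Run move to 02:00 the next day.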
Run Timer, run!
Now that we’ve established how and when Timers run, you may ask yourself “but where do they run?” This seems like a trivial question: if you have a single front-end server, that’s where they run. However, in case you have multiple front-end servers, it becomes less trivial: it is possible to designate a single or multiple servers for running Timers, while others won’t run Timers. To configure this, Service Center has the Front-end Servers configuration page (Administration menu). When you access the page, it has an overview of all configured front-end servers, showing whether they are enabled, whether they execute Timers and Processes, and whether they send e-mails.
When you click on a server name, Service Center shows the details page, allowing you to configure the various options.
Note the “Execute Timers” checkbox: as the help text below it says, if it’s checked, the server can run timers, if it’s not, it can’t. That is to say, the Scheduler service still runs on the front-end, but it knows it’s not allowed to schedule Timers.
If you have multiple front-ends that can run Timers, you never know where they end up running. The Schedulers of these front-ends collaborate to run Timers, so that no single server ends up running all of them. The precise algorithm is a black box, and probably only known to the OutSystems engineers maintaining the Scheduler service (though it might be as simple as first come, first served).
So why would you want to deny a server to run Timers? Well, most obviously for performance reasons. A front-end server that’s busy serving pages to users may not be the best server to run heavy batch jobs on. So it is common practice to have one or more servers dedicated to serving user pages, and another one or more dedicated to running Timers.
I spy with my little eye…
In an ideal world, everything is running a-ok, and there’s little need to keep an eye on Timers: if they are programmed well, and performance of the infrastructure is fine, they just churn away and do what they’re destined to do. However, in such an ideal world, there are probably also rainbow-coloured unicorns frolicking about — and unfortunately, I’ve never come across them. In reality, it is pretty important to be able to keep an eye on Timers and whether they’re running fine.
There are several places in Service Center where you can see what’s going on. First and foremost, there is of course the Error Log. If a Timer runs into trouble, the Error Log shows the offending error, and the Scheduler Service will tell whether a retry will be scheduled or not⁴.
Clicking on Detail may tell you a little more about what caused the Timer to abort.
Secondly, there’s the already mentioned Environment Health page. Apart from showing whether the Scheduler is up and running, it also shows a list of twenty Timers that are next in line for running, and any Timers that are actually running.
Thirdly, there’s the Timer Log. The Timer Log shows all Timers that have finished running (whether successful or not), with their starting time, original scheduled time (“Should Have Run At”) and their Next Run. If they were not successful, an “error” link is present, which redirects to the same page as the one from the Error Log.
¹ Or, if you prefer, cronjobs or tasks.
² Since this tool runs on the server itself, you cannot access it if your Platform is cloud-based. In that case, contact OutSystems Support if you need to change it.
³ Since the Scheduler is a Windows Service, it cannot start Timers directly. Instead the Platform exposes a dedicated timer web service, which is called by the Scheduler when it needs to start a Timer.
⁴ Note that the way the Scheduler schedules retries is a bit odd. Since the property that specifies the maximum number of tries is called “Timer Execution Attempts” (as shown earlier), and the first time the Timer is started is also the first attempt, the number of retries should be one less than the number of attempts (the first retry is, after all, the second attempt). Nevertheless, the logging shows that the Scheduler actually schedules as many retries as there are attempts, but immediately aborts at the start of the last retry, since the maximum number of retries has been reached (as is also visible in the logging, see the screenshot). After that, the next run is calculated (that is, the next “normal” run, not a retry).