Time of day based notifications with Prometheus and Alertmanager
Whilst this feature is still outstanding (GitHub issue) I have been working around it with the following solution.
I want to prevent low priority alerts being sent at nighttime.
Prometheus has a time() function. However, it only ever returns the time in UTC. Being in the UK (specifically timezone Europe/London), this works fine for some of the year. But once British Summer Time (BST) takes effect, any solution relying solely on this function would need manual adjustment.
Fortunately, Prometheus’ Query Language (PromQL) is sufficiently powerful to remove the need for manual adjustment.
First a reminder of the definition of European Summer Time, which BST follows:
European Summer Time is the variation of standard clock time that is applied in most European countries … in the period between spring and autumn, during which clocks are advanced by one hour from the time observed in the rest of the year…
European Summer Time begins at 01:00 UTC … on the last Sunday in March and ends at 01:00 UTC … on the last Sunday in October each year
We can follow this definition in PromQL like so:
- record: is_european_summer_time
  expr: |
    (vector(1) and (month() > 3 and month() < 10))
    or
    (vector(1) and (month() == 3 and (day_of_month() - day_of_week()) >= 25) and absent((day_of_month() >= 25) and (day_of_week() == 0)))
    or
    (vector(1) and (month() == 10 and (day_of_month() - day_of_week()) < 25) and absent((day_of_month() >= 25) and (day_of_week() == 0)))
    or
    (vector(1) and ((month() == 10 and hour() < 1) or (month() == 3 and hour() > 0)) and ((day_of_month() >= 25) and (day_of_week() == 0)))
    or
    vector(0)
This expression works through each of the four tests, returning 1 if the test is true. If it reaches the bottom it returns 0.
A rough verbal description of the expression:
- Is the month after March but before October?
- Is the month March, and the day after the last Sunday?
- Is the month October, and the day before the last Sunday?
- Is the month October, the day the last Sunday, and the time before 01:00? Or is the month March, the day the last Sunday, and the time after or equal to 01:00?
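If none of the above are true then it is not European Summer Time.
The "last Sunday" tests rely on day_of_month() - day_of_week() giving the date of the most recent Sunday (day_of_week() is 0 on Sundays). March and October both have 31 days, so their last Sunday always falls on the 25th or later.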
I’ll note at this point that though this expression has worked well for me it could probably be refactored.
With the difficult bit out of the way, we can then use our new is_european_summer_time metric to convert the UTC function time() into a Europe/London version:
- record: europe_london_time
  expr: time() + 3600 * is_european_summer_time
We then use this to create a Europe/London version of hour():
- record: europe_london_hour
  expr: hour(europe_london_time)
And finally (for the Prometheus configuration) we create an alert that fires during our desired night period:
- alert: QuietHours
  expr: europe_london_hour >= 23 or europe_london_hour <= 6
  annotations:
    description: 'This alert fires during quiet hours. It should be blackholed by Alertmanager.'
The final pieces of the solution are on the Alertmanager side.
We prevent the QuietHours alert from ever reaching a real receiver:
- name: blackhole
but still use it to inhibit, and therefore prevent the sending of, warning level alerts:
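A minimal sketch of how both pieces might look together in alertmanager.yml is given here. It assumes a hypothetical real receiver named default and that low priority alerts carry a severity: warning label. Neither name is taken from a confirmed configuration, so adjust them to match your own setup.

```yaml
route:
  # Hypothetical catch-all receiver for real notifications (assumed name).
  receiver: default
  routes:
    # Send QuietHours to a receiver with no integrations, so it never notifies anyone.
    - match:
        alertname: QuietHours
      receiver: blackhole

receivers:
  - name: default      # assumed; configure email, Slack, etc. here
  - name: blackhole    # deliberately has no notification integrations

inhibit_rules:
  # While QuietHours is firing, suppress notifications for warning level alerts.
  # The severity label name and value are assumptions about how alerts are labelled.
  - source_match:
      alertname: QuietHours
    target_match:
      severity: warning
```

Because the inhibit rule has no equal list, a firing QuietHours alert suppresses every alert labelled severity: warning, whatever its other labels. Newer Alertmanager releases prefer the matchers, source_matchers and target_matchers fields over the match style fields used in this sketch.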
Though Prometheus + Alertmanager may currently lack proper support for time of day based notifications, PromQL is powerful enough to compensate.
This example showed a simple inhibition-based solution designed for the Europe/London timezone. However, the principles used should transfer to other solutions and timezones.