Prometheus Alerting with AlertManager

Sylia CHIBOUB
Published in DevOps Dudes
May 26, 2020

In my previous article, I covered monitoring with Prometheus and Grafana. However, monitoring is incomplete without alerting. That’s why, in this article, I’ll cover alerting with the AlertManager integrated with Prometheus.

Introduction

AlertManager is a single binary that handles the alerts sent by the Prometheus server and notifies end users through e-mail, Slack or other tools. In this article, we will only discuss Slack and e-mail notifications.

Monitoring helps predict potential problems, or flag current ones, in our system and gives details about them. Alerting notifies the team as soon as a problem occurs, so that it can be identified and handled right away.

Setting up alerting with Prometheus involves the following steps:

  1. Set up and configure AlertManager.
  2. Edit the Prometheus configuration file so that Prometheus can talk to AlertManager.
  3. Define the alert rules in the Prometheus server configuration.
  4. Define the notification mechanism in AlertManager to send alerts via Slack and e-mail.

Architecture

Here is a basic architecture of Alertmanager with Prometheus.

Alert rules are defined in the Prometheus configuration. Prometheus simply scrapes (pulls) metrics from its client application (the Node Exporter). When an alert condition is met, Prometheus pushes the alert to AlertManager, which handles it through its pipeline of silencing, inhibition, grouping and sending out notifications.

Silencing mutes alerts for a given time: incoming alerts are checked against the active silences, and if a match is found, no notification is sent. Inhibition suppresses notifications for certain alerts when other alerts are already firing. Grouping combines alerts of a similar nature into a single notification, which prevents a burst of simultaneous notifications from reaching receivers such as e-mail or Slack.
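To make the silencing step more concrete, here is a small sketch using amtool, the command-line client that ships in the AlertManager tarball. The matcher, duration, author and comment are illustrative values I picked for the example, and x.x.x.x stands for the AlertManager host as everywhere else in this article.

```shell
# Assumption: AlertManager is reachable at this address.
AM_URL="http://x.x.x.x:9093"

# Mute every alert named HostHighCpuLoad for two hours
# (illustrative matcher, duration, author and comment).
amtool --alertmanager.url="$AM_URL" silence add alertname=HostHighCpuLoad \
  --duration=2h --author="ops" --comment="planned maintenance"

# List the active silences to confirm that it was created.
amtool --alertmanager.url="$AM_URL" silence query
```

While the silence is active, matching alerts still show up in AlertManager but no notification is sent for them.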

AlertManager Installation

This configuration is based on the architecture described below:

The installation of Node Exporter, Prometheus and Grafana was covered in my previous article. So here, I’ll only cover the AlertManager installation and setup process :)


First, we need to download the latest binary of AlertManager from here.
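As a rough sketch, the download and extraction could look like the commands below. The version number, the linux-amd64 build and the /usr/local/bin install location are assumptions on my side, so check the releases page for the current version and adapt them to your machine.

```shell
# Assumption: version and platform; adjust to the latest release for your system.
VERSION="0.21.0"
ARCHIVE="alertmanager-${VERSION}.linux-amd64"

# Download and extract the release tarball.
wget "https://github.com/prometheus/alertmanager/releases/download/v${VERSION}/${ARCHIVE}.tar.gz"
tar xvzf "${ARCHIVE}.tar.gz"

# Copy the server and its CLI client somewhere on the PATH
# (assumed location: /usr/local/bin).
sudo cp "${ARCHIVE}/alertmanager" "${ARCHIVE}/amtool" /usr/local/bin/
```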

AlertManager Configuration

AlertManager uses a configuration file named alertmanager.yml. A sample of this file is included in the extracted directory, but it is of no use to us. That’s why we need to create our own alertmanager.yml.

Then put the following in it:
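Here is a minimal sketch of what such a file could look like for Slack and e-mail notifications. Every concrete value (SMTP host, addresses, password, Slack webhook URL, channel, receiver name, timings) is a placeholder I made up, so replace them with your own. The file is drafted in the current directory so that it can be checked before it is installed.

```shell
# Draft the configuration locally; all values below are placeholders.
cat > alertmanager.yml <<'EOF'
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager@example.com'
  smtp_auth_password: 'CHANGE_ME'

route:
  receiver: 'slack-and-mail'
  group_by: ['alertname', 'instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h

receivers:
  - name: 'slack-and-mail'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'
        channel: '#alerts'
        send_resolved: true
    email_configs:
      - to: 'oncall@example.com'
        send_resolved: true
EOF

# Validate the draft with amtool when it is installed.
if command -v amtool >/dev/null 2>&1; then
  amtool check-config alertmanager.yml
fi
```

Once the draft passes the check, copy it to the path that your AlertManager service reads its configuration from.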

To get more information about how to configure SMTP, check out this article.

Finally, we create the AlertManager systemd service :

Put the following :
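A unit file along these lines should work as a starting point. The alertmanager user, the binary location and the configuration and storage paths are assumptions; only the web.external-url value comes from this article. The unit is drafted locally and then meant to be copied to /etc/systemd/system/alertmanager.service.

```shell
# Draft the systemd unit locally.
# Assumptions: user/group "alertmanager", binary in /usr/local/bin,
# config in /etc/alertmanager, data in /var/lib/alertmanager.
cat > alertmanager.service <<'EOF'
[Unit]
Description=Prometheus AlertManager
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager \
  --web.external-url=http://x.x.x.x:9093
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
```

After reviewing it, install it with sudo cp alertmanager.service /etc/systemd/system/alertmanager.service.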

Using --web.external-url=http://x.x.x.x:9093 makes the links in the notifications point to the AlertManager web interface. x.x.x.x corresponds to the public IP of the Prometheus server.

Then reload the daemon and start the alertmanager service :
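On a systemd-based host, and assuming the unit is named alertmanager.service, this comes down to:

```shell
# Make systemd pick up the new unit file, then start the service.
sudo systemctl daemon-reload
sudo systemctl start alertmanager

# Optional: start AlertManager automatically at boot.
sudo systemctl enable alertmanager

# Check that the service is running.
sudo systemctl status alertmanager --no-pager
```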

Now you can open x.x.x.x:9093 in a browser and you should get the following:

Perfect ! We just finished configuring the AlertManager.


You can also install the Prometheus AlertManager plugin in Grafana. Head to the instance where Grafana is installed and install the plugin:

Once the plugin is installed, restart Grafana:
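The two steps above can be sketched as follows. I am assuming the plugin in question is the Camptocamp AlertManager datasource (the one that dashboard 8010 relies on) and that Grafana was installed from a package, so that its service is called grafana-server.

```shell
# Install the AlertManager datasource plugin
# (assumed plugin id: camptocamp-prometheus-alertmanager-datasource).
sudo grafana-cli plugins install camptocamp-prometheus-alertmanager-datasource

# Restart Grafana so that it loads the new plugin
# (assumed service name: grafana-server).
sudo systemctl restart grafana-server
```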

Open x.x.x.x:3030 and configure a Prometheus AlertManager datasource.

Then import the dashboard grafana.com/dashboards/8010.

You’ll end up with the following:

Now, let’s move on to the AlertManager integration with Prometheus :)


AlertManager Integrated to Prometheus

Based on the installation process described in my previous article, you can access Prometheus at x.x.x.x:9090. This is how it looks in a web browser:

Now, we need to configure the Prometheus server so it can talk to AlertManager service. We are going to set up an alert rule file which defines all rules needed to trigger an alert.

In /etc/prometheus/prometheus.yml, add the following:

This gives us the final /etc/prometheus/prometheus.yml file:
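The part to add is roughly the snippet below. It assumes that AlertManager runs on the same host as Prometheus (hence localhost:9093) and that the rules file sits next to prometheus.yml, since relative rule_files paths are resolved from the directory of the configuration file. The snippet is written to a scratch file so you can merge it by hand instead of overwriting your existing configuration.

```shell
# Write the two sections to merge into /etc/prometheus/prometheus.yml.
# Assumption: AlertManager listens on localhost:9093.
cat > prometheus-alerting.yml <<'EOF'
# Where Prometheus sends the alerts that fire
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'localhost:9093'

# Files containing the alert rules
rule_files:
  - 'alert.rules.yml'
EOF
```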

The Prometheus server tracks the incoming time series data. As soon as one of the rules defined in /etc/prometheus/alert.rules.yml is satisfied, an alert is sent to the AlertManager service, which notifies the client on Slack.

nano /etc/prometheus/alert.rules.yml
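A rule of this kind could look like the sketch below. The group name, the alert name, the severity label and the annotation texts are my own illustrative choices; the expression relies on the built-in up metric, which is 0 when a scrape target cannot be reached. The file is drafted locally and is meant to be copied to /etc/prometheus/alert.rules.yml afterwards.

```shell
# Draft the rules file locally. The quoted 'EOF' keeps the shell from
# expanding the {{ $labels... }} template variables.
cat > alert.rules.yml <<'EOF'
groups:
  - name: alert.rules
    rules:
      - alert: InstanceDown
        # "up" is 0 when Prometheus fails to scrape the target
        expr: up == 0
        # the condition must hold for 1 minute before the alert fires
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
EOF
```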

The above alert rule checks whether an instance is down. Prometheus triggers an alert if the instance stays down for more than 1 minute. We can check that the rules file is syntactically correct using the promtool tool.
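Assuming promtool is on the PATH (it ships with Prometheus), the check could be run like this:

```shell
# Validate the syntax of the alert rules.
promtool check rules /etc/prometheus/alert.rules.yml

# Optionally validate the main configuration too, including the rule files it references.
promtool check config /etc/prometheus/prometheus.yml
```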

You can find more interesting alerts in Awesome Prometheus alerts.

Restart services :
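Assuming both services run under systemd with the unit names prometheus and alertmanager, that would be:

```shell
# Restart Prometheus so that it loads the new rules and alerting configuration.
sudo systemctl restart prometheus

# Restart AlertManager as well.
sudo systemctl restart alertmanager
```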

Now, if I turn one of the target instances down (e.g. Target 1 in the architecture), I receive an alert in AlertManager and on Slack:

Slack Notification
AlertManager Notification x.x.x.x:9093

Prometheus Customized Alerts

Here, we are going to define a set of rules in order to be alerted if the CPU load, memory usage or disk usage exceeds a certain threshold, or if any of the supervised instances goes down.

Open the /etc/prometheus/alert.rules.yml file and put the following:
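Below is a sketch of such a rules file with four alerts, in the spirit of the Awesome Prometheus alerts collection. The thresholds other than the 80% CPU limit, the for durations, the severity labels, the filesystem filter and all alert names except HostHighCpuLoad are assumptions to adapt to your needs. The expressions use standard Node Exporter metrics.

```shell
# Draft the rules file locally, then copy it to /etc/prometheus/alert.rules.yml.
cat > alert.rules.yml <<'EOF'
groups:
  - name: host.rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"

      - alert: HostHighCpuLoad
        # 100 minus the idle percentage = CPU load percentage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU load on {{ $labels.instance }}"

      - alert: HostHighMemoryUsage
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"

      - alert: HostHighDiskUsage
        expr: (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Disk almost full on {{ $labels.instance }} ({{ $labels.mountpoint }})"
EOF
```

As before, it is worth running promtool on the file before restarting Prometheus.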

In the above configuration, we have defined 4 alerts. Each alert has its own rule defined in expr. The expr is made of a query (the left side) and a condition (the right side).

For example, the HostHighCpuLoad rule checks whether the CPU load percentage reported by the Node Exporter is greater than 80%.

To supervise the CPU, memory and disk usage percentages across your infrastructure, you can import this Grafana dashboard and link it to a Prometheus datasource.

You can also change the --web.external-url flag defined previously in /etc/systemd/system/alertmanager.service to --web.external-url=http://x.x.x.x:3000. That way, the links in the alerts received on Slack and by e-mail will take us directly to our customized Grafana dashboard.

You can use these queries if you want to customize your dashboard to show the CPU load, memory usage and disk usage.
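As an illustration, here are three PromQL expressions of that kind, based on standard Node Exporter metrics, together with a quick way to try them against the Prometheus HTTP API before pasting them into a Grafana panel. The 5m rate window and the filesystem filter are assumptions.

```shell
# Assumption: Prometheus is reachable at this address.
PROM_URL="http://x.x.x.x:9090"

# CPU load percentage per instance
CPU_QUERY='100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'

# Memory usage percentage per instance
MEM_QUERY='(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100'

# Disk usage percentage per filesystem
DISK_QUERY='(1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100'

# Run each query through the instant-query endpoint; the result is JSON.
for QUERY in "$CPU_QUERY" "$MEM_QUERY" "$DISK_QUERY"; do
  curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$QUERY"
  echo
done
```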

References :

Awesome Prometheus alerts

AlertManager Integration with Prometheus

Prometheus With AlertManager

Install Prometheus Alert Manager

AlertManager

How to check SSL certificate expiration with Alertmanager and Grafana
