Step-by-Step to a Seamless Monitoring Setup: Prometheus and Grafana: Part 2

Mathesh M E
10 min read · Jul 10, 2024


This article is a continuation of my previous article on setting up Prometheus. If you haven’t read that first part yet, I recommend reading it before continuing. In this article, we will see how to set up an alerting system in Prometheus that sends alert notifications to Slack.

Also, one tip: if you are trying all of this purely for learning, I recommend using your local machine instead of an AWS EC2 instance. EC2 instances are not free, and you will be charged for the time they run, so for learning purposes a local machine is the cheaper option.

In my previous article I used a t2.large instance to set up Prometheus and Grafana, but I still ran into occasional server crashes that I could not explain. For this article I switched to my local machine and have had no issues since. So again, for learning I recommend a local machine; if you are building something beyond a learning setup, choose whichever cloud provider or Linux machine suits you.

We already know how to set up Prometheus and Grafana and how to create a dashboard in Grafana. Now let’s see how to set up an alerting system in Prometheus that sends alerts to Slack.

I will attach the installation guide for Prometheus and Grafana on Windows and Linux machines at the end of this article. You can follow that guide to set up Prometheus and Grafana on your local machine.

Before setting up Alertmanager, let’s look at why we need alerts in Prometheus and what Alertmanager actually is.

Why are alerts needed?

Let’s say we have a website running on a server and we have configured Prometheus to monitor it, collecting metrics like CPU, memory, disk, network, and I/O. Now suppose the website goes down and we don’t notice. After some time our customers start calling to say they cannot access the site. That is not an ideal situation: if customers are unhappy with our service, they will move to another one. We need to know the website is down before our customers do, and that is exactly what alerts are for. With alerts set up, we get notified as soon as the website goes down.

Why do we need Alertmanager?

If we only define alerts in Prometheus, they are just raised and shown in the Prometheus web UI. To turn those alerts into real notifications we need Alertmanager. Alertmanager takes the alerts fired by Prometheus, formats them as notifications, and sends them to channels like email, Slack, PagerDuty, and so on.

What is Alertmanager?

Alertmanager is a component of the Prometheus ecosystem that receives the alerts fired by Prometheus, formats them as notifications, and sends them to notification channels such as email, Slack, and PagerDuty. Alertmanager comes with a web UI to manage the alerts and is exposed on port 9093 by default. Its behavior is configured in the alertmanager.yml file.

Now, Let’s see how to set up Alertmanager.

Steps:

Step 1: Setting up Alertmanager

Setting up Alertmanager on a Linux Machine for Development Purposes

For Linux (Ubuntu):

  • Get the Alertmanager package from the official releases page. Copy the link to the tar.gz file and download it with wget.
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
  • Extract the downloaded file.
tar -xvf alertmanager-0.27.0.linux-amd64.tar.gz
  • Change into the extracted directory.
cd alertmanager-0.27.0.linux-amd64
  • Run the alertmanager binary.
./alertmanager # To run on a different port, use the --web.listen-address flag, e.g. ./alertmanager --web.listen-address=":9094"
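
Once the binary is running, a quick way to confirm Alertmanager is up is its built-in health endpoint (assuming the default port 9093):

curl http://localhost:9093/-/healthy # An HTTP 200 response means Alertmanager is up and serving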

Setting up Alertmanager as a Service on a Linux Machine for Production Purposes:

If you are setting up Alertmanager for anything more serious than learning or testing, I recommend running it as a service. That way, Alertmanager starts automatically when the server restarts. Also, if we just execute the binary in a terminal, it is terminated as soon as we close that terminal, so running it as a service is the better option.

So instead of executing the binary directly, you can follow the steps below to set up Alertmanager as a service.

  • To keep the Alertmanager installation tidy, create a separate directory for it and move the extracted files there (run the mv command from the directory where you extracted the tarball).
sudo mkdir /var/lib/alertmanager
sudo mv alertmanager-0.27.0.linux-amd64/* /var/lib/alertmanager
  • Go to the alertmanager directory.
cd /var/lib/alertmanager
  • Now give the prometheus user (created in the previous article) ownership of and access to the alertmanager directory.
sudo chown -R prometheus:prometheus /var/lib/alertmanager
sudo chown -R prometheus:prometheus /var/lib/alertmanager/*
sudo chmod -R 775 /var/lib/alertmanager
sudo chmod -R 775 /var/lib/alertmanager/*
  • At this point we could run the binary directly, but as mentioned above it would terminate as soon as the terminal closes. So we will create a systemd service file and run Alertmanager as a service instead.
  • Alertmanager also needs a storage directory. Create a directory called data inside /var/lib/alertmanager and give the prometheus user ownership of it (the mkdir below runs as root, so the ownership has to be set again).
sudo mkdir /var/lib/alertmanager/data
sudo chown prometheus:prometheus /var/lib/alertmanager/data
  • Create a service file for Alertmanager.
sudo vi /etc/systemd/system/alertmanager.service
# Add the below content to the file
[Unit]
Description=Prometheus Alertmanager
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=prometheus
Group=prometheus
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/var/lib/alertmanager/alertmanager --storage.path="/var/lib/alertmanager/data" --config.file="/var/lib/alertmanager/alertmanager.yml"
SyslogIdentifier=prometheus_alert_manager
Restart=always
[Install]
WantedBy=multi-user.target
  • Reload the systemd daemon and start the Alertmanager service.
sudo systemctl daemon-reload
sudo systemctl start alertmanager
sudo systemctl enable alertmanager
  • To check the status of the Alertmanager service:
sudo systemctl status alertmanager
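
If the service does not come up cleanly, the systemd journal is usually the first place to look; a quick check, assuming the unit name alertmanager used above:

sudo journalctl -u alertmanager -f # Follow the Alertmanager logs live; press Ctrl+C to stop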

Now we can access the Alertmanager web UI at http://localhost:9093.

In case you want to set up Alertmanager using Docker, you can use the bitnami/alertmanager image.
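
A minimal sketch of running that image, only mapping the default port; the container ships with its own default config, and the exact path for mounting your own alertmanager.yml inside the image should be confirmed from the image documentation:

docker run -d --name alertmanager -p 9093:9093 bitnami/alertmanager:latest
# Check the image docs for its config file path before mounting your own alertmanager.yml with -v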

Step 2: Configuring Alerts

Now that Alertmanager is set up, let’s see how to configure the alerts in Prometheus.

Since we already have one target in Prometheus, we are going to set up a simple alert for that target.

  • Create a folder called rules in the location where the prometheus.yml file lives. In my case prometheus.yml is in /etc/prometheus/, so I create the rules folder there and then create a file called alert.yaml inside it.
sudo mkdir /etc/prometheus/rules
sudo vi /etc/prometheus/rules/alert.yaml
  • Copy the below content to the alert.yaml file.
groups:
  # Rule files must contain at least one group; the group name below is arbitrary.
  - name: target-alerts
    rules:
      - alert: PrometheusTargetMissing
        expr: up == 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: Prometheus target missing (instance {{ $labels.instance }})
          description: "A Prometheus target has disappeared. An exporter might be crashed.\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

Alert purpose: this rule checks whether each of our targets is up. If a target is down (up == 0), Prometheus fires the alert and sends it to Alertmanager with the details of which target went down.
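
Before wiring the file into Prometheus, you can check the rule syntax with promtool, which ships with the Prometheus tarball (adjust the path if your file lives elsewhere):

promtool check rules /etc/prometheus/rules/alert.yaml
# Prints the number of rules found, or the exact parse error if the YAML is invalid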

  • Now we have to reference the alert.yaml file in prometheus.yml so that Prometheus loads the rules and starts sending the resulting alerts to Alertmanager.
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration. Prometheus will use this to send alerts to Alertmanager.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'localhost:9093'

# Location of the alert rules.
rule_files:
  - "rules/alert.yaml"

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: 'application-server'
    static_configs:
      - targets: ['3.90.108.255:9100']
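
Before restarting Prometheus, it’s worth validating the whole configuration with promtool; it also tries to load the rule files the config references:

promtool check config /etc/prometheus/prometheus.yml
# Reports SUCCESS or points at the exact setting that is misconfigured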
  • Now restart Prometheus to apply the changes. If you are running it as a Docker container, restart the container using docker restart container_id; if you are running it as a service, restart the service.
sudo systemctl restart prometheus
  • Now check whether the alert rules were loaded. The rule appears in the Alerts tab of the Prometheus web UI, and once an alert fires it also shows up in the Alertmanager web UI.
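
You can also confirm from the command line that Prometheus has picked up the rule, using its HTTP API (assuming Prometheus on the default port 9090):

curl http://localhost:9090/api/v1/rules
# The JSON response should list the PrometheusTargetMissing rule under its rule group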

Step 3: Sending Alerts to Slack

We have set up Alertmanager; to send the alerts to Slack, we now have to configure a Slack receiver in Alertmanager. Let’s see how to do that.

  • Update the alertmanager.yml file to send the alerts to Slack. You can find alertmanager.yml in the Alertmanager installation directory (in our setup, /var/lib/alertmanager).
global:
  slack_api_url: 'https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX'

route:
  # The receiver below is the default receiver. If an alert doesn't match any of the receivers in the routes section, it is sent to the default receiver.
  receiver: 'slack-notifications'
  # To send alerts to different receivers based on different conditions, use the "routes" section.
  # routes:
  #   - match:
  #       severity: critical
  #     receiver: 'slack-notifications'
  #   - match:
  #       severity: warning
  #     receiver: 'slack-notifications'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - send_resolved: true
        channel: '#channel_name'
        icon_emoji: ':warning:'
  • I have only one receiver, slack-notifications, because I have only one Slack channel. If you have multiple Slack channels, you can define multiple receivers based on your requirements.
  • As you can see in the configuration above, we have to provide a slack_api_url. You get this URL by creating a Slack app with an incoming webhook:
  • Go to https://api.slack.com/apps and create a new app for your workspace.
  • In the app settings, enable Incoming Webhooks and add a new webhook, choosing the Slack channel the alerts should go to.
  • Slack will show you the webhook URL. Copy it and paste it into alertmanager.yml as the value of the slack_api_url parameter.
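
Before restarting, you can validate the edited file with amtool, the CLI that ships in the same tarball as the alertmanager binary:

/var/lib/alertmanager/amtool check-config /var/lib/alertmanager/alertmanager.yml
# Reports the receivers and templates it found, or the parse error if the YAML is invalid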
  • Now restart Alertmanager to apply the changes. If you are running it as a Docker container, restart the container using docker restart container_id; if you are just executing the binary, stop it and start it again; and if you set it up as a service as described in the installation part, restart the service.
sudo systemctl restart alertmanager
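
To confirm the Slack integration end to end without waiting for a real outage, you can push a throwaway test alert straight into Alertmanager with amtool; the alert name TestSlackAlert below is just an example:

/var/lib/alertmanager/amtool alert add TestSlackAlert severity=critical --alertmanager.url=http://localhost:9093
# The alert is routed to the default receiver, so a test notification should appear in the Slack channel shortly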
  • If you navigate to the Alerts tab in the Prometheus web UI, you can see the alert we defined in alert.yaml. It starts in the inactive state, shown in green. When the target goes down, the alert becomes active and Prometheus sends it to Alertmanager.

Testing the Alerts

Let’s test the alert by stopping the target. Since my Node Exporter is running on an AWS EC2 instance as a plain executable, I will stop it by killing the process. You can stop your target either by stopping its service or by killing the process, as shown below.
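
Either of these works on the target machine, depending on how Node Exporter was started (the service name node_exporter is an assumption; use whatever name you gave the unit):

sudo systemctl stop node_exporter # If Node Exporter runs as a systemd service
pkill node_exporter # If it was started directly as a binary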

Now navigate to the Prometheus web UI, open Status → Targets, and you will see the Node Exporter target reported as down. You can also check the firing alert in the Alerts tab.

Navigate to the Alertmanager web UI and you will see the alert listed there as well.
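
The same information is available from Alertmanager’s v2 API if you prefer the command line:

curl http://localhost:9093/api/v2/alerts
# Returns a JSON array of the alerts Alertmanager currently holds, including our PrometheusTargetMissing alert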

Now you can see the alert in the Slack channel as well.

That’s all about setting up an alerting system in Prometheus for sending alerts to Slack. If you have any doubts, please let me know in the comments.

The third part of this series, on securing the Prometheus server with basic authentication and SSL, will be released in the next day or two.

Installation Guides for Prometheus, Grafana and all other Prometheus Components for Windows and Linux can be found in this GitHub Repository: Click here

Configuration files Repository Link: Click here

👏 If you find this helpful, don’t forget to give claps and follow my profile. I’ll be sharing more projects and ideas about Cloud and DevOps. If you have any doubts, feel free to comment or message me on LinkedIn.

Let’s Connect on LinkedIn: LinkedIn Profile

Explore Hands-On Projects: My GitHub Account
