Create Monitoring System using (Grafana + Prometheus + Node_Exporter / WMI_Exporter) on RedHat Enterprise Linux Server. Pt.1

Jullfiqar
11 min read · Jan 23, 2024


Grafana Dashboard on RedHat Enterprise Linux

Prometheus:

Function: Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. It is highly optimized for capturing and querying time-series data. Prometheus works on a pull-based model: it regularly scrapes metrics from various targets, such as applications and services.

Key Features:
Time-series Database: Stores time-series data efficiently.
Multi-dimensional Data Model: Allows querying and slicing data using labels.
PromQL: A powerful query language for analyzing and alerting on metrics (see the example after this list).
Service Discovery: Can automatically discover and monitor new services.
Alerting: Supports alerting based on defined rules.
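To get a feel for PromQL, here is a minimal sketch (assuming a Prometheus server is already listening on localhost:9090 and scraping a node_exporter target, as set up later in this article). It asks the HTTP API for the average non-idle CPU percentage per instance over the last 5 minutes:

# PromQL: 100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))'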

Node Exporter:

Function: Node Exporter is a Prometheus exporter for system metrics. It collects various metrics about the host machine it runs on and makes them available for Prometheus to scrape. This includes information about CPU usage, memory, disk I/O, network statistics, and more.

Key Features:
Exposes System Metrics: Provides an endpoint for Prometheus to collect system-level metrics.
Simple Setup: Lightweight and easy to install on a variety of systems.
Standard Metrics: Offers a standard set of metrics for monitoring basic system health.
Compatible with Prometheus: Integrates seamlessly with Prometheus monitoring.
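Once node_exporter is running (installation is shown later in this article), a quick way to see the text exposition format it serves is to curl its /metrics endpoint. This is only a sketch; the values below are illustrative:

curl -s http://localhost:9100/metrics | grep -E '^node_(cpu_seconds_total|memory_MemAvailable_bytes)' | head -5
# node_cpu_seconds_total{cpu="0",mode="idle"} 151234.57
# node_memory_MemAvailable_bytes 8.254464e+09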

Grafana:

Function: Grafana is an open-source analytics and monitoring platform that integrates with various data sources, including Prometheus. It allows users to create interactive and customizable dashboards, visualizing data from different sources. Grafana is often used in conjunction with Prometheus to create rich and insightful visualizations.

Key Features:
Dashboard Creation: Enables the creation of interactive and visually appealing dashboards.
Data Source Integration: Supports a wide range of data sources, including Prometheus, InfluxDB, Graphite, and more.
Templating: Allows the creation of dynamic dashboards with variable placeholders.
Alerting: Provides alerting features based on defined thresholds.
Plugins and Extensions: Supports a plugin system for extending functionality.
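As a sketch of how Grafana and Prometheus connect (assuming the default admin/admin credentials and the ports used later in this article), a Prometheus data source can also be registered through Grafana's HTTP API instead of the UI:

curl -s -u admin:admin -H 'Content-Type: application/json' \
  -X POST http://localhost:3000/api/datasources \
  -d '{"name":"Prometheus","type":"prometheus","url":"http://localhost:9090","access":"proxy","isDefault":true}'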

IN SHORT:

- GRAFANA: the dashboard server; it turns Prometheus' text-based metrics into graphs.

- PROMETHEUS:
The monitoring server and metrics storage system that collects and stores metric data.

It receives metrics directly from targets or from exporters attached to the data sources you want to monitor.

It is queried with PromQL (Prometheus Query Language).

- METRICS FROM NODE_EXPORTER / WMI_EXPORTER:
Metric exporters installed on the node or machine you want to monitor. They collect operating-system metrics, including CPU usage, memory usage, and much more.

Node Exporter exposes these metrics in a text format that Prometheus can read.

GRAFANA INSTALLATION

Source : https://grafana.com/docs/grafana/latest/setup-grafana/installation/redhat-rhel-fedora/

1.Import the GPG key:

wget -q -O gpg.key https://rpm.grafana.com/gpg.key
sudo rpm --import gpg.key

2.Create grafana.repo

vi /etc/yum.repos.d/grafana.repo
[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
exclude=*beta*

3.To install Grafana Enterprise and start the service, run the following commands:

sudo dnf install grafana-enterprise
systemctl start grafana-server.service
systemctl enable grafana-server.service

4.(Optional) Create a systemd override file

This grants the Grafana service the CAP_NET_BIND_SERVICE capability so it can bind to privileged ports (below 1024); it is not needed for the default port 3000.

sudo systemctl edit grafana-server.service
# Alternatively, create the file /etc/systemd/system/grafana-server.service.d/override.conf manually

[Service]
# Give the CAP_NET_BIND_SERVICE capability
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
# A private user cannot have process capabilities on the host's user
# namespace and thus CAP_NET_BIND_SERVICE has no effect.
PrivateUsers=false

Then restart the service and check its status:

sudo systemctl restart grafana-server
systemctl status grafana-server.service # verify the status

[root@IDJKTLAB-LINUX ~]# systemctl status grafana-server.service
● grafana-server.service - Grafana instance
Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/grafana-server.service.d
└─override.conf
Active: active (running) since Sun 2024-01-21 10:38:32 WIB; 2 days ago
Docs: http://docs.grafana.org
Main PID: 3745678 (grafana)
Tasks: 25 (limit: 100447)
Memory: 70.7M
CGroup: /system.slice/grafana-server.service
└─3745678 /usr/share/grafana/bin/grafana server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --packaging=rpm cfg:default.paths.logs=>

5.Configure firewalld

If the firewall is active, allow port 3000:

firewall-cmd --zone=public --add-port=3000/tcp --permanent
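A permanent rule only takes effect after a reload, so if you keep firewalld enabled, follow it with:

firewall-cmd --reload
firewall-cmd --zone=public --list-ports # should now include 3000/tcp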

I prefer to disable the firewall instead:

systemctl stop firewalld
systemctl disable firewalld

6.Verify the installation:

Access via Browser
http://10.159.42.103:3000

default username & password :
username: admin
password: admin
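If you only have a terminal, a quick sanity check is Grafana's health endpoint (a sketch, assuming the default port 3000):

curl -s http://localhost:3000/api/health
# expect a small JSON body with "database": "ok"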

Restart Linux Machine

shutdown -h now # for shutdown
shutdown -r now # for restart

PROMETHEUS INSTALLATION

Source : https://prometheus.io/docs/introduction/first_steps/

1.Download Prometheus & Node_Exporter

Go to: https://prometheus.io/download/
right-click > copy link > paste it into your terminal session

wget https://github.com/prometheus/prometheus/releases/download/v2.49.1/prometheus-2.49.1.linux-amd64.tar.gz
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz

2.Extract the tar file

tar -zxvf prometheus-2.49.1.linux-amd64.tar.gz 
tar -zxvf node_exporter-1.7.0.linux-amd64.tar.gz

3.Verify the extracted files

[root@IDJKTLAB-LINUX ~]# ll -a
total 105144
dr-xr-x---. 9 root root 4096 Jan 22 23:57 .
dr-xr-xr-x. 17 root root 224 Jul 31 20:33 ..
-rw-------. 1 root root 1375 Jun 22 2023 anaconda-ks.cfg
-rw-------. 1 root root 19246 Jan 22 23:57 .bash_history
-rw-r--r--. 1 root root 18 Aug 13 2018 .bash_logout
-rw-r--r--. 1 root root 176 Aug 13 2018 .bash_profile
-rw-r--r--. 1 root root 176 Aug 13 2018 .bashrc
drwx------. 4 root root 30 Jul 20 2023 .cache
drwx------. 4 root root 32 Jan 16 16:35 .config
-rw-r--r--. 1 root root 100 Aug 13 2018 .cshrc
drwx------. 3 root root 25 Jun 22 2023 .dbus
-rw-r--r--. 1 root root 1820 Jun 22 2023 initial-setup-ks.cfg
drwxr-xr-x. 3 root root 19 Jun 22 2023 .local
drwxr-xr-x. 2 1001 1002 35 Jan 20 07:41 node_exporter-1.7.0.linux-amd64
-rw-r--r--. 1 root root 10419253 Nov 13 07:03 node_exporter-1.7.0.linux-amd64.tar.gz
drwxr-xr-x. 2 1001 127 35 Jan 17 21:24 prometheus-2.49.1.linux-amd64
-rw-r--r--. 1 root root 97181258 Jan 16 00:39 prometheus-2.49.1.linux-amd64.tar.gz
drwx------. 2 root root 25 Jun 22 2023 .ssh
-rw-r--r--. 1 root root 129 Aug 13 2018 .tcshrc
-rw-r--r--. 1 root root 165 Jan 17 21:10 .wget-hsts
-rw-------. 1 root root 82 Jul 25 15:07 .Xauthority
-rw-------. 1 root root 160 Jul 31 20:36 .xauthZjZKEf

4.Create user, group & ownership

Change the ownership of the /var/lib/prometheus/ directory to prometheus:prometheus,
because the default user Prometheus runs as is prometheus.

5.Verify

[root@IDJKTLAB-LINUX system]# ll -a /var/lib/ | grep prometheus
drwxr-xr-x. 2 root root 6 Jan 17 21:19 prometheus

# still root:root, change it to user prometheus, group prometheus

6.Change the ownership

groupadd --system prometheus
useradd --system -s /sbin/nologin -g prometheus prometheus
chown -R prometheus:prometheus /var/lib/prometheus/
# chown <user>:<group> <directory whose ownership should change>
# so the ownership of /var/lib/prometheus/ becomes prometheus:prometheus

7.Move some dir & file

  • need to move the prometheus & promtool to /usr/local/bin/
  • need to move dir console_libraries/ to /etc/prometheus/
  • need to move dir consoles/ to /etc/prometheus/
  • need to move .yml file prometheus.yml to /etc/prometheus/
cd prometheus-2.49.1.linux-amd64/
drwxr-xr-x. 4 1001 127 132 Jan 16 00:35 .
dr-xr-x---. 9 root root 4096 Jan 17 21:11 ..
drwxr-xr-x. 2 1001 127 38 Jan 16 00:32 console_libraries
drwxr-xr-x. 2 1001 127 173 Jan 16 00:32 consoles
-rw-r--r--. 1 1001 127 11357 Jan 16 00:32 LICENSE
-rw-r--r--. 1 1001 127 3773 Jan 16 00:32 NOTICE
-rwxr-xr-x. 1 1001 127 126534746 Jan 16 00:00 prometheus
-rw-r--r--. 1 1001 127 934 Jan 16 00:32 prometheus.yml
-rwxr-xr-x. 1 1001 127 120211598 Jan 16 00:00 promtool

8.Create a directory for the config store

mkdir /etc/prometheus/ # create the directory for config storage; configuration usually lives under /etc

9.Move the prometheus & promtool binaries

mv promtool prometheus /usr/local/bin # move the prometheus & promtool binaries to the /usr/local/bin/ directory

10. Move from /prometheus-2.49.1.linux-amd64/ to /etc/prometheus/

Move the Directory console_libraries/
Move the Directory consoles/
Move the .yml file

mv console_libraries/ consoles/ prometheus.yml /etc/prometheus/
# move the console_libraries/ and consoles/ directories and the prometheus.yml file to /etc/prometheus/

11.Create a directory for the data store

mkdir /var/lib/prometheus/ # create the directory for data storage; data usually lives under /var/lib, e.g. for databases

12.Configure prometheus.yml

vi /etc/prometheus/prometheus.yml

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
# alerting:
#   alertmanagers:
#     - static_configs:
#         - targets:
#           # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
# rule_files:
#   - "first_rules.yml"
#   - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]

12.1.Comment out what you don't need; in essence, only the lines below are required:

global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
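Prometheus ships promtool alongside the server binary, so it is worth validating the file before starting the service to catch YAML or indentation mistakes (a sketch, assuming the binary was moved to /usr/local/bin as above):

promtool check config /etc/prometheus/prometheus.yml
# expect: SUCCESS for the config file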

13.Create a service/system daemon on Linux

Unit files are stored in /etc/systemd/system/ by default.

vi /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file /etc/prometheus/prometheus.yml \
  --storage.tsdb.path /var/lib/prometheus/ \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries

[Install]
WantedBy=multi-user.target

After you create or change a service/system daemon unit, you need to reload the systemd daemon:

systemctl daemon-reload

14.Start the Prometheus service

systemctl enable --now prometheus.service
# enable --now : enable at boot and start the service now


# CHECK STATUS
systemctl status prometheus.service

[root@IDJKTLAB-LINUX ~]# systemctl status prometheus.service
● prometheus.service - Prometheus
Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2024-01-21 10:38:39 WIB; 2 days ago
Docs: https://prometheus.io/docs/introduction/overview/
Main PID: 3745721 (prometheus)
Tasks: 36 (limit: 100447)
Memory: 351.5M
CGroup: /system.slice/prometheus.service
└─3745721 /usr/local/bin/prometheus --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /var/lib/prometheus/ --web.console.templates=/etc/prometheus/conso

Verify the installation:

Direct access to http://10.159.42.103:9090 # port 9090 as defined in the .yml config
or
Direct access to http://localhost:9090 # port 9090 as defined in the .yml config

Verify the metrics endpoint: http://10.159.42.103:9090/metrics
or
Verify the metrics endpoint: http://localhost:9090/metrics
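From a terminal you can also ask Prometheus which targets it is scraping and whether they are healthy (a sketch against the HTTP API, assuming the addresses above):

curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[a-z]*"'
# expect "health":"up" for every configured target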

Prometheus Dashboard GUI

NODE_EXPORTER (LINUX) INSTALLATION

 cd node_exporter-1.7.0.linux-amd64/
[root@IDJKTLAB-LINUX node_exporter-1.7.0.linux-amd64]# ll -a
total 19480
drwxr-xr-x. 2 1001 1002 56 Nov 13 07:03 .
dr-xr-x---. 9 root root 4096 Jan 17 21:11 ..
-rw-r--r--. 1 1001 1002 11357 Nov 13 07:02 LICENSE
-rwxr-xr-x. 1 1001 1002 19925095 Nov 13 06:54 node_exporter
-rw-r--r--. 1 1001 1002 463 Nov 13 07:02 NOTICE

Like the prometheus binary, the node_exporter executable can be run directly, but we want it to run as a background process, so we create another systemd service for it.

Create a systemd service for node_exporter

1.Move the node_exporter executable to /usr/local/bin, as we did with prometheus before:

mv node_exporter /usr/local/bin/ 
#same like prometheus

2.Create the service / system daemon at
/etc/systemd/system/node-exporter.service

vi /etc/systemd/system/node-exporter.service
[Unit]
Description=Prometheus exporter for machine metrics

[Service]
Restart=always
User=prometheus
ExecStart=/usr/local/bin/node_exporter
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
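Before verifying, reload systemd and start the new unit, following the same pattern as the prometheus service earlier:

systemctl daemon-reload
systemctl enable --now node-exporter.service
systemctl status node-exporter.service # should show active (running)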

Verify:

lsof -n -i | grep node

The default node_exporter port is 9100.

Open 10.159.42.103:9100 (port 9100 by default)
to see the metrics from the client side (CPU, disk usage, etc.).

3. Go to the Prometheus server config .yml

Now, go back to the Prometheus server to add the node to the .yml file.

# Prometheus Server

vi /etc/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
# alerting:
#   alertmanagers:
#     - static_configs:
#         - targets:
#           # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
# rule_files:
#   - "first_rules.yml"
#   - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node-exporter"
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: "IDJKTVDI1010"
    static_configs:
      - targets: ["10.159.42.100:9100"]
  - job_name: "IDJKTLIC01"
    static_configs:
      - targets: ["10.159.40.161:9182"]

# EVERY TIME WE ADD A WINDOWS OR LINUX NODE FOR MONITORING, ALWAYS MAP THE NODE IN THE
# .YML FILE

# for example
# - job_name: "IDJKTVDI1010"
#   static_configs:
#     - targets: ["10.159.42.100:9100"]
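After saving the file, restart Prometheus and confirm that the new targets report as up; one way is the HTTP API used in the verification step above (a sketch):

systemctl restart prometheus.service
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up'
# each target should appear with value "1" (up); "0" means the scrape is failing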

IN THE STEPS ABOVE WE SCRAPED METRICS FROM THE SAME SERVER USING NODE_EXPORTER.

WHAT ABOUT MULTI-SERVER SETUPS (SCRAPING METRICS FROM DIFFERENT SERVERS TO DISPLAY)?

THE CLIENT MUST HAVE NODE_EXPORTER/WMI_EXPORTER INSTALLED.

CLIENT SIDE

JUST LIKE ON THE SERVER, WE TURN NODE_EXPORTER INTO A SERVICE BECAUSE IT RUNS AS A BACKGROUND PROCESS.

1. CREATE THE SAME USER & GROUP AS ON THE SERVER, BECAUSE WE PREVIOUSLY CREATED THE prometheus USER & GROUP

groupadd --system prometheus
useradd --system -s /sbin/nologin -g prometheus prometheus

2. CREATE SERVICE SYSTEM DAEMON (systemd)

vi /etc/systemd/system/node-exporter.service
[Unit]
Description=Prometheus exporter for machine metrics

[Service]
Restart=always
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/node_exporter
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no

[Install]
WantedBy=multi-user.target
:wq!
systemctl daemon-reload
systemctl enable --now node-exporter.service
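If firewalld is running on the client, the Prometheus server also needs to reach port 9100; a minimal sketch (assuming the public zone):

firewall-cmd --zone=public --add-port=9100/tcp --permanent
firewall-cmd --reload
curl -s http://localhost:9100/metrics | head -5 # quick local check that the exporter answers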

PROMETHEUS SERVER SIDE

vi /etc/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
# alerting:
#   alertmanagers:
#     - static_configs:
#         - targets:
#           # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
# rule_files:
#   - "first_rules.yml"
#   - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node-exporter"
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: "IDJKTVDI1010"
    static_configs:
      - targets: ["10.159.42.100:9100"]
  - job_name: "IDJKTLIC01"
    static_configs:
      - targets: ["10.159.40.161:9182"]

# TO ADD A CLIENT IP, JUST ADD 3 LINES: job_name:, static_configs:, and targets:

NOTE: AFTER CHANGING THE .YML FILE, ALWAYS RESTART THE SERVICES

systemctl restart prometheus.service    # on the Prometheus server, since it reads prometheus.yml
systemctl restart node-exporter.service # on both machines, if the exporter unit was changed

Verify on Prometheus Web GUI:

http://10.159.42.103:9090/

go to Status > Targets

Verify access to the metrics

http://10.159.42.103:9090/metrics

INSTALL NODE_EXPORTER ON WINDOWS (WMI_EXPORTER)

Code > Local > Download ZIP to get the whole project folder,

or go to the releases page link.

Download & install > windows_exporter-0.25.1-amd64.msi

Wait while the installation runs.
Start / restart the windows_exporter service.

Verify

http://localhost:9182 (the default port on Windows is 9182)
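From the Prometheus server you can also confirm the Windows host answers on that port (a sketch, using the Windows client IP configured below):

curl -s http://10.159.40.161:9182/metrics | head -5
# if this times out, check the Windows firewall for inbound TCP 9182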

Go to the Prometheus server again to add the node to the .yml file.

### SERVER SIDE

vi /etc/prometheus/prometheus.yml


# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
# alerting:
#   alertmanagers:
#     - static_configs:
#         - targets:
#           # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
# rule_files:
#   - "first_rules.yml"
#   - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node-exporter"
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: "IDJKTVDI1010"
    static_configs:
      - targets: ["10.159.42.100:9100"]
  - job_name: "IDJKTLIC01"
    static_configs:
      - targets: ["10.159.40.161:9182"]

# EVERY TIME WE ADD A WINDOWS OR LINUX NODE FOR MONITORING, ALWAYS MAP THE NODE IN THE
# .YML FILE

# for example
# - job_name: "IDJKTVDI1010"
#   static_configs:
#     - targets: ["10.159.42.100:9100"]

NOTE! = RESTART THE SERVICES ON BOTH MACHINES

on Windows:
Services > Windows Exporter > Restart # Node

on Linux:

systemctl restart node-exporter.service
systemctl restart prometheus.service

BONUS

DOWNLOAD A GRAFANA DASHBOARD TEMPLATE:

https://grafana.com/grafana/dashboards/
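For example, the widely used "Node Exporter Full" dashboard (ID 1860) pairs well with the node_exporter jobs configured above: open Grafana, go to Dashboards > Import (the exact menu path varies slightly between Grafana versions), enter the dashboard ID, and select your Prometheus data source.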


Jullfiqar

Fadhil Dzulfiqar - Passionate about staying up-to-date with the latest technologies & trends in this field, continuously seek to expand my knowledge & skills 💻