Monitoring Kubernetes with Prometheus and Grafana on Amazon EKS

Nov 17, 2024

1. Introduction

As organizations increasingly adopt Kubernetes for container orchestration, ensuring the health and performance of Kubernetes clusters is critical. Monitoring and visualization tools are essential for detecting and resolving issues quickly, optimizing performance, and ensuring that clusters are running efficiently. In this blog, we’ll walk through the process of installing Prometheus and Grafana on Amazon EKS (Elastic Kubernetes Service) for effective Kubernetes monitoring.

Prometheus is a powerful open-source monitoring and alerting toolkit designed for reliability and scalability. Grafana, on the other hand, provides rich visualizations for the data collected by Prometheus, giving teams real-time insights into the performance of their Kubernetes clusters.

2. Use Case

A typical use case for integrating Prometheus and Grafana in an EKS environment involves:

Cluster Monitoring: Keeping track of the health of your EKS nodes, pods, and overall infrastructure, including CPU, memory, disk, and network utilization.
Application Monitoring: Observing how your applications are performing, including tracking application-specific metrics such as response time, request count, and error rates.
Alerting: Setting up alerts based on predefined thresholds to proactively detect issues such as high CPU usage or low disk space.
Visualization: Using Grafana dashboards to visualize the health and performance of your Kubernetes resources with customizable charts and metrics.

This setup helps developers and operations teams take a proactive approach to ensure the reliability of their Kubernetes workloads.

3. Architecture Diagram

Below is a simplified architecture diagram of how Prometheus and Grafana integrate with your EKS cluster for monitoring:

Prometheus scrapes metrics from various sources (Kubernetes nodes, pods, services) using exporters and stores them as time-series data.
Grafana connects to Prometheus as a data source and visualizes the collected metrics in user-friendly dashboards.
Alert rules are defined in Prometheus, and Alertmanager notifies users when certain thresholds are breached.
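As a rough illustration of the alerting piece, the sketch below writes a PrometheusRule manifest for a high-CPU alert. It assumes the kube-prometheus-stack chart (which installs the PrometheusRule resource type) deployed as a release named prometheus in the monitoring namespace. The rule name, the 80% threshold, the 10-minute window, and the file name are illustrative choices, not values from this guide.

```shell
# Write a PrometheusRule manifest for a high-CPU alert.
# Rule name, threshold, and durations are illustrative assumptions.
cat > high-cpu-rule.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-high-cpu
  namespace: monitoring
  labels:
    release: prometheus   # assumed to match the Helm release name so the rule is picked up
spec:
  groups:
    - name: node.rules
      rules:
        - alert: NodeHighCPU
          # Percentage of non-idle CPU per node, averaged over 5 minutes
          expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "CPU usage above 80% on {{ $labels.instance }}"
EOF
echo "wrote high-cpu-rule.yaml"
```

Once the monitoring stack is installed, the manifest could be applied with kubectl apply -f high-cpu-rule.yaml. The quoted heredoc delimiter keeps the shell from expanding $labels, which Prometheus fills in when the alert fires.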

4. Prerequisites

Before you begin the installation process, ensure you have the following prerequisites:

Amazon EKS Cluster: An active EKS cluster set up and running.
kubectl: Command-line tool for interacting with the Kubernetes API (configured to work with your EKS cluster).
Helm: Package manager for Kubernetes applications. If Helm is not installed, follow the installation guide from Helm’s official website.
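A quick way to confirm the tooling prerequisites is to check that both commands are on your PATH. The helper below is a small sketch; the function name check_tools is made up for this example.

```shell
# Report any required CLI tools that are not on PATH.
check_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Print the missing tools (if any) without the leading space
  echo "${missing# }"
}

missing_tools=$(check_tools kubectl helm)
if [ -n "$missing_tools" ]; then
  echo "Missing tools: $missing_tools"
else
  echo "kubectl and helm are installed"
  # With both tools present, these confirm cluster access and versions:
  #   kubectl get nodes
  #   helm version
fi
```

If kubectl is installed but not yet pointed at your cluster, aws eks update-kubeconfig with your region and cluster name is the usual way to generate the kubeconfig entry.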

5. Step-by-Step Installation

Step 1: Add Helm Repositories for Prometheus and Grafana

Add the Helm chart repositories for Prometheus and Grafana to your system:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Step 3: Install Prometheus

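Install Prometheus on your EKS cluster using Helm. The kube-prometheus-stack chart installs Prometheus along with related components such as Alertmanager, node exporter, and kube-state-metrics. Note that this chart also bundles its own Grafana by default; add --set grafana.enabled=false to the command if you plan to install Grafana separately, as the next step does.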

helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace

This command creates the monitoring namespace and installs Prometheus in it.

Step 4: Install Grafana

Next, install Grafana to visualize the metrics collected by Prometheus:

 
$ helm install grafana grafana/grafana --namespace monitoring
NAME: grafana
LAST DEPLOYED: Sun Sep  8 11:19:33 2024
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
NOTES:
1. Get your 'admin' user password by running:

   kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:

   grafana.monitoring.svc.cluster.local

   Get the Grafana URL to visit by running these commands in the same shell:
     export POD_NAME=$(kubectl get pods --namespace monitoring -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=grafana" -o jsonpath="{.items[0].metadata.name}")
     kubectl --namespace monitoring port-forward $POD_NAME 3000

3. Login with the password from step 1 and the username: admin
#################################################################################
######   WARNING: Persistence is disabled!!! You will lose your data when   #####
######            the Grafana pod is terminated.                            #####
#################################################################################
$

#### List the pods in the monitoring namespace (pod names vary with the chart and release name) ####
$ kubectl get pods -n monitoring
NAME                                                READY   STATUS    RESTARTS   AGE
grafana-85dcc9bc6f-q2lrp                            1/1     Running   0          32s
prometheus-alertmanager-0                           1/1     Running   0          14m
prometheus-kube-state-metrics-74cdb59bff-h5gpb      1/1     Running   0          14m
prometheus-prometheus-node-exporter-6bnv4           1/1     Running   0          14m
prometheus-prometheus-node-exporter-8k9nd           1/1     Running   0          14m
prometheus-prometheus-pushgateway-66fc55f8d-g45bv   1/1     Running   0          14m
prometheus-server-dd484f8d9-m54zm                   2/2     Running   0          14m
$

#### By default, the Grafana chart creates a ClusterIP service ####
$ kubectl get svc -n monitoring
NAME                                  TYPE           CLUSTER-IP       EXTERNAL-IP                                                                     PORT(S)        AGE
grafana                               ClusterIP      XX.XXX.XXX.XXX   <none>                                                                          80/TCP         67m

#### To expose Grafana through an external load balancer, edit the existing service and change its type to LoadBalancer ####

$ kubectl edit svc grafana -n monitoring
# In the editor, under spec, change "type: ClusterIP" to:
type: LoadBalancer

$ kubectl get svc -n monitoring | grep grafana
grafana                               LoadBalancer   XX.XXX.XXX.XXX   ae1XXXXXXXX.ap-south-1.elb.amazonaws.com   80:32668/TCP   78m
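The same change can be made non-interactively with kubectl patch, which is easier to script than kubectl edit. The two helper functions below are a sketch; their names are hypothetical, and the defaults assume the service name (grafana) and namespace (monitoring) used in this guide.

```shell
# Switch the Grafana Service to type LoadBalancer without opening an editor.
# Arguments default to the service name and namespace used in this guide.
expose_grafana() {
  svc="${1:-grafana}"
  ns="${2:-monitoring}"
  kubectl patch svc "$svc" -n "$ns" -p '{"spec": {"type": "LoadBalancer"}}'
}

# Print the load balancer hostname once AWS has provisioned it.
grafana_hostname() {
  kubectl get svc "${1:-grafana}" -n "${2:-monitoring}" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}
```

Call expose_grafana, then grafana_hostname after a minute or so. Keep in mind that a manual edit or patch may be reverted by a later helm upgrade; setting service.type=LoadBalancer through the Grafana chart's values keeps the change with the release. A public load balancer also exposes the login page to the internet, so restrict access as appropriate.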

Step 5: Verify the Installation

Check that both Prometheus and Grafana are installed successfully by listing the pods in the monitoring namespace:

kubectl get pods -n monitoring
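Rather than re-running the listing until everything shows Running, you can block until the pods report Ready. This is a minimal sketch; the function name and the 300-second default timeout are arbitrary choices for this example.

```shell
# Wait until every pod in the namespace reports the Ready condition.
# Namespace and timeout default to values assumed for this guide.
wait_for_monitoring() {
  ns="${1:-monitoring}"
  timeout="${2:-300s}"
  kubectl wait --for=condition=Ready pods --all -n "$ns" --timeout="$timeout"
}
```

Running wait_for_monitoring returns a non-zero exit code if any pod is not Ready before the timeout. Pods that have already run to completion (for example from one-off jobs) never become Ready, so in that case a label selector would be a better fit than --all.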

Step 6: Access Grafana UI

Now, open your browser and use the LoadBalancer DNS name to access the Grafana UI. The default login credentials are:

Username: admin
Password: Run the command below to retrieve the initial admin password for Grafana:

$ kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
9abcXXXXXXXXXXXXXXTVq
$

The Grafana login page looks like the one below.

Once you have logged in, you will see the Grafana Home page, which looks like the one below.

Step 7: Add the Prometheus Data Source to Grafana

Log in to Grafana.
Navigate to Configuration > Data Sources (Connections > Data sources in newer Grafana versions) and click Add data source.
Select Prometheus, enter the in-cluster URL of your Prometheus service (e.g., http://prometheus:9090; run kubectl get svc -n monitoring to find the actual service name and port), and click Save & Test.
Create a new dashboard, or import an existing one, and choose Prometheus as the data source to start building queries.
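The data source can also be provisioned declaratively through the Grafana chart's values instead of the UI. The sketch below writes a values file for that purpose. The default URL is an assumption: it is the service name the kube-prometheus-stack chart typically creates for a release named prometheus, so confirm it with kubectl get svc -n monitoring and override PROM_URL if yours differs. The file name grafana-values.yaml is also just an example.

```shell
# In-cluster URL of the Prometheus service. This default is an assumption
# (kube-prometheus-stack installed as release "prometheus");
# confirm with: kubectl get svc -n monitoring
PROM_URL="${PROM_URL:-http://prometheus-kube-prometheus-prometheus.monitoring.svc.cluster.local:9090}"

# Write a Grafana chart values file that provisions Prometheus as the default data source.
cat > grafana-values.yaml <<EOF
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: ${PROM_URL}
        isDefault: true
EOF
echo "wrote grafana-values.yaml with data source URL ${PROM_URL}"
```

To apply it, pass the file to the existing release, for example helm upgrade grafana grafana/grafana --namespace monitoring -f grafana-values.yaml. Provisioning this way means the data source is recreated if the Grafana pod restarts without persistent storage.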

Step 8: Import Prometheus Dashboard to Grafana

Navigate to Grafana: Go to + > Import and either upload the JSON file or paste the JSON configuration for the Prometheus dashboard.
Configure and Import: Enter a name and select the Prometheus data source, then click Import to add the dashboard.

A sample dashboard looks like the one below:

Step 9: Set Up Persistent Storage

For production environments, it’s recommended to configure persistent volumes for Prometheus and Grafana to retain metrics and dashboards across restarts.

For Prometheus, you can upgrade the existing Helm release to request persistent storage (running helm install again would fail because the release name is already in use):

helm upgrade prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=8Gi
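Grafana can be handled in a similar way, which also addresses the "Persistence is disabled" warning printed at install time. The function below is a sketch that assumes the release name grafana used earlier; the function name and the 5Gi default size are arbitrary choices for this example.

```shell
# Enable a persistent volume for the Grafana release installed earlier.
# The 5Gi default is an illustrative size, not a recommendation.
enable_grafana_persistence() {
  size="${1:-5Gi}"
  helm upgrade grafana grafana/grafana \
    --namespace monitoring \
    --reuse-values \
    --set persistence.enabled=true \
    --set persistence.size="$size"
}
```

Call enable_grafana_persistence (optionally with a size such as 10Gi), then check the result with kubectl get pvc -n monitoring. On EKS, dynamically provisioned volumes generally depend on the Amazon EBS CSI driver being installed in the cluster; a claim stuck in Pending is often a sign that it is missing.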

6. Conclusion

Congratulations! You’ve successfully set up Prometheus and Grafana on your Amazon EKS cluster. With Prometheus collecting valuable metrics from your Kubernetes infrastructure and Grafana visualizing them, you can now monitor your EKS cluster in real time.

This monitoring setup provides insights into cluster performance, helps with troubleshooting, and enables you to set up alerts for proactive management. As you expand your monitoring setup, consider customizing Grafana dashboards, integrating with other services, and setting up more advanced Prometheus configurations like alerting and scraping additional metrics.