Bozobooks.com: Fullstack k8s application blog series
Observability: Log Aggregation with Loki & Grafana Alerts Integration with Slack
Chapter 10: Configuring Grafana Loki to aggregate all the logs across the Kubernetes cluster and set Grafana alerts to send notifications on Slack
Hey everyone, it’s been more than six months since I found some peaceful time to get back to my series. I have been getting a lot of great feedback and comments from people who have been reading my blogs. Thanks for all your feedback; that is what motivated me to finish this series. I’ve been super busy exploring (or lost in??:-D) the new AI world — GenAI, Neural Networks, Transformers… reskilling myself… :-D
In this blog, we will continue our journey of implementing the second key pillar of observability: logging. We will implement logging with Loki and integrate it with the Slack channel we created in Chapter 9, Observability Metrics: Prometheus, Grafana, Alert Manager & Slack Notifications.
Log management is a crucial aspect of any application or system monitoring strategy. Grafana Loki is an open-source log aggregation system that offers an efficient and cost-effective solution. With its distributed architecture and innovative log stream compression technique, Loki tackles the challenges of high-volume log data storage and retrieval. Seamlessly integrated with Grafana, Loki empowers users to visualize and explore their log data with ease. Whether you’re troubleshooting issues, tracking performance, or gaining insights, Grafana Loki proves to be a valuable tool for efficient log aggregation and analysis.
Here are the steps on how to implement logging with Loki and integrate it with our Slack channel:
- Install Loki with Helm.
- Configure Loki as a data source in Grafana.
- Configure Grafana alert rules on the Loki logs.
- Set up the Slack contact point and notification policy.
Once you have completed these steps, you will have successfully implemented log aggregation with Loki and integrated it with your Slack channel. You can now start collecting and visualizing your log data, and troubleshooting issues more effectively.
Let's start by installing and setting up the stack.
Step 1: Install Grafana Loki
Let's add the Grafana repo (if not added already; we already did this in Chapter 9) and run a repo update:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
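Optionally, we can quickly verify that the Loki chart is now visible from the repo (just a sanity check; the chart version shown will vary):
helm search repo grafana/loki-stack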
Let's set up the values for a simple deployment on the local machine. (For a production deployment, we would use a different, scalable configuration; refer to the Grafana documentation for more details on the various deployment modes.)
The following values.yaml shows the configuration for a simple deployment, followed by installing the Helm chart.
global:
  namespace: monitoring
loki:
  enabled: true
  persistence:
    enabled: true
    accessModes:
      - ReadWriteOnce
    size: 10Gi
    annotations: {}
promtail:
  enabled: true
  config:
    lokiAddress: http://loki-loki-distributed-gateway/loki/api/v1/push
helm install loki grafana/loki-stack -n loki --create-namespace --values=./values/loki-values.yaml
Here is the screenshot of the output
We can now check if all the pods and services are up and running
kubectl get pods -n loki
kubectl get svc -n loki
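If you want an extra sanity check beyond the pod status, you can port-forward the Loki service and hit its readiness endpoint. This is just a sketch; it assumes the chart created a service named loki listening on port 3100 (adjust the names if yours differ).
kubectl port-forward -n loki svc/loki 3100:3100
# in another terminal; Loki answers "ready" once it is up
curl http://localhost:3100/ready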
Before we proceed further, make sure that the Loki pods are all running without any errors. Let's now configure Grafana to show the Loki logs.
Step 2: Configure Loki in Grafana
We should be able to add Loki as a data source in Grafana. The following screenshot shows the typical configuration. Since I have installed Loki in the loki namespace, the URL to access the service is http://loki.loki.svc.cluster.local:3100. We can save and test.
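If you prefer to keep the data source in code rather than clicking through the UI, Grafana also supports file-based data source provisioning. Here is a minimal sketch; the file name and the way it is mounted into Grafana are assumptions that depend on how you deployed Grafana (for example, via the kube-prometheus-stack chart from Chapter 9).
# datasources/loki.yaml — Grafana data source provisioning file (sketch)
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    # Loki service installed in the loki namespace (same URL as in the UI above)
    url: http://loki.loki.svc.cluster.local:3100
    isDefault: false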
We can now create a panel in the dashboard. The following screenshot shows a typical query to check if Loki is working fine.
The following screenshot shows a query to look at the error-rate trend, using the query
rate({namespace="bozo-book-library-dev"} |~ `(?i)error` [1m])
We will also use this rate query to generate alerts. The following screenshot shows the number of errors (I generated some error conditions by adding the same book to the library again and again, which produces error logs).
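If you want to slice the errors differently, LogQL lets you aggregate the same stream selector in other ways. The queries below are illustrative sketches only; the app label is an assumption and depends on which labels Promtail attaches in your cluster.
# total error rate across the namespace, averaged over 5 minutes
sum(rate({namespace="bozo-book-library-dev"} |~ `(?i)error` [5m]))
# error rate broken down per application label
sum by (app) (rate({namespace="bozo-book-library-dev"} |~ `(?i)error` [1m]))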
Step 3: Configure Grafana Alerts
Before we get notifications on our Slack channel, we need to set up alerts and configure the rules for when they should fire. To configure the alerts, click on the Alert tab, and you will see three sections by default:
- Section A: Configure the rule in the form of a LogQL query. In our case, we will provide a query that checks the rate of errors over the last 1 minute.
rate({namespace="bozo-book-library-dev"} |~ `(?i)error` [1m])
- Section B: Helps configure the condition. In our case, we will use a classic condition to evaluate whether the average error rate goes above 2.
- Section C: Helps configure the threshold, which we will not set right now.
The screenshot below shows the configuration I used.
We can preview to see if the alert fires (to make it fire, I once again generated error logs by adding books that are already in the library). You can see in the screenshot below that the alert fires, as we have more than 2 errors in the last 1 minute.
Now we have the alert rules configured, and we have tested that the alert fires when the condition is satisfied. In the next step, we will configure the Slack contact point.
Step 4: Set up the Slack contact point
Grafana allows us to configure various types of contact points to publish alert notifications (including email, various chat systems, Alertmanager, etc.). To configure the contact point, go to Alerting -> Contact points and select the “+ Add Contact Point” button.
In our case, let's select Slack and provide the webhook URL that we configured in Chapter 9. The following screenshot shows the configuration.
Let's test if the contact point works by clicking the “Test” button. We should see the alert notification on our Slack channel. The screenshot below shows the test message on my Slack channel.
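For reference, contact points can also be provisioned declaratively with Grafana's alerting provisioning files instead of through the UI. The snippet below is only a sketch: the contact point name and UID are made-up examples, and the webhook URL is a placeholder for the one you created in Chapter 9.
# alerting/contact-points.yaml — Grafana alerting provisioning (sketch)
apiVersion: 1
contactPoints:
  - orgId: 1
    name: slack-bozo-alerts          # example name, pick your own
    receivers:
      - uid: slack-bozo-alerts
        type: slack
        settings:
          url: https://hooks.slack.com/services/XXXX/YYYY/ZZZZ  # your Chapter 9 webhook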
Now that we have the alerts configured and the Slack contact point created, we need a notification policy that routes alerts to our Slack contact point when they fire. To do that, we go back to the alert rule configuration and provide a label (name-value pair) under “Notifications” that the notification policy will use for matching. Here is the screenshot of the name-value pair configuration I used.
We will use this label in the next step to identify the alerts that should go to our Slack contact point.
Step 5: Set Notification Policy
To configure the notification policy, go to the “Alerting->Notification Policies” menu option. Select the “New Nested Policy” button, and provide the matching label and the contact point. In our case, we provide the name-value label we created in the alert configuration and select our Slack contact point. The following screenshot shows my configuration.
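Notification policies can also be captured declaratively with the same alerting provisioning mechanism. The sketch below is illustrative only: the team=bozo-books matcher and the slack-bozo-alerts receiver are hypothetical stand-ins for the label and contact point you actually configured.
# alerting/notification-policies.yaml — Grafana alerting provisioning (sketch)
apiVersion: 1
policies:
  - orgId: 1
    receiver: grafana-default-email   # default route
    routes:
      - receiver: slack-bozo-alerts   # our Slack contact point
        object_matchers:
          # route only alerts carrying this label to Slack
          - ["team", "=", "bozo-books"]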
You should now start seeing the alerts notified in the Slack channel. The following screenshot shows my Slack notifications.
As you can see, the alert does not display many details, such as the actual value or links to the Grafana dashboard. In the next step, we will fix that.
Step 6: Passing the exact alert values and URLs to the Slack notifications
To include specific values, we can use annotations and reference Grafana's template variables (such as {{ $labels }} and {{ $values }}) in the description. We can add these annotations in the alert rule configuration. Here is a screenshot of what I have configured to capture more details about the error.
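As an illustration, a summary/description annotation pair could look something like the following. This is a sketch only: the refID B must match the reduce/condition expression in your own rule, and the namespace label is only available because our LogQL query selects on it.
# example annotation values on the alert rule (sketch)
summary: High error rate in namespace {{ $labels.namespace }}
description: The error rate is {{ $values.B }} errors/second over the last minute. Check the Bozo Book Library logs in Grafana Explore.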
Refer to the Grafana documentation for more details on which annotations are available and how to parameterize custom values in custom annotations.
That's a quick walkthrough of how to get Loki working, visualize the logs on a Grafana dashboard, and set up alerts and notifications to Slack.
In the next chapter, we will move on to distributed tracing, which is another critical observability pillar.
I hope this was useful. Please leave your feedback and comments.
Take care…have fun ;-)