Loki: From Jsonnet to Helm

Amit Ben Ami
Published in Soluto by asurion · 4 min read · Apr 8, 2021

Since we started using Loki as our log aggregation system at Soluto, its Jsonnet deployment has been a major drawback. Every time we needed to read the Jsonnet code to understand it and its configuration, it took us forever and caused plenty of frustration.
When we saw that a Loki Helm Chart was publicly available, we knew we had to migrate to ease the pain and complexity of configuring and upgrading our precious Loki.

In this article, I will cover how to properly use Loki with the new Helm Chart, ready for production use.

Why Loki

Our production environment consists of tens to hundreds of microservices running on top of 2 Kubernetes clusters (for High Availability). At that scale, we needed a way to aggregate all of our logs into a single place, so our log monitoring would be easy to manage.

There are many log aggregation systems, and for a long time, we’ve been using ELK by Logz.io.
ELK is great, but its abilities come at a price: at a large scale, it becomes complex to maintain and expensive. But sometimes, developers simply need a centralized place to look for recent logs.
For that, Loki is perfect while keeping the overhead to a minimum.

Our Loki

We chose to deploy Loki's different components as separate microservices. We prefer it this way so we can have better control over each component separately. (There is also another mode, Monolithic Mode, which deploys all of Loki's components in a single running container.)
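To illustrate what this looks like in practice, here is a minimal sketch of a values file that scales the components independently. The keys follow the grafana/loki-distributed chart layout; the names, replica counts, and defaults here are assumptions and may differ between chart versions:

```yaml
# values.yaml (sketch): each Loki component runs and scales as its own deployment.
ingester:
  replicas: 3        # writes incoming log chunks; scale with ingestion volume
distributor:
  replicas: 2        # receives pushes from clients and fans them out to ingesters
querier:
  replicas: 2        # executes LogQL queries against the store
queryFrontend:
  replicas: 2        # queues and splits large queries for the queriers
gateway:
  enabled: true      # single entry point for clients such as Promtail and Grafana
```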

We had Loki deployed using Jsonnet, which was the only maintained option at the time we started working with it. This made it complex and difficult to maintain and configure properly.

We needed a better way for deploying and managing our Loki deployment, and then the Official Helm Chart was released.

Evolving our Workflow

We usually use Helm for our services and prefer it over Jsonnet, so the impact was immediate. After some digging into the new Chart and its values, we cut the number of configuration files needed to deploy Loki from 9 Jsonnet files down to only 3 Helm values files, with far fewer lines of code. Amazing!

You can check out the full example in my example repository.

We chose to work with Promtail as Loki’s client log shipper, but you can choose whatever client you want from the list.
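To give an idea of the shape of that setup, here is a minimal sketch of the Promtail values pointing it at Loki's push API. It assumes the grafana/promtail chart and a gateway service named loki-gateway; both are placeholders, and the exact keys vary between chart versions:

```yaml
# promtail-values.yaml (sketch): ship logs to Loki's HTTP push endpoint.
config:
  clients:
    - url: http://loki-gateway/loki/api/v1/push   # Loki's push API
```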

In that example, I also specified the whole Loki config, since the Chart doesn't offer a simple way to inject configuration values into specific areas; you have to override the whole config instead.
You can see the full configuration reference in the documentation.
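As an illustration of what that override looks like, here is a trimmed sketch of the values file. The loki.config key and its exact nesting are an assumption based on the chart layout and may differ in your chart version:

```yaml
# loki-values.yaml (sketch): the chart takes the entire Loki config as one block,
# so even a small change means carrying the whole config in your values file.
loki:
  config: |
    auth_enabled: false
    server:
      http_listen_port: 3100
    schema_config:
      configs:
        - from: 2021-04-01
          store: aws          # DynamoDB-backed index
          object_store: s3    # S3-backed chunks
          schema: v11
          index:
            prefix: loki_index_
            period: 24h
```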

Loki with GitOps

In the past year, we've been using GitOps with ArgoCD extensively, which gives us great visibility and control over our deployments and resources. After finishing the migration from Jsonnet, we deployed Loki through ArgoCD, and it looks awesome.
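For illustration, here is a minimal sketch of an ArgoCD Application that could drive such a deployment; the chart name, version, and values below are placeholders rather than our exact setup:

```yaml
# loki-app.yaml (sketch): let ArgoCD sync the Loki Helm chart into the cluster.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: loki
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://grafana.github.io/helm-charts
    chart: loki-distributed        # placeholder chart name
    targetRevision: 0.31.2         # placeholder chart version
    helm:
      values: |
        gateway:
          enabled: true
  destination:
    server: https://kubernetes.default.svc
    namespace: loki
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```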

[Screenshot: Loki resources in Production]

Loki-mixins

Now that Loki is deployed, we needed a way to monitor it properly and have dashboards. That is where the Loki-mixins came in handy.
Since the mixins are only available in Jsonnet and still aren't supported in Helm (not sure they ever will be), we decided to keep the mixins as Jsonnet deployments alongside the Loki Helm deployment itself, so we can enjoy the official dashboards without extra work.

Creating Loki logs storage automatically

Since we deploy our resources on AWS, we use DynamoDB and S3 buckets as Loki's backing stores (for the index and the chunks, respectively).
Loki doesn't create the S3 buckets automatically (as it does with DynamoDB tables, using its TableManager component), so we needed a way to make sure the specified S3 bucket exists before Loki starts receiving logs.
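Concretely, the relevant fragment of the Loki config looks roughly like this; the bucket name, region, and retention value are placeholders:

```yaml
# Fragment of the Loki config (sketch): S3 for chunks, DynamoDB for the index.
storage_config:
  aws:
    s3: s3://us-east-1/loki-chunks    # placeholder, in s3://<region>/<bucket> form
    dynamodb:
      dynamodb_url: dynamodb://us-east-1
table_manager:
  retention_deletes_enabled: true     # let the TableManager expire old index tables
  retention_period: 720h              # placeholder retention (30 days)
```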

For this purpose, Helm Hooks seemed like the right way to achieve it: a pre-install hook that creates the bucket every time we deploy Loki.
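Here is a minimal sketch of such a hook: a Kubernetes Job annotated so Helm runs it before installs and upgrades. The bucket name, region, image tag, and credentials handling are placeholders:

```yaml
# create-bucket-job.yaml (sketch): Helm runs this Job before Loki is (re)deployed.
apiVersion: batch/v1
kind: Job
metadata:
  name: loki-create-bucket
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: create-bucket
          image: amazon/aws-cli:2.4.5    # placeholder image tag
          command: ["sh", "-c"]
          # Create the bucket only if it doesn't already exist, so re-runs are harmless.
          args:
            - >
              aws s3api head-bucket --bucket "$BUCKET_NAME" 2>/dev/null ||
              aws s3 mb "s3://$BUCKET_NAME" --region "$AWS_REGION"
          env:
            - name: BUCKET_NAME
              value: loki-chunks         # placeholder bucket name
            - name: AWS_REGION
              value: us-east-1           # placeholder region
```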

The full code that creates the S3 bucket, along with the Kubernetes Job that runs it, can be seen in the example repository.

Summary

Loki is a great log aggregation system, but its simplicity and versatility, especially for Kubernetes services, were the most important parts for us compared to other solutions.
Using Helm instead of Jsonnet drastically reduced the complexity and maintenance burden, especially since we chose to host Loki on our own clusters instead of using an external service.

Next Steps

We still have a way to go to improve our Loki deployment and our developers' experience using it, such as:

  • Federation with Loki — the ability to query logs from multiple clusters at once
  • Improving our Loki performance for better query experience
  • Alerts based on logs from Loki

Once we achieve these goals, we will have a great logging solution for monitoring and watching our systems' and services' logs.

Happy logging! 😊
