Single-Deployment Observability on Kubernetes: Introducing Robusta for Kubernetes Monitoring
Introduction
I have spent some time in the observability space. The Grafana (LGTM) stack is one of the leading open-source projects in that space. There are also cloud contenders such as Datadog, Amazon CloudWatch, and New Relic. However, the Grafana stack takes center stage because it has massive contributions and is flexible enough to support scalable observability systems. These two articles show how to deploy a scalable metrics engine using Thanos (an elder brother to Prometheus, and similar to Mimir) and compare Thanos, Mimir, and AMP.
Effective and Ineffective Observability
As awesome as the Grafana stack is, how it is used determines how much value can be extracted from it. A Grafana dashboard with beautiful colors and graphs is not what developers are looking for. What they need is actionable insights and metrics about issues when they occur, which lead to quick resolution of incidents and therefore improve the MTTR (Mean Time to Recover) for all applications being monitored. This way of figuring out issues quickly is what I call “effective observability”. Ineffective observability, by contrast, is setting up tools and graphs and then finding, when there is an incident, that it is a herculean task to spot things and quickly figure out what is going on. Viewing logs, and looking for the exact…