Observe what you must!

Manisha Agrawal
Sep 27, 2023

For my annual preventive health checkup this year, I chose the package that offered the maximum number of tests, intending to learn everything there is to know about how I have been treating myself. When the results came out, I began reading the clinical advice and sensed symptoms of something severe brewing. Scared to the bone, I booked an appointment with my physician for consultation and treatment. I was expecting a pat on the back for my due diligence in catching the problem at an early stage. To my horror, he scolded me left and right!

While I was still wondering why, he explained that the package contained a lot of tests that are not suitable for my age and that I should not have opted for them. My fear was thus baseless. Worse, the flood of details obscured the parameters in my reports that were actually indicative. There was too much to process, and the values that really mattered had to be mined out of the rest.

Now, why am I ranting about this in a technical forum? I am a Solutions Architect for Monitoring & Observability, and surprisingly this is a relatable problem in the monitoring domain too. Without a proper focus on what data is really needed to monitor our systems, we end up collecting a lot of noise and garbage. The one party that surely benefits is the tool vendor (if the tool is licensed by data volume) or whoever provides your storage. Mining the events of value out of heaps of data takes more processing power, more time and more effort. It also requires people with data literacy.

We think that by collecting all there is to collect we are doing our due diligence, but are we really?

I strongly recommend reviewing your use cases and matching them against the data that you currently collect. If you already run mature monitoring implementations, you will be able to cut down on license costs. Sales will always push for collecting everything; listen to them, but act smart. For instance, infrastructure monitoring is very noisy because the data is emitted repeatedly at a fixed frequency for every server. For cloud and containers, think about keeping the infra monitoring local to the cloud provider itself and consuming just the alarms in a centralized system. For applications, instead of all logs, choose events where applicable; instead of all metrics, choose the ones that will really help establish the health status of the application components, and even build your own custom metrics; choose traces only where it makes sense to know the trail of events.
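Matching use cases against collected data can start as a very small script. The following is only a minimal sketch, assuming you can export the list of signals you currently collect and can write down which signals each use case needs. The `UseCase` class, the `audit_signals` function and every signal name in it are hypothetical, made up for illustration, and not tied to any particular monitoring tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """A monitoring use case and the signals it needs (hypothetical model)."""
    name: str
    required_signals: frozenset[str]


def audit_signals(use_cases: list[UseCase], collected: list[str]) -> dict[str, list[str]]:
    """Compare what is collected with what the use cases actually need."""
    needed: set[str] = set()
    for use_case in use_cases:
        needed |= use_case.required_signals

    collected_set = set(collected)
    return {
        # Collected and needed by at least one use case.
        "keep": sorted(collected_set & needed),
        # Collected but no use case asks for it: candidate noise.
        "review_for_removal": sorted(collected_set - needed),
        # Needed by a use case but not collected: a monitoring gap.
        "missing": sorted(needed - collected_set),
    }


# Illustrative data only; all names below are invented.
use_cases = [
    UseCase("checkout latency alert",
            frozenset({"http.request.duration", "http.error.count"})),
    UseCase("disk capacity planning",
            frozenset({"disk.used.percent"})),
]
collected = [
    "http.request.duration",
    "http.error.count",
    "cpu.idle.percent",
    "debug.log.lines",
]

report = audit_signals(use_cases, collected)
for category, signals in report.items():
    print(f"{category}: {signals}")
```

The `review_for_removal` list is a starting point for a conversation with the support teams, not an automatic delete list: a signal may be unused simply because its use case has not been written down yet.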

In short, create a focused data set that the skilled physicians (support teams) can work with!

“Where this post belongs” in the Observability journey
