Observability & monitoring — Part 01

Observability is a property of a system: it indicates whether the system’s internal states can be determined from its external outputs. Monitoring, on the other hand, is an activity we carry out to identify possible issues, estimate capacity, and so on.

Without monitoring, we cannot even be sure whether a service is working, so a thoughtfully designed monitoring infrastructure is very important. The following model, developed by Google engineers for building and running distributed systems, is based on Maslow’s hierarchy of needs.

Source : https://landing.google.com/sre/book/index.html
Without monitoring, you have no way to tell whether the service is even working; absent a thoughtfully designed monitoring infrastructure, you’re flying blind.

It is important to review which characteristics of the system need to be observed and which monitoring system will be used to observe them.

Observability has three pillars: metrics, tracing, and logging. The term monitoring usually refers to metrics monitoring.
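To make the three pillars concrete, here is a minimal sketch that emits all three signals from one request handler using only the Python standard library. The service name `checkout`, the `span` helper, the metric names, and the in-memory `metrics` and `spans` stores are all hypothetical stand-ins for illustration; a real system would send this data to dedicated metrics, tracing, and logging backends.

```python
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("checkout")  # hypothetical service name

metrics = Counter()  # metrics: cheap numeric aggregates (in-memory stand-in)
spans = []           # tracing: timed steps that share one trace id


@contextmanager
def span(trace_id, name):
    """Record how long a named step took within one request."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append({
            "trace_id": trace_id,
            "name": name,
            "duration_s": time.perf_counter() - start,
        })


def handle_request(order_id):
    """Handle one hypothetical request and emit all three signals."""
    trace_id = uuid.uuid4().hex
    with span(trace_id, "handle_request"):
        metrics["requests_total"] += 1
        try:
            if order_id < 0:
                raise ValueError("negative order id")
            with span(trace_id, "charge_card"):
                pass  # real work would go here
            return "ok"
        except ValueError as exc:
            metrics["errors_total"] += 1
            # logging: a discrete event with context, linked to the trace
            log.error("order %s failed: %s (trace=%s)", order_id, exc, trace_id)
            return "error"


handle_request(1)
handle_request(-1)
```

In this sketch the counters tell us how often something happens, the log line tells us what happened in one specific case, and the spans tell us where the time went inside a single request.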

In my experience, trying to build a monitoring system that identifies every possible failure is an impossible task. Instead, we should aim for a good monitoring system that identifies a failure when it happens and helps with post-mortem analysis. It should also detect severe anomalies so that we can prevent such failures where possible.

A monitoring system should address two questions: what’s broken, and why?
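One way to picture these two questions is a check that reports the symptom (what’s broken) separately from the likely causes (why). The sketch below is only an illustration under stated assumptions: the `Finding` class, the `evaluate` function, the snapshot fields, and the thresholds are all hypothetical and not taken from any particular monitoring tool.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    symptom: str                                 # what's broken
    causes: list = field(default_factory=list)   # why


def evaluate(snapshot, max_error_rate=0.05):
    """Return a Finding if the error rate is too high, else None."""
    error_rate = snapshot["errors"] / max(snapshot["requests"], 1)
    if error_rate <= max_error_rate:
        return None
    finding = Finding(
        symptom=f"error rate {error_rate:.0%} exceeds {max_error_rate:.0%}"
    )
    # Hypothetical cause checks; a real system would inspect its own dependencies.
    if not snapshot["db_reachable"]:
        finding.causes.append("database is unreachable")
    if snapshot["disk_free_pct"] < 10:
        finding.causes.append("disk is almost full")
    return finding


# Hypothetical snapshot of collected metrics
snapshot = {"requests": 200, "errors": 40, "db_reachable": False, "disk_free_pct": 55}
finding = evaluate(snapshot)
if finding:
    print("What's broken:", finding.symptom)
    print("Why:", ", ".join(finding.causes) or "unknown")
```

Keeping the symptom and the causes in separate fields means an alert can fire on the user-visible problem even when none of the known causes match.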

In summary, observability is a property of a system, and monitoring is an activity we perform on a system.

While observability covers a larger scope, monitoring usually refers to metrics monitoring.

I will discuss metrics monitoring in my next post. Stay tuned :)

Update :

Next post at https://medium.com/the-devops-journey/observability-monitoring-part-02-d4d81b67c09a