Understanding Modern Observability: Challenges, Tools, and Best Practices

Varun Subramanian
Published in CodeX · 5 min read · Mar 19, 2023


“The ability to measure the internal states of a system by examining its outputs.”

“Observability requires insight into metrics, traces, and logs — the three pillars.”

The history of computer architectures brings us to the widely used 3-Tier Architecture. This foundational layout is made up of three key components:

  • Presentation Layer: This layer houses web applications, built with HTML, CSS, and JS. It communicates with the other layers through an API, or Application Programming Interface.
  • Business Logic Layer: This core layer hosts the business logic, the rules that drive the main functions of an application. It typically runs on a Virtual Machine (VM) and performs the complex operations that power the application.
  • Data Management Layer: As the backbone of any application, this layer includes our databases. They can be hosted in-house or in the cloud. Common databases like MySQL, PostgreSQL, and Microsoft SQL Server manage data read/write access.
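The three layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for MySQL or PostgreSQL, and the `orders` table, the 10% tax rule, and all function names are hypothetical, not taken from any real application.

```python
import json
import sqlite3

# Data management layer: in-memory SQLite stands in for a real database server.
def create_database() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(20.0,), (35.5,)])
    conn.commit()
    return conn

# Business logic layer: applies a rule (a hypothetical tax rate) to stored data.
def order_total_with_tax(conn: sqlite3.Connection, tax_rate: float = 0.10) -> float:
    (subtotal,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders"
    ).fetchone()
    return round(subtotal * (1 + tax_rate), 2)

# API consumed by the presentation layer: returns JSON for a web front end to render.
def api_get_order_total(conn: sqlite3.Connection) -> str:
    return json.dumps({"total": order_total_with_tax(conn)})
```

Calling `api_get_order_total(create_database())` walks through all three layers in order. In a real deployment each layer would run as its own process or on its own host, which is exactly where the monitoring gaps described next start to appear.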

Within the 3-Tier Architecture model, a single machine can host around 20–30 applications. When problems arise in such a setup, the typical responses are:

  • Debugging on the spot in the production environment.
  • Operating without metrics, akin to navigating blind.
  • Creating War Rooms, where teams work intensely to solve issues.
  • And when all else fails, the good old RESTART.

Over time, as systems became more intricate, the 3-Tier architecture gave way to Multi-Tier architectures, which add more layers and services to handle growing application complexity. As the number of interacting systems grew, so did the failure rate, and this created the need for Observability: a way to monitor system health and performance across all of those systems.
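Simple arithmetic shows why more interacting systems tend to mean more failures. The sketch below assumes that a request needs every component on its path to be up and that components fail independently, which real systems only approximate. The 99.9% availability figure and the component counts are hypothetical values chosen for illustration.

```python
def serial_availability(availabilities: list[float]) -> float:
    """Availability of a request path that needs every listed component up.

    Assumes independent failures, so the individual values simply multiply.
    """
    result = 1.0
    for a in availabilities:
        result *= a
    return result

# Hypothetical numbers: each component is up 99.9% of the time.
three_tier = serial_availability([0.999] * 3)
multi_tier = serial_availability([0.999] * 20)

print(f"3 components:  {three_tier:.4f}")
print(f"20 components: {multi_tier:.4f}")
```

Under these assumptions, the path through three components is available roughly 99.7% of the time, while the path through twenty components drops to roughly 98.0%, even though no single component became less reliable.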

Although the responsibilities stayed the same from the old architecture to the modern ones, the added complexity introduced more CHAOS. The old tools, designed for simpler times, could not identify and diagnose issues in these complex systems.
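The three pillars named in the opening quote (metrics, traces, and logs) become more useful for diagnosis when they share an identifier. The sketch below uses only the Python standard library rather than any vendor agent. The service name `checkout`, the endpoint, and the `span` helper are all hypothetical, and the `time.sleep` call is a stand-in for a downstream request.

```python
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("checkout")  # hypothetical service name

request_count = Counter()  # metric: number of requests per endpoint
spans = []                 # trace: timed steps that share one trace ID

@contextmanager
def span(trace_id: str, name: str):
    """Record how long a named step took as part of the given trace."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append({
            "trace_id": trace_id,
            "name": name,
            "duration_s": time.perf_counter() - start,
        })

def handle_checkout() -> str:
    trace_id = uuid.uuid4().hex
    request_count["/checkout"] += 1                    # metric
    with span(trace_id, "checkout"):                   # outer trace span
        with span(trace_id, "query_inventory"):        # nested span
            time.sleep(0.01)                           # stand-in for a downstream call
        log.info("checkout ok trace_id=%s", trace_id)  # log tagged with the trace ID
    return trace_id
```

Because the log line and both spans carry the same `trace_id`, a slow or failed request can be followed from the counter to the trace to the exact log entry. Commercial and open-source tools automate this correlation across many services.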

Addressing the Problems:

To meet these challenges, vendors developed tools to tackle the issue of “Observability”. However, this influx of tools led to new complications: teams ended up with isolated solutions, the tools lacked proper integrations with one another, and running multiple agents led to redundancy and inefficiency. While striving for “Observability”, teams often fell short.

In their pursuit of Observability, companies ironically lost sight of their architecture’s Observability.

Ideal Modern Observability Tools:

We believe that the perfect modern Observability tools should:

  1. Monitor, debug, and manage hybrid & multi-cloud applications and infrastructure effectively. These tools need to provide a complete view of applications across various platforms and pinpoint issues quickly.
  2. Accomplish the first condition with minimal tools to reduce costs and maintain operational efficiency. Over-complication can add to the problem rather than solving it. The key is simplicity, effectiveness, and efficiency.

Identifying the Observability Stack:

The ideal Observability stack needs to fulfil several key conditions. These include:

  • Logs: Detailed records are essential for tracking and diagnosing issues.
  • Application Performance Monitoring (APM): This feature monitors and manages the performance and availability of software applications.
  • Real User Monitoring (RUM): Understanding the real user experience is key to improving product usability and satisfaction.
  • Metrics: These offer measurable indicators of system performance and health.
  • Synthetic Monitoring: This feature tests system functionality by simulating user behaviour.
  • Real-time service maps: These provide a visual overview of system interactions and dependencies.
  • SAML support: Security Assertion Markup Language (SAML) support is critical for secure authentication and authorisation.
  • Fine-grained permissions: These allow for precise access control, accommodating different roles within the organisation.
  • A large third-party library of extensions to support all technologies, particularly databases, message queues, and networks: This ensures compatibility and seamless integration with various technologies.
  • A beneficial license model for our environment: The licensing model should be cost-effective and flexible, fitting the specific needs of our environment.
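One simple way to apply the list above is to treat it as a checklist and see what a candidate stack leaves uncovered. The sketch below is an assumption about how such a check could look. The capability keys are made-up labels for the bullets above, the `candidate` set is hypothetical and says nothing about any real product, and the license model is left out because it is a judgment call rather than a yes/no feature.

```python
# Required capabilities, one label per technical bullet in the list above.
REQUIRED = {
    "logs", "apm", "rum", "metrics", "synthetic_monitoring",
    "service_maps", "saml", "fine_grained_permissions", "extensions",
}

def missing_capabilities(offered: set[str]) -> list[str]:
    """Return the required capabilities a candidate stack does not cover."""
    return sorted(REQUIRED - offered)

# Hypothetical candidate, not a statement about any real tool.
candidate = {"logs", "apm", "metrics", "saml", "extensions"}

print(missing_capabilities(candidate))
# ['fine_grained_permissions', 'rum', 'service_maps', 'synthetic_monitoring']
```

An empty result means a single candidate covers every technical condition, which supports the earlier goal of meeting the requirements with as few tools as possible.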

Overview of Modern Observability Tools

This section covers common observability tools in the market, along with their pros and cons.

1. Dynatrace

Pros:

  1. Extensive capabilities for monitoring application performance
  2. Robust AI features enabling automatic problem detection
  3. Commendable customer service and technical support

Cons:

  1. The cost may be prohibitive, particularly for smaller firms
  2. The complexity demands a certain level of training to use effectively
  3. Customizability and flexibility are somewhat limited

2. AppDynamics

Pros:

  1. Tried and tested solution for enterprise-level clientele
  2. Admirable support for a range of technologies
  3. Straightforward to understand and implement

Cons:

  1. The pricing, structured per host, can be steep
  2. The interface may be challenging for novices
  3. Customization options for alerts are somewhat limited

3. Datadog

Pros:

  1. Provides inclusive monitoring and analytics across systems, applications, and services
  2. User-friendly dashboards for data visualization
  3. Robust integration capabilities with a variety of tools and platforms

Cons:

  1. The cost can be substantial depending on the number of hosts
  2. Usage of advanced features may necessitate a certain level of technical proficiency
  3. Alert system can be overly sensitive and may require fine-tuning

4. Elastic Cloud

Pros:

  1. Scalable and adaptable, effectively managing increasing data volumes
  2. Robust capabilities for search and analytics
  3. Compatibility with a variety of tools and platforms

Cons:

  1. The setup and management can be complex
  2. The cost can be considerable for larger data volumes
  3. Limited customer support availability

More comprehensive information about each tool is available on its website.

Conclusions:

Ultimately, every modern observability tool on the market has its strengths and weaknesses. Engineering decisions often involve choosing between these tools based on their trade-offs. However, in my opinion, the decision should be primarily driven by the long-term goals of your organization and how they align with your chosen tool.

The market is also so competitive that the differences between tools are diminishing, and new features are being released rapidly. At this rate, nearly all providers will soon offer a comprehensive Full Stack Observability solution.

References:

  1. New Relic. (2021). What is Observability? Retrieved from https://newrelic.com/observability
  2. Dynatrace. (2021). Observability vs Monitoring. Retrieved from https://www.dynatrace.com/platform/observability-vs-monitoring/
  3. AppDynamics. (2021). What is Observability? Retrieved from https://www.appdynamics.com/observability/
  4. Datadog. (2021). Monitoring vs. observability. Retrieved from https://www.datadoghq.com/monitoring-guide/monitoring-vs-observability/
  5. Elastic Cloud. (2021). What is observability? Retrieved from https://www.elastic.co/observability
