Ensuring System Integrity: The Right Path to Health Checks in Microservice Architecture

Have you ever wondered about the right approach for health checks in your application? Let’s explore an interesting perspective!

Published in Cloud Native Daily · 4 min read · Jun 21, 2023

Imagine a microservice architecture with multiple services running in Kubernetes and having dependencies on each other.

🏢 In this oversimplified example, the services are directly dependent on one another, rather than being indirectly connected via a message bus or broker. While it may not be ideal, it’s not uncommon.

Let’s dive into the scenario:

📡 Each service has a liveness probe that verifies its connections to the services it depends on. 💥 Now, picture a temporary blip in the network connection between “Service X” and the database, causing a 30-second loss of connectivity before it’s restored.

Here’s the twist:

⚠️ Within those 30 seconds, the liveness probe of Service X fails, triggering Kubernetes to restart the application. 🔄 Consequently, Service Y fails to connect to Service X, so its own liveness probe fails and it is restarted too. The pattern continues, causing cascading failures across services, even though most of them don’t depend directly on the component that actually failed (the database).
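
To make the chain reaction concrete, here is a small simulation sketch. The dependency graph and the service names are hypothetical, and the model is deliberately crude: a service counts as unavailable while it restarts, and nothing recovers within the simulated window.

```python
# Hypothetical dependency graph: each service lists what its "smart"
# liveness probe checks. The names are illustrative only.
DEPENDS_ON = {
    "database": [],
    "service-x": ["database"],
    "service-y": ["service-x"],
    "service-z": ["service-y"],
}


def simulate_cascade(depends_on, initially_down, max_rounds=10):
    """Return the set of services restarted because a dependency looked down."""
    unavailable = set(initially_down)
    restarted = set()
    for _ in range(max_rounds):
        newly_failed = {
            name
            for name, deps in depends_on.items()
            if name not in unavailable and any(d in unavailable for d in deps)
        }
        if not newly_failed:
            break
        restarted |= newly_failed
        # A restarting service is unavailable to its own callers.
        unavailable |= newly_failed
    return restarted


restarted = simulate_cascade(DEPENDS_ON, {"database"})
print(sorted(restarted))  # ['service-x', 'service-y', 'service-z']
```

Only Service X talks to the database in this sketch, yet all three services end up restarted.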

🎯 There are two different approaches to designing health probes:

1️⃣ Smart probes aim to verify that the application works correctly, that it can handle requests, and that it can connect to dependencies like databases or message queues.

2️⃣ Dumb health checks indicate only whether the application has crashed. They focus on basic requirements, such as responding to an HTTP request, without checking dependency connections.
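
The difference between the two styles can be sketched in a few lines of Python. The function names and the `checks` mapping are assumptions made for illustration; in a real service each callable would ping an actual dependency.

```python
def dumb_check():
    """Passes whenever the process is able to answer at all."""
    return True


def smart_check(dependency_checks):
    """Pass only if every dependency check passes.

    dependency_checks maps a name to a zero-argument callable that returns
    True when that dependency is reachable.
    """
    failed = []
    for name, check in dependency_checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # an erroring check counts as a failed dependency
        if not ok:
            failed.append(name)
    return len(failed) == 0, failed


# Hypothetical dependencies: the database is down, the queue is fine.
checks = {"database": lambda: False, "message-queue": lambda: True}
print(dumb_check())         # True
print(smart_check(checks))  # (False, ['database'])
```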

💡 Striking the right balance:

Here’s the approach I prefer:

✅ Dumb liveness checks: Focus on determining whether the application is alive. Think of it as a “restart me now” flag. If restarting the app can fix a failing check, that check belongs in the liveness probe. For example, if Kestrel can handle requests, the health check should pass.
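
As a rough sketch of such a dumb liveness endpoint, here is a version built on Python’s standard library rather than Kestrel. The path `/healthz/live`, the default port, and the `make_server` helper are assumptions chosen for the example. The handler touches no dependency: if the process can answer HTTP, the check passes.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

LIVENESS_PATH = "/healthz/live"  # hypothetical path, pick your own


class LivenessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == LIVENESS_PATH:
            # No database, no downstream calls: answering is the whole check.
            body = b"alive"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep frequent probe traffic out of the logs


def make_server(host="0.0.0.0", port=8080):
    """Build (but do not start) an HTTP server exposing the liveness path."""
    return ThreadingHTTPServer((host, port), LivenessHandler)
```

Calling `make_server().serve_forever()` starts serving; a liveness probe would then be pointed at the chosen path and port.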

✅ Smart startup checks: During startup, perform due diligence for the application. Validate database or message bus connections and ensure the app’s configuration is valid. Startup is the best time for these checks, as configuration errors are common during deployment in Kubernetes.
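
A startup check along these lines might look like the sketch below. The setting names `DATABASE_HOST` and `DATABASE_PORT` are made up for the example, and a plain TCP connection stands in for a real database handshake. The idea is that the startup probe passes only once `run_startup_checks` returns an empty list.

```python
import socket

# Hypothetical setting names, used only for this sketch.
REQUIRED_SETTINGS = ("DATABASE_HOST", "DATABASE_PORT")


def validate_config(config):
    """Return a list of configuration problems (empty if the config is valid)."""
    problems = [
        f"missing setting: {key}" for key in REQUIRED_SETTINGS if not config.get(key)
    ]
    port = config.get("DATABASE_PORT")
    if port and not str(port).isdigit():
        problems.append("DATABASE_PORT must be a number")
    return problems


def can_connect(host, port, timeout=2.0):
    """True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_startup_checks(config, connect=can_connect):
    """Run once during startup; an empty result means the app may start."""
    problems = validate_config(config)
    if problems:
        return problems  # no point dialing a dependency with a broken config
    host, port = config["DATABASE_HOST"], int(config["DATABASE_PORT"])
    if not connect(host, port):
        problems.append(f"cannot reach database at {host}:{port}")
    return problems
```

Passing `connect` as a parameter keeps the check easy to exercise without a real database.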

🚦 Regarding readiness checks, it’s a bit more complex. In most cases, I struggle to find scenarios where the application is alive and handling requests (as checked by liveness probes), has completed startup checks (as verified by startup probes), but shouldn’t receive traffic (as indicated by readiness probes). One possible situation could be an overloaded app that needs time to process requests, but it’s not something I’ve encountered often.

⚠️ Checking dependencies in readiness probes can lead to cascading failures. Likewise, taking apps out of circulation based on CPU utilization or RPS can be fragile. Moreover, readiness probes run throughout the application’s lifetime, so they shouldn’t add unnecessary load to the app itself.
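
If you do keep a readiness probe, one way to keep it cheap is to have it read an in-memory flag instead of doing any I/O. The `ReadinessFlag` class below is a hypothetical sketch; when the application flips the flag is left open here, since that depends on your own scenario.

```python
import threading


class ReadinessFlag:
    """In-memory readiness state; the probe handler reads it without any I/O."""

    def __init__(self):
        self._ready = threading.Event()

    def mark_ready(self):
        self._ready.set()

    def mark_not_ready(self):
        self._ready.clear()

    def probe_status(self):
        """HTTP status a readiness endpoint would return for the current state."""
        return 200 if self._ready.is_set() else 503


flag = ReadinessFlag()
print(flag.probe_status())  # 503
flag.mark_ready()
print(flag.probe_status())  # 200
```

Because the probe only reads a flag, running it every few seconds for the lifetime of the app costs practically nothing.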

              +-------------------+
              |  Kubernetes Pod   |
              +-------------------+
              |                   |
              |    Containers     |
              |    +---------+    |
              |    |         |    |
              |    |  App    |    |
              |    |         |    |
              |    +---------+    |
              |                   |
              +-------------------+
                        |
                        |
                        v
              +-------------------+
              |   Health Checks   |
              +-------------------+
              |                   |
              |   Liveness Probe  |
              |   +-------------+ |
              |   |             | |
              |   |    App      | |
              |   |             | |
              |   +-------------+ |
              |                   |
              |   Readiness Probe |
              |   +-------------+ |
              |   |             | |
              |   |    App      | |
              |   |             | |
              |   +-------------+ |
              |                   |
              |   Startup Probe   |
              |   +-------------+ |
              |   |             | |
              |   |    App      | |
              |   |             | |
              |   +-------------+ |
              |                   |
              +-------------------+

In a nutshell:

Liveness Probes: Keep it simple and focus on determining if your application is alive and can handle requests. This helps detect crashes and trigger necessary restarts.
Startup Probes: Take a smarter approach during initialization. Verify essential dependencies like database connections and configuration validity. Thorough checks at startup prevent configuration errors from causing issues later on.
Readiness Probes: Here’s the tricky part. Continuous checks throughout the application’s lifespan ensure readiness to receive traffic. Avoid dependency checks to prevent problems from spreading. The need for readiness checks varies, so I’d love to hear your experiences!
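
To tie the three probes together, here is a sketch that builds the probe section of a container spec as a Python dictionary and prints it as JSON. The field names (`httpGet`, `periodSeconds`, `failureThreshold`) are the ones Kubernetes uses; the paths, port, and numbers are assumptions picked for illustration, not recommendations.

```python
import json


def http_probe(path, period_seconds, failure_threshold, port=8080):
    """Build an HTTP probe definition using Kubernetes probe field names."""
    return {
        "httpGet": {"path": path, "port": port},
        "periodSeconds": period_seconds,
        "failureThreshold": failure_threshold,
    }


# Hypothetical endpoints: a dumb liveness path, a smart startup path,
# and a cheap readiness path.
container_probes = {
    "livenessProbe": http_probe("/healthz/live", 10, 3),
    "startupProbe": http_probe("/healthz/startup", 5, 30),
    "readinessProbe": http_probe("/healthz/ready", 10, 3),
}

print(json.dumps(container_probes, indent=2))
```

With these example numbers the startup probe gives the app up to 30 × 5 = 150 seconds to finish its checks before the container is restarted.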

By using simple liveness checks and comprehensive startup probes, we strike a balance between failure detection and preventing cascading issues in microservice architectures.

🤝 I’m curious to hear your thoughts on uses for readiness checks beyond what we have discussed above!

Reference:

Configure Liveness, Readiness and Startup Probes

This page shows how to configure liveness, readiness and startup probes for containers. The kubelet uses liveness…

kubernetes.io

Further Reading:

Microservices Monitoring: Cutting Engineering Costs and Saving Time

A few ways fort leveraging Helios to save on engineering costs and dev time for a more resource-efficient organization…

gethelios.dev

Testing Microservices - Trace Based Integration Testing Example

Microservices architectures require a new type of testing. Here's why traditional testing fail and the new automated…

gethelios.dev

OpenTelemetry Tracing: Everything you need to know

OpenTelemetry tracing is filling the gaps of traditional observability methods in microservices apps. Here's how it's…

gethelios.dev

OpenTelemetry: A full guide

Learn all about OpenTelemetry OpenSource and how it transforms microservices observability and troubleshooting

gethelios.dev

Kubernetes Monitoring with OpenTelemetry

Learn how to monitor Kubernetes using OpenTelemetry with real-time visibility and granular error data - Reduce MTTR by…

gethelios.dev

7 Best Tracing Tools for Microservices

Decide The Best Tracing Tools For Your Microservices Architecture

medium.com

Scaling Microservices: A Comprehensive Guide

medium.com

Written by Akash Jaiswal