Akka Actor System Health Check

Eren Avşaroğulları
Aug 4 · 3 min read

Most applications have multiple internal and external dependencies, and their readiness and liveness statuses matter for service quality and SLAs.

When the Akka actor model is used, the ActorSystem is also one of these dependencies, and its lifecycle may need to be checked just like the others.

This article shows how to define an ActorSystem health check that tracks the system’s status at runtime from a separate, isolated ActorSystem. Let’s look at a sample implementation:

Used Technologies

1. JDK v1.8

2. Scala v2.13

3. Akka v2.5.23

Setup

1- We can define two ActorSystems.

app-actor-system: used for the service’s computation layer.

app-health-actor-system: a dedicated ActorSystem that tracks the health checks of all service dependencies (e.g. the service itself, the database, Zookeeper, other ActorSystems).
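As a minimal sketch of this step, the two systems could be created as below. The `ActorSystems` holder object and its field names are illustrative assumptions; only the two system names come from the article.

```scala
import akka.actor.ActorSystem

// Hypothetical holder object; only the two system names come from the article.
object ActorSystems {
  // Hosts the actors of the service's computation layer.
  val appActorSystem: ActorSystem = ActorSystem("app-actor-system")

  // Isolated system dedicated to health checks, so that it can keep
  // reporting even after app-actor-system has been shut down.
  val healthActorSystem: ActorSystem = ActorSystem("app-health-actor-system")
}
```

Keeping the health system separate means a fatal error that terminates app-actor-system does not take the health reporting down with it.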

2- A HeartbeatActor running on app-actor-system can be defined. This actor is used to decide whether app-actor-system is alive.
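A heartbeat actor of this kind might look like the sketch below. Modelling `HeartbeatRequest` and `HeartbeatResponse` as case objects, and the `props` factory, are assumptions made for illustration.

```scala
import akka.actor.{Actor, Props}

// Assumed message protocol: the article names the messages but not their shape.
case object HeartbeatRequest
case object HeartbeatResponse

// Replies to every heartbeat request; a missing reply suggests the
// hosting ActorSystem is no longer processing messages.
class HeartbeatActor extends Actor {
  override def receive: Receive = {
    case HeartbeatRequest => sender() ! HeartbeatResponse
  }
}

object HeartbeatActor {
  // Hypothetical factory for creating the actor.
  def props: Props = Props(new HeartbeatActor)
}
```

The actor would be started on app-actor-system, for example with `system.actorOf(HeartbeatActor.props, "heartbeat-actor")`, where the actor name is again only an example.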

3- A StackOverflowErrorActor running on app-actor-system can be defined. This actor simulates a fatal error on app-actor-system: it waits for a TriggerStackOverflowError message and then throws a StackOverflowError.
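One possible shape for this actor is sketched below. Treating `TriggerStackOverflowError` as a case object and throwing the error directly (rather than through real unbounded recursion) are assumptions chosen to keep the example short.

```scala
import akka.actor.{Actor, Props}

// Assumed trigger message; the article only gives its name.
case object TriggerStackOverflowError

// Simulates a fatal error. StackOverflowError is a VirtualMachineError,
// so Akka treats it as fatal rather than handing it to supervision.
class StackOverflowErrorActor extends Actor {
  override def receive: Receive = {
    case TriggerStackOverflowError =>
      throw new StackOverflowError("Simulated fatal error")
  }
}

object StackOverflowErrorActor {
  // Hypothetical factory for creating the actor.
  def props: Props = Props(new StackOverflowErrorActor)
}
```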

4- HealthStatusType defines all expected statuses of app-actor-system: UP, UNKNOWN and DOWN.
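The statuses could be expressed as a plain Scala `Enumeration`, as in this sketch. The choice of `Enumeration` over a sealed trait is an assumption; either would serve.

```scala
// The three statuses named in the article, modelled as an Enumeration.
object HealthStatusType extends Enumeration {
  type HealthStatusType = Value
  val UP, UNKNOWN, DOWN = Value
}
```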

5- ActorSystemHealthChecker checks the status of app-actor-system by sending a HeartbeatRequest to HeartbeatActor and expecting a HeartbeatResponse within a timeout (5 seconds). If the number of consecutive akka.pattern.AskTimeoutException failures reaches FailureThreshold (3), it sets the app-actor-system status to DOWN; otherwise the status is UP or UNKNOWN, depending on the failure count relative to FailureThreshold. FailureThreshold can also be defined as a property so that it can be managed through a property file if needed.
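The check described above can be sketched roughly as follows. This is not the article's original implementation: the blocking `check()` method, the constructor parameters, and the rule that a failure below the threshold reports UNKNOWN are assumptions. The message and status definitions are repeated so the sketch stands alone.

```scala
import akka.actor.ActorRef
import akka.pattern.{ask, AskTimeoutException}
import akka.util.Timeout

import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

// Assumed protocol and statuses, redefined here so the sketch is self-contained.
case object HeartbeatRequest
case object HeartbeatResponse

object HealthStatusType extends Enumeration {
  val UP, UNKNOWN, DOWN = Value
}

// heartbeatActor is expected to live on app-actor-system.
class ActorSystemHealthChecker(heartbeatActor: ActorRef, failureThreshold: Int = 3) {
  private implicit val timeout: Timeout = Timeout(5.seconds)
  // Wait slightly longer than the ask timeout so the ask itself fails first.
  private val awaitAtMost = timeout.duration + 1.second
  private var consecutiveFailures = 0

  // Sends one heartbeat and derives the status from the outcome.
  def check(): HealthStatusType.Value = synchronized {
    Try(Await.result(heartbeatActor ? HeartbeatRequest, awaitAtMost)) match {
      case Success(HeartbeatResponse) =>
        consecutiveFailures = 0
        HealthStatusType.UP
      case Failure(_: AskTimeoutException) =>
        consecutiveFailures += 1
        // Assumed rule: DOWN once the threshold is reached, UNKNOWN before that.
        if (consecutiveFailures >= failureThreshold) HealthStatusType.DOWN
        else HealthStatusType.UNKNOWN
      case _ =>
        // Unexpected reply or other failure: no verdict either way.
        HealthStatusType.UNKNOWN
    }
  }
}
```

Because `check()` blocks for up to about six seconds, it is meant to be called from a scheduled task rather than from inside an actor.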

6- ActorSystemHealthApp is defined to run the sample application. It initializes ActorSystemHealthChecker and schedules the following runnables:

  • ActorSystemCheckRunnable: runs every 10 seconds and logs the app-actor-system status.
  • ErrorOnActorSystemRunnable: is scheduled once and triggers a StackOverflowError on app-actor-system.
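The scheduling of the two runnables could be sketched as below, using a plain JDK `ScheduledExecutorService`. The article does not say which scheduler it uses, so the executor, the injected functions, the `ActorSystemHealthScheduling` object and the 30-second delay before the error is triggered are all assumptions.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

// Logs the current status; the status lookup is injected to keep the sketch standalone.
class ActorSystemCheckRunnable(currentStatus: () => String) extends Runnable {
  override def run(): Unit =
    println(s"app-actor-system status: ${currentStatus()}")
}

// Fires the injected trigger once, e.g. sending TriggerStackOverflowError to the actor.
class ErrorOnActorSystemRunnable(triggerError: () => Unit) extends Runnable {
  override def run(): Unit = triggerError()
}

// Hypothetical helper that wires both runnables to a scheduler.
object ActorSystemHealthScheduling {
  def schedule(currentStatus: () => String, triggerError: () => Unit): ScheduledExecutorService = {
    val scheduler = Executors.newScheduledThreadPool(2)
    // Periodic status check every 10 seconds, starting immediately.
    scheduler.scheduleAtFixedRate(new ActorSystemCheckRunnable(currentStatus), 0L, 10L, TimeUnit.SECONDS)
    // One-off fatal error after an assumed 30-second delay.
    scheduler.schedule(new ErrorOnActorSystemRunnable(triggerError), 30L, TimeUnit.SECONDS)
    scheduler
  }
}
```

The caller is responsible for shutting the returned scheduler down when the application stops.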

This example uses a standalone application. The ActorSystem’s health statuses can also be exposed over HTTP/TCP.

For example: what happens when a fatal error (e.g. StackOverflowError) occurs in an actor?

If a fatal error occurs in an actor, its ActorSystem is shut down by default, and all actors managed by that ActorSystem are stopped as well. In this case:

a- As the first step, a Root Cause Analysis (RCA) of the fatal error needs to be done and a proper fix applied.

b- Akka supports the akka.jvm-exit-on-fatal-error property, which can be set to on (the default) or off, depending on the case. With the default, the JVM exits when a fatal error occurs. This default behaviour suits most use cases, but there may still be some (rare) cases where it is not a good fit. For example, after the first service JVM exits, if the same operation is applied to the next service instance, that instance may go down too, and the failure can then spread to all other nodes. To avoid this kind of case and keep the problem scope limited (especially in a multi-tenant environment where the request routing policy is consistent hashing), the jvm-exit-on-fatal-error property can be set to off. In that case, the ActorSystem is shut down but the JVM does not exit immediately. After the DOWN status of the ActorSystem is detected, the required actions can be applied.

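7- The akka.jvm-exit-on-fatal-error property can be set to off in order to simulate the DOWN status and to avoid spreading the problem (e.g. fatal errors such as StackOverflowError or OutOfMemoryError) to the other service instances. In the application configuration this is the single setting akka.jvm-exit-on-fatal-error = off.

Alternatively, the same setting can be supplied programmatically when the ActorSystem is created.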
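A sketch of the programmatic variant, assuming the setting is layered on top of the default configuration with Typesafe Config. The `ActorSystemConfig` object and its method name are illustrative, not taken from the original project.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical helper that disables JVM exit on fatal errors.
object ActorSystemConfig {
  // The override takes precedence; everything else falls back to the defaults.
  val config: Config = ConfigFactory
    .parseString("akka.jvm-exit-on-fatal-error = off")
    .withFallback(ConfigFactory.load())

  // Creates app-actor-system with the overridden setting.
  def createAppActorSystem(): ActorSystem =
    ActorSystem("app-actor-system", config)
}
```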

8- Please find the application trace as follows:

9- Please see the complete example as follows: here

Conclusion

In this example project, we implemented a sample Akka ActorSystem health check. Akka also supports readiness and liveness checks through akka-management, and its documentation is worth a look as well.
