Leveraging Akka and Machine Learning in a Reactive Microservices Architecture

Andrew Bonham
Capital One Tech
Apr 29, 2019

A new age of technology is upon us! Software is making it easier for end users to instantly achieve what they need to do, when they want to do it, and with the device of their choice. The software architecture empowering these solutions is also becoming more intelligent, responsive, resilient, and scalable. Foundational pillars of these architectures include machine learning, reactive frameworks, and microservices.

In a previous post, we discussed how you can leverage open source BPM with machine learning in a reactive microservices architecture. It provided a basic overview of machine learning, Kafka, and H2O, and explained an approach that is valuable when human workflow is involved.

In another previous post, we discussed reactive architecture and covered the basic concepts of some of the leading reactive frameworks that run on top of the JVM, such as Akka and Vert.x. These frameworks are very powerful and provide the tools needed to create reactive architectures.

So what if you have a use case that does not involve human workflow, but rather mostly system-to-system interactions that you want to integrate with machine learning in a reactive microservices architecture? Let’s take a closer look at how we can solve this particular problem.

Understanding the Problem

Before we get into solving the problem, it’s always good practice to make sure you have a thorough understanding of the problem you are trying to solve. This helps you align the right technology and architecture patterns for your particular use case. Often, teams think they know the problem when they’re just at the surface; they need to go a few levels deeper to get to the root. This often requires questioning and continuing to ask “Why?” until you get there. What you want to avoid is landing on a solution that doesn’t fit the problem you are trying to solve, which can result in a solution that is overly complex.

This becomes even more important when using reactive architecture, as additional complexity is added by default due to the introduction of asynchronous concepts.

Integration Options for Reactive Microservices and Systems

When thinking about machine learning and integrating different reactive systems, there are several approaches you can take, depending on the use case. Whenever there are multiple systems you want to integrate, a key principle of reactive architecture is to keep those systems decoupled.

One way this is done is through commands and events, which you can read more about in one of my previous posts. Kafka is a common technology used to implement commands and events. It can be used to integrate microservices within a reactive application to decouple various application components.
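As a rough illustration of the command side of this pattern (using the plain Kafka Java client; the topic and payload mirror the example later in this post, and the class itself is illustrative), a microservice publishes a command that a consuming microservice reacts to, typically by emitting an event back onto a topic:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CommandPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Publish a command; a consuming microservice reacts to it and
        // publishes a corresponding event (e.g. "Features Calculated") back.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("card.transaction", "txn-123",
                "{\"command\":\"Calc Features\"}"));
        }
    }
}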

Leveraging Kafka within a reactive system

Leveraging Kafka within a reactive system can be a good fit:

  • If the microservices in your reactive system are polyglot in nature, meaning a mix of technologies (e.g. Spring Boot, Java, Akka).
  • If it’s desired to avoid lock-in to a specific reactive framework for communication.

Kafka can also be used for integration across reactive systems.

Leveraging Kafka across reactive systems

Leveraging Kafka across reactive systems can be a good fit:

  • If your reactive systems are polyglot in nature (Akka Actor Systems, Java).
  • If it’s desired to avoid lock-in to a specific reactive framework for cross reactive system communication.

You can also leverage the capabilities of Akka to achieve similar outcomes. Akka enables you to build an Actor System that can contain multiple Actors. For communication within a reactive system, Actors can communicate with each other through Akka’s built-in messaging capabilities (e.g. ask, tell, or publish/subscribe via the EventStream construct).

Leveraging the Akka Ask or Tell built-in messaging

Leveraging Akka within a reactive system can be a good fit when all of your microservices are part of the same Actor System.
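As a minimal sketch of these built-in messaging options (classic Akka actors with the Java API; the actor and message names are illustrative, not the ones used later in this post):

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.pattern.Patterns;
import scala.concurrent.Future;

public class MessagingSketch {

    // A minimal actor that echoes back whatever String it receives.
    static class EchoActor extends AbstractActor {
        @Override
        public Receive createReceive() {
            return receiveBuilder()
                .match(String.class, msg -> getSender().tell("echo: " + msg, getSelf()))
                .build();
        }
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("reactive-system");
        ActorRef echo = system.actorOf(Props.create(EchoActor.class), "echo");

        // tell: fire-and-forget, no reply is expected
        echo.tell("Calc Features", ActorRef.noSender());

        // ask: request/response, the reply arrives asynchronously on a Future
        Future<Object> reply = Patterns.ask(echo, "Run Model", 3000);

        // publish/subscribe within the same actor system via the EventStream
        system.eventStream().subscribe(echo, String.class);
        system.eventStream().publish("Features Calculated");
    }
}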

For communicating across Akka systems, Akka HTTP or Akka gRPC can be used for synchronous non-blocking communication while Akka Streams Kafka can be used for asynchronous communication. Each of these approaches provides backpressure as part of the message streaming. Additionally, the other end of the communication can be implemented in something other than Akka.
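For the asynchronous option, a minimal Akka Streams Kafka (Alpakka Kafka) consumer might look like the sketch below; the bootstrap server, group id, and topic are assumptions for illustration. Because the Kafka source is an Akka Stream, records are only pulled as fast as the downstream stages can process them, which is how backpressure is applied:

import akka.actor.ActorSystem;
import akka.kafka.ConsumerSettings;
import akka.kafka.Subscriptions;
import akka.kafka.javadsl.Consumer;
import akka.stream.ActorMaterializer;
import akka.stream.Materializer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CrossSystemConsumerSketch {
    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("consumer-system");
        Materializer materializer = ActorMaterializer.create(system);

        ConsumerSettings<String, String> settings =
            ConsumerSettings.create(system, new StringDeserializer(), new StringDeserializer())
                .withBootstrapServers("localhost:9092")
                .withGroupId("fraud-detection");

        // The stream only demands records as fast as downstream can process them.
        Consumer.plainSource(settings, Subscriptions.topics("card.transaction"))
            .map(record -> record.value())
            .runForeach(value -> System.out.println("received: " + value), materializer);
    }
}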

You can also combine some of these patterns together, which can enable you to use the strengths of each approach.

Leveraging Kafka within and across reactive systems AND leveraging Akka within a reactive system

This approach leverages Kafka for messaging within a reactive system and across Actor Systems. It also leverages Akka for messaging within an actor system. This approach is a good fit when you want to maximize decoupling and support polyglot communications at both the microservice and reactive system level. In the next section, we will show how you can take this pattern and implement an example with a reactive framework and machine learning.

An Example Implementation

Let’s walk through an example of how you can leverage the Akka framework, Lagom, H2O, Kafka, and Java-based microservices to implement a machine learning-based solution that uses a reactive microservices architecture. You can also find a presentation and video on this topic that I presented with my colleagues at the 2018 Reactive Summit conference below.

Integrating Machine Learning, Reactive Microservices, and Akka with Kafka (Reactive Summit 2018): www.reactivesummit.org

In this example, we create a solution for predicting a fraudulent credit card transaction. 28 numerical features are calculated and input into an H2O model that performs the prediction. In my previous post, we provided an overview of Akka, H2O, and Kafka. A new technology component we have not discussed before is Lagom. Lagom is an open source framework that makes it easier to wire microservices together (e.g. it can expose Akka Actors as RESTful APIs). In our example, we use it to provide a bootstrap for Kafka. Now let’s take a look at the proposed architecture using all of these components. We are reusing a number of the same components from the open source BPM example; please see this post for a detailed walkthrough of the H2O component.

Architecture for example

The above solution leverages both commands and events. You’ll see an Akka System with Akka Actors that use the built-in Akka messaging (ask/tell), along with publishing and subscribing to Kafka to communicate with Java-based microservices. The key microservices of this solution include:

  • Kafka Producer (not shown above) — Java-based microservice used to simulate the system flow by placing the command “Calc Features” on the Kafka topic card.transaction.
  • Saga Actor — Java-based Akka Actor that subscribes to the card.transaction topic on Kafka. Reacts to the event “Features Calculated” and sends a command via “ask” to the RunModelMS Actor using Akka’s built-in messaging capability, passing the 28 model inputs.
  • CalcFeaturesMS — Java-based microservice that subscribes to the card.transaction topic on Kafka. Reacts to the command “Calc Features” by calculating 28 numerical features and publishing them back to the Kafka topic with the event “Features Calculated”.
  • RunModelMS — Java-based Akka Actor that wraps the Java-based H2O model code. Receives a command from the Saga Actor to run the H2O model. Runs the model and publishes the output with event “Transaction OK” or “Fraudulent Transaction” to the card.transaction Kafka topic.

The overall flow can be visualized in the diagram below.

overall flow for example

Now, let’s take a closer look at the code snippets that are new in this example. The Kafka Producer functionality is implemented by the TestKafkaProducer class that extends the ProducerExample class. ProducerExample creates an Actor System that connects to a local Kafka instance. TestKafkaProducer passes in the “Calc Features” command on the card.transaction Kafka topic.

Kafka Producer example
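A rough approximation of what this producer looks like with Akka Streams Kafka is sketched below, collapsed into a single class for brevity (the real code splits this across ProducerExample and TestKafkaProducer, and the configuration details shown here are assumptions):

import akka.actor.ActorSystem;
import akka.kafka.ProducerSettings;
import akka.kafka.javadsl.Producer;
import akka.stream.ActorMaterializer;
import akka.stream.Materializer;
import akka.stream.javadsl.Source;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TestKafkaProducerSketch {
    public static void main(String[] args) {
        // ProducerExample's role: create an actor system wired to a local Kafka instance.
        ActorSystem system = ActorSystem.create("producer-system");
        Materializer materializer = ActorMaterializer.create(system);

        ProducerSettings<String, String> settings =
            ProducerSettings.create(system, new StringSerializer(), new StringSerializer())
                .withBootstrapServers("localhost:9092");

        // TestKafkaProducer's role: place the "Calc Features" command on card.transaction.
        Source.single(new ProducerRecord<String, String>("card.transaction",
                "{\"command\":\"Calc Features\"}"))
            .runWith(Producer.plainSink(settings), materializer);
    }
}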

Our Saga Actor is implemented via a series of four individual components. The first is the application.conf file, which is found at the root of the classpath and is used to pass Akka configuration to the ActorSystem instance. Our application.conf file points to the module for the Saga Actor, which we call the CorrespondenceModule.

play.modules.enabled += com.capitalone.acc.impl.CorrespondenceModule

CorrespondenceModule.java is where the binding happens with Kafka, along with the Service Implementation class:

CorrespondenceModule.java
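A Lagom Guice module of this kind typically looks like the following sketch; the service interface and implementation class names are assumptions based on the description above, and the real project's bindings may differ:

import com.google.inject.AbstractModule;
import com.lightbend.lagom.javadsl.server.ServiceGuiceSupport;

// Sketch only: CorrespondenceService and CorrespondenceServiceImpl are assumed
// to be defined elsewhere in the project.
public class CorrespondenceModule extends AbstractModule implements ServiceGuiceSupport {
    @Override
    protected void configure() {
        // Binds the service descriptor (including its Kafka topics) to the
        // Service Implementation class that Lagom will instantiate.
        bindService(CorrespondenceService.class, CorrespondenceServiceImpl.class);
    }
}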

The Service Implementation Class creates an instance of the Saga Actor and the StartEventCmd.

Service Implementation Class
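In outline, such a service implementation might look like the sketch below; the CorrespondenceService interface, the topic service, SagaActor, and StartEventCmd are assumed to be defined elsewhere in the project, and their names are illustrative:

import akka.Done;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.stream.javadsl.Flow;
import javax.inject.Inject;

public class CorrespondenceServiceImpl implements CorrespondenceService {

    private final ActorRef sagaActor;

    @Inject
    public CorrespondenceServiceImpl(ActorSystem system, CardTransactionService cardTransactionService) {
        // Create the Saga Actor inside the Lagom-managed actor system.
        this.sagaActor = system.actorOf(Props.create(SagaActor.class), "sagaActor");

        // Subscribe to the card.transaction topic; each message is wrapped in a
        // StartEventCmd and handed to the Saga Actor.
        cardTransactionService.cardTransactionTopic()
            .subscribe()
            .atLeastOnce(Flow.fromFunction(message -> {
                sagaActor.tell(new StartEventCmd(message), ActorRef.noSender());
                return Done.getInstance();
            }));
    }
}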

The startEventCmd handler in the Saga Actor receives the message and sends a tell message to the RunModelMS (in this case called the ProcessAppActor).

startEventCmd in Saga Actor
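A simplified version of that handler could look like this; the command classes and their accessors are illustrative stand-ins for the project's actual types:

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.Props;

public class SagaActor extends AbstractActor {

    // The RunModelMS actor (ProcessAppActor) is created as a child of the saga.
    private final ActorRef processAppActor =
        getContext().actorOf(Props.create(ProcessAppActor.class), "processAppActor");

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            // On StartEventCmd, hand the event id and the 28 model inputs to the
            // RunModelMS (ProcessAppActor) with a tell.
            .match(StartEventCmd.class, cmd ->
                processAppActor.tell(new RunModelCmd(cmd.getEventId(), cmd.getFeatures()), getSelf()))
            .build();
    }
}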

The RunModelMS (known as ProcessAppActor in the code) checks the message to see if the features have been calculated.

checks EventId for “Features Calculated”

If the features have been calculated, it will then execute the H2O model, which you may recognize from the example in our previous post.

Execution of H2O model
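Putting the check and the model execution together, a simplified ProcessAppActor might look like the sketch below. It uses H2O's standard genmodel scoring API (EasyPredictModelWrapper); the exported model class name, the RunModelCmd type and its accessors, and the label values are assumptions for illustration:

import akka.actor.AbstractActor;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.prediction.BinomialModelPrediction;

public class ProcessAppActor extends AbstractActor {

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(RunModelCmd.class, cmd -> {
                // Only run the model once the features have been calculated.
                if (!"Features Calculated".equals(cmd.getEventId())) {
                    return;
                }
                try {
                    // fraud_model stands in for the POJO/MOJO class exported from H2O.
                    EasyPredictModelWrapper model =
                        new EasyPredictModelWrapper(new fraud_model());

                    RowData row = new RowData();
                    cmd.getFeatures().forEach(row::put); // the 28 numerical features

                    BinomialModelPrediction prediction = model.predictBinomial(row);
                    String event = "1".equals(prediction.label)
                        ? "Fraudulent Transaction" : "Transaction OK";
                    getSender().tell(event, getSelf());
                } catch (Exception e) {
                    // A failed prediction should not crash the actor; see
                    // "Handling Failures" below.
                    getSender().tell("Model Execution Failed", getSelf());
                }
            })
            .build();
    }
}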

Handling Failures

At this point, one thing you might be wondering is how to handle failures in a distributed architecture like this. There are a number of approaches that can be leveraged at the application level. Let’s walk through several below.

  • Use try / catch blocks to catch exceptions. Notice we did this when executing the H2O model above.
  • Design your processing to be idempotent. This means design your processing so it does not need to remember where it errored, rather you can start over from the beginning and reprocess. This approach can be a great way to simplify your design, but may not always be possible depending on the use case.
  • Leverage Kafka consumer groups and only commit the offset when you know processing was successful. With Kafka consumer groups, Kafka keeps track of the offset, which is the consumer’s current position in the message stream. Consumers can either auto-commit the offset as soon as a message is read, or control when the offset is committed. By controlling when to commit the offset, a consumer can wait for a particular success condition to occur. If a consumer process goes down while messages continue to be produced in Kafka, it will be able to process any uncommitted messages when it comes back up. A sketch of this approach follows this list.
  • Leverage Akka Persistence as this provides fault tolerance by storing the internal state of an actor. This enables the state of an actor to be recovered if it crashes. You can find more details on this in my previous blog post.
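Here is a minimal sketch of the manual offset commit approach with the plain Kafka Java client; the topic, group id, and processing logic are placeholders:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "fraud-detection");
        props.put("enable.auto.commit", "false"); // commit only after successful processing
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("card.transaction"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record.value()); // may throw; offset stays uncommitted on failure
                }
                consumer.commitSync(); // commit only after the batch succeeded
            }
        }
    }

    private static void process(String value) {
        // application-specific processing
    }
}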

Summary

Combining the power of Akka, Lagom, H2O, and Kafka in a reactive microservices architecture can take your machine learning use cases to the next level. It provides a way to decouple your internal components, enabling asynchronous processing, while also supporting integration with various polyglot reactive systems through publish and subscribe capabilities. This pattern is a good one for system-to-system interactions (non-human workflow). I hope you find this post helpful and thank you for your time!

DISCLOSURE STATEMENT: These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © 2019 Capital One.

