How the Lagom framework enables scalable, reactive microservices in Java and Scala

Lagom is an open source microservice framework for building reactive microservice applications in Java or Scala. The pitch for the Lagom framework is that its programming model and architecture allow developers to write microservices that scale effectively across large deployments, that provide desirable characteristics such as robust error tolerance and application responsiveness, and that take full advantage of today’s massively multicore computer hardware.

A Lagom “Hello World” service in Microclimate

Before we dive in, let’s note for the cynics out there that these advantages aren’t just checkbox-checking technical buzzwords that a marketing department has deemed desirable to associate with microservices. Rather, these are technical capabilities fully baked into the architecture of the Lagom framework, arising as a necessary consequence of its design. But before we get to that, let’s start at the beginning!

Reactive services and the Reactive Manifesto

Lagom is pronounced Luh-gom, after the Swedish word meaning ‘just the right amount’. The reactive services concepts that underpin Lagom (and its parent project Akka) are described in a document called the Reactive Manifesto. As per the manifesto, reactive services are responsive (respond quickly to requests), resilient (tolerate errors well), elastic (effortless scaling in arbitrary conditions), and message-driven (asynchronous messaging between services).

Reactive is not itself a specific concrete technology: there is no single ‘reactive’ project with source code on a GitHub page, or a single source of Maven dependencies/JAR files that you can add to your project. Rather, reactive is a descriptive term for any technologies that enable the creation of systems that exhibit the capabilities of the Reactive Manifesto.

Reactive technologies have gained prominence in recent years, with the defining characteristics of those technologies implemented in a number of modern Java libraries, such as Akka Streams and RxJava. The core APIs used by those libraries were integrated into the Java platform as of Java 9, with the introduction of the Reactive Streams API, a “standard for asynchronous stream processing with non-blocking back pressure”, under the java.util.concurrent.Flow class.
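
To make that concrete, here is a minimal sketch of the Java 9+ Flow API in action. The wiring below is a generic illustration (not Lagom code): the subscriber requests one item at a time, which is exactly the non-blocking back pressure the standard describes.

```java
import java.util.List;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class FlowDemo {
    public static void main(String[] args) throws InterruptedException {
        // SubmissionPublisher is the JDK's built-in Flow.Publisher implementation.
        SubmissionPublisher<String> publisher = new SubmissionPublisher<>();

        publisher.subscribe(new Flow.Subscriber<String>() {
            private Flow.Subscription subscription;

            @Override public void onSubscribe(Flow.Subscription subscription) {
                this.subscription = subscription;
                subscription.request(1); // back pressure: ask for one item to start
            }

            @Override public void onNext(String item) {
                System.out.println("Received: " + item);
                subscription.request(1); // request the next item only when ready
            }

            @Override public void onError(Throwable t) { t.printStackTrace(); }

            @Override public void onComplete() { System.out.println("Done"); }
        });

        // Items are delivered asynchronously, on the publisher's executor.
        List.of("hello", "reactive", "world").forEach(publisher::submit);
        publisher.close();
        Thread.sleep(500); // crude wait so the async delivery finishes before exit
    }
}
```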

IBM+Lightbend = IBM Reactive Platform, incorporating Lagom, Play, and Akka, also included in IBM Microclimate

Lagom is based on the Akka framework and supported by the Play framework. All three technologies are incorporated into the IBM Reactive Platform product, which is a ‘collaborative development initiative’ between IBM and Lightbend. If you’re interested in commercial support for these frameworks, plus advanced enterprise features, you should definitely check it out.

Layered platform architecture diagram

For a quick introduction to these technologies, you can use our Microclimate product to begin exploring the Lagom microservice framework. While it doesn’t include those additional Reactive Platform features, it does allow use of the three foundational technologies: Lagom, Play, and Akka. Microclimate is a container-based development environment, available free of charge, that features a built-in browser-based IDE (plus Eclipse and VS Code plugins) for developing Java, Spring, Swift, Node, and Docker-based applications.

Why Lagom?

You may ask, what makes Lagom the best choice among the many (many) other microservice technologies that exist?

According to proponents of Lagom, common modern programming practices do not properly address the needs of demanding modern systems. And by demanding modern systems, they mean hardware that makes a frankly ludicrous number of CPU cores available to applications. Today you can buy CPUs that feature 28 cores (Intel) or 32 cores (AMD) on a single chip die. Even modern smartphones, desktops, and laptops include between three and eight cores as standard.

As application developers, once we’ve fully saturated the single-threaded performance we can squeeze out of a single CPU core, the next step is to determine how to scale our application across all the remaining cores. But critically, we need to do that in a way that is both safe and efficient. Efficiency is always good, but safety is crucial: sharing data between multiple threads is notoriously difficult to get right, and Lagom seeks to provide a programming model that makes getting it right easier for developers.

Lagom also seeks to ensure maximum application scalability in highly demanding conditions. Now, if the goal of your application is to serve only 10 requests per second, or maybe 100 requests per second, you can (arguably) use any modern web technology to implement this requirement. For example, frameworks based on slower interpreted languages like Ruby and Python do this every day. But what if you want your application to scale to serving thousands or tens of thousands of requests per second on a single machine? With the right technology this is definitely feasible, but at this scale you start to hit fundamental limits of the CPU itself:
- Thread context switching: how long your CPU takes to switch between thread contexts.
- Contention overhead: how long your threads spend waiting to acquire a resource lock owned by another thread.
- Blocking on I/O: how long your threads spend blocked waiting for I/O requests, such as file, network, or database access.

Lagom seeks to minimize these bottlenecks by eliminating cross-thread data sharing and blocking synchronous requests, through the use of asynchronous message passing in line with the Reactive Manifesto.
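
As an illustration of the blocking-versus-asynchronous distinction, here is a hedged sketch in plain Java. The CompletionStage style on the asynchronous side is the shape Lagom’s Java service API encourages; slowDatabaseCall is a hypothetical stand-in for any file, network, or database access.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class NonBlockingDemo {
    // Blocking style: the calling thread is parked for the whole I/O wait,
    // so serving N concurrent requests this way ties up N threads.
    static String loadGreetingBlocking() {
        return slowDatabaseCall(); // the caller's thread blocks here
    }

    // Non-blocking style: the caller gets a CompletionStage immediately and
    // the continuation runs when the data arrives, freeing the caller's thread.
    static CompletionStage<String> loadGreetingAsync() {
        return CompletableFuture
                .supplyAsync(NonBlockingDemo::slowDatabaseCall) // runs on a pool thread
                .thenApply(greeting -> greeting + ", world");   // composed, never blocks the caller
    }

    // Hypothetical stand-in for file/network/database access.
    static String slowDatabaseCall() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "Hello";
    }

    public static void main(String[] args) {
        // join() here is only to keep the demo alive until the result arrives.
        loadGreetingAsync().thenAccept(System.out::println).toCompletableFuture().join();
    }
}
```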

The Lagom “Secret Sauce”: Actor model of concurrency

So, the scalability and performance benefits described above are what makes Lagom special, but how does Lagom realize the vision of Reactive services? What is the technological ‘secret sauce’ that unlocks that promised massive scalability?

A core feature of Lagom is its use of the actor model of concurrency (through Akka), plus smart design decisions and supporting APIs that seek to reduce the friction between how we as developers are traditionally taught to write multithreaded code and the asynchronous message passing model prescribed here. The actor model of concurrency is nothing new; it was first published in 1973, proving once again that nothing in computer science is new that wasn’t formally defined and analyzed years before you were born 😄 and/or before the modern personal computer era.

The actor model of concurrency is best expressed in contrast to the traditional multithreaded Java concurrency model.

The traditional multithreaded Java concurrency model

In the diagram, you see two threads (red on the left and green on the right) that are both attempting to access a Java object at roughly the same point in time.

Generally speaking, two threads concurrently accessing shared data is fine as long as both threads are only reading that data. But what if one or both of the threads is also writing the data during this time? Now we have a problem: we have entered the oft-nightmarish world of cross-thread concurrency bugs.

For example, what happens if the green thread reads the data while the red thread is in the middle of updating it? The green thread will get an inconsistent, partially updated value from this method. What happens if both the green and red threads write data to the object, and both return from the object’s method? Each will incorrectly assume that the data it has written is still present.
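
Here is a minimal sketch of that second failure, the classic lost update, using a plain unsynchronized counter (the names are illustrative):

```java
public class LostUpdateDemo {
    static int counter = 0; // shared, unsynchronized data

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++; // read-modify-write: three steps, NOT atomic
            }
        };
        Thread red = new Thread(work);
        Thread green = new Thread(work);
        red.start();
        green.start();
        red.join();
        green.join();
        // Expected 200000, but interleaved writes are lost, so the
        // printed value is almost always lower (and varies run to run).
        System.out.println("counter = " + counter);
    }
}
```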

Java gives us a number of concurrency primitives to combat this: the synchronized keyword, synchronized blocks, various lock types, atomic variables, futures, and ExecutorService thread pools. But ensuring that all those concurrency primitives are correctly applied across your application is a mammoth task, and ensuring objects in your application are fully and correctly synchronized is a challenge for which limited supporting tooling exists. When these types of bugs do arise, often the only solution is a highly caffeinated programmer staring long and hard at the code until the problem becomes clear.
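
For instance, the lost-update counter above could be repaired with either of two of those primitives; a minimal sketch:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SafeCounter {
    private int counter = 0;
    private final AtomicInteger atomicCounter = new AtomicInteger();

    // Option 1: a synchronized method, so only one thread at a time
    // can execute the read-modify-write.
    public synchronized void incrementSynchronized() {
        counter++;
    }

    // Option 2: an atomic variable, which performs the increment as a
    // single lock-free compare-and-swap operation.
    public void incrementAtomic() {
        atomicCounter.incrementAndGet();
    }
}
```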

We know that synchronizing data is hard, but let’s presume that somehow you are able to ensure that all data shared between threads is correctly synchronized. Let’s delve deep into our collective imaginations and envision a world where you have synchronized, locked, and futured your way into a complex multithreaded application that, against all odds, does not have inconsistent data sharing. 😃

Even in this magical world of perfect mutexes and locks, you are not yet out of the woods when it comes to concurrency bugs: your application may still experience deadlocks, where two or more threads create a dependency cycle, each holding a lock that another thread needs. In the deadlock scenario, the threads involved are fully dead in the water without manual intervention.
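
A minimal sketch of such a cycle, with two hypothetical locks acquired in opposite orders (this program will typically hang forever):

```java
public class DeadlockDemo {
    static final Object lockA = new Object();
    static final Object lockB = new Object();

    public static void main(String[] args) {
        // Thread 1 takes lockA, then wants lockB...
        new Thread(() -> {
            synchronized (lockA) {
                pause();
                synchronized (lockB) { System.out.println("thread 1 done"); }
            }
        }).start();

        // ...while thread 2 takes lockB, then wants lockA: a dependency cycle.
        new Thread(() -> {
            synchronized (lockB) {
                pause();
                synchronized (lockA) { System.out.println("thread 2 done"); }
            }
        }).start();
    }

    // The pause makes it near-certain both threads hold their first lock
    // before requesting the second, so neither "done" line ever prints.
    static void pause() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```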

Another contention-related issue you can experience even with perfect synchronization is race conditions. Back to our thread diagram: imagine a scenario where 95% of the time the red thread updates the data before the green thread, but 5% of the time the green thread updates the data first. Timing of thread ordering can affect overall program behaviour, even though the code executed in both the 95% and 5% scenarios is exactly the same.

In contrast to the traditional concurrency model: The actor model

In direct contrast to the traditional model of sharing data between threads is the actor model. In this model, an ‘actor’ is a service that contains both some local data and some local code that is executed on that local data. Critically, however, that local code is never allowed to access the data of any other actor, nor is the thread that runs the actor code allowed to touch data used by any other thread. Actor code is always limited to that actor’s own data.

In this diagram, the blue circles represent these ‘actors’. You can imagine that the top left actor represents a mortgage service which receives customer requests for new mortgages. In order to process a mortgage, a thread (“thread A”) executes the actor code and checks the actor’s local data to see if the customer already has a mortgage. Once this check is complete, the mortgage actor needs to request the customer’s credit rating to determine the mortgage interest rate.

However, the customer’s credit rating is stored in a separate actor: the blue circle to the immediate right of the mortgage actor. Since the mortgage actor needs to request a credit rating check, it sends an immutable message to the credit rating actor. This immutable credit request message is placed in the incoming mailbox of the credit rating actor.

At this point, the mortgage actor — having other business to complete — continues to execute any remaining code that does not depend on a response from the credit rating actor. At some point the mortgage actor will receive a response to its credit rating request, but it is not blocked on the request and can process other data in the meantime.

Independently, on a new thread (“thread B”) the actor code that handles credit ratings is executed. The credit rating code checks its incoming mailbox, and discovers an incoming message from the mortgage actor. The credit actor then checks its internal database of credit ratings, locates the customer’s credit rating, and sends a message containing that data back to the mortgage actor. This new response message is placed in the mortgage actor’s incoming mailbox, which will be processed the next time the mortgage actor code/thread runs. The credit rating thread then continues its other business.

As you can see, in this scenario the two threads share data by passing messages to each other, rather than by calling methods on shared Java objects. Notice that at no point did thread A ever have access to the local data of thread B, and vice versa. In both cases, data was shared by passing lightweight messages between the two actors.

This lightweight messaging is the foundation of the actor concurrency model: rather than sharing data by having multiple threads call methods on shared objects, data is passed through lightweight immutable messages exchanged between actors via a messaging system.
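
Here is roughly what that mortgage/credit-rating exchange could look like using the classic Akka Java API. Treat this as a hedged sketch: the actor names, message types, and hard-coded rating are invented for illustration, and an Akka dependency is assumed.

```java
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;

public class ActorDemo {
    // Immutable messages: the only things that ever cross actor boundaries.
    static final class RatingRequest {
        final String customerId;
        RatingRequest(String customerId) { this.customerId = customerId; }
    }

    static final class RatingResponse {
        final int rating;
        RatingResponse(int rating) { this.rating = rating; }
    }

    // The credit rating actor: its data is local and never shared.
    static class CreditRatingActor extends AbstractActor {
        @Override
        public Receive createReceive() {
            return receiveBuilder()
                    .match(RatingRequest.class, req ->
                            // Reply with a message; no shared objects, no locks.
                            getSender().tell(new RatingResponse(720), getSelf()))
                    .build();
        }
    }

    // The mortgage actor: fires off a request, then keeps processing its mailbox.
    static class MortgageActor extends AbstractActor {
        private final ActorRef creditRating;

        MortgageActor(ActorRef creditRating) { this.creditRating = creditRating; }

        @Override
        public Receive createReceive() {
            return receiveBuilder()
                    .matchEquals("new-mortgage", msg ->
                            creditRating.tell(new RatingRequest("customer-42"), getSelf()))
                    .match(RatingResponse.class, res ->
                            System.out.println("Quote mortgage at rating " + res.rating))
                    .build();
        }
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("demo");
        ActorRef rating = system.actorOf(Props.create(CreditRatingActor.class));
        ActorRef mortgage = system.actorOf(Props.create(MortgageActor.class, rating));
        mortgage.tell("new-mortgage", ActorRef.noSender());
    }
}
```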

The actor model drives Reactive programming

So that’s the actor concurrency model, and that’s the model that Akka, Lagom, and Play are based on.

By following the tenets of reactive programming via the actor model, you will get Lagom-based microservices that are:

  • Lightweight
  • Message-driven
  • Asynchronous and non-blocking in their thread model
  • Able to use Apache Kafka for message passing (with general support for non-Kafka message-broker scenarios)
  • Backed by a holistic solution for developing distributed systems (build tools, test tools, Apache Kafka infrastructure configuration)

When building within the constraints (‘opinions’) imposed by Lagom, a Lagom-based application will therefore necessarily have the desirable reactive qualities: responsiveness, resilience, scalability, and elasticity. This, combined with the features described above, makes Lagom a compelling choice for moving your application development from a monolithic architecture to a scalable microservices-based architecture.
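
To give a feel for the programming model, here is roughly the shape of a minimal “Hello World” service in Lagom’s Java API, along the lines of the one shown in Microclimate earlier. Treat it as a sketch: exact imports and signatures can vary by Lagom version.

```java
// HelloService.java — the service interface: Lagom generates the
// HTTP plumbing from this descriptor.
import akka.NotUsed;
import com.lightbend.lagom.javadsl.api.Descriptor;
import com.lightbend.lagom.javadsl.api.Service;
import com.lightbend.lagom.javadsl.api.ServiceCall;

import static com.lightbend.lagom.javadsl.api.Service.named;
import static com.lightbend.lagom.javadsl.api.Service.pathCall;

public interface HelloService extends Service {

    ServiceCall<NotUsed, String> hello(String id);

    @Override
    default Descriptor descriptor() {
        return named("hello").withCalls(
                pathCall("/api/hello/:id", this::hello)
        ).withAutoAcl(true);
    }
}
```

```java
// HelloServiceImpl.java — the implementation returns a CompletionStage,
// so the call is asynchronous and non-blocking end to end.
import akka.NotUsed;
import com.lightbend.lagom.javadsl.api.ServiceCall;
import java.util.concurrent.CompletableFuture;

public class HelloServiceImpl implements HelloService {
    @Override
    public ServiceCall<NotUsed, String> hello(String id) {
        return request -> CompletableFuture.completedFuture("Hello, " + id + "!");
    }
}
```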

Trade-offs

Of course, like everything in software development, there are trade-offs to any technical choice, and Lagom is no exception. Lagom is a strongly opinionated framework, and like any opinionated framework, the farther you diverge from those opinions, the greater the pain. One instance of this is that it can be more difficult to integrate with other (multithreaded) Java libraries that don’t play well with Akka’s actor model, for example third-party libraries that manage their own threads with thread pools. These may interfere with Akka’s ability to efficiently pass messages between actors and threads.

Lagom requires you to split your application into a set of independent services, which will necessarily be more complex than a monolithic application built on a more traditional framework (though, of course, with a monolithic application you lose the scaling and performance benefits of Lagom). Lagom also requires a greater understanding of the vagaries of distributed computing and concurrent data sharing, in order to avoid the pitfalls/“footguns” inherent to both of these topics.

Finally, Lagom is arguably a less mature technology than industry bedrocks like Java EE or Spring: Lagom dates back to 2016 (as per the GitHub commits), while Akka has been around since 2009 and the Play framework since 2007.

Jump In

If you’re looking to dip your toes into the Lagom framework, try creating a Hello World application using IBM Microclimate. Microclimate is a free-of-charge, container-based, multi-language, cloud-friendly development IDE that you can download and try right now! Microclimate does not include the ‘advanced’ features of the IBM Reactive Platform, but it does incorporate Lagom, Akka, and Play, the fundamental open source technologies of Reactive Platform.

If you’re in the market for a microservice platform that is developer-focused, enterprise-friendly, and commercially supported, check out IBM’s Reactive Platform. It’s a collaborative development initiative between IBM and Lightbend that provides reactive technology built on Lagom, Akka, and Play, while adding advanced features such as application management, intelligent monitoring, and enterprise integrations.