A Gentle Introduction to Coherence

By the Coherence Team

Oracle Coherence

--

Just in case you missed the announcement, there is now a publicly available, open source version of Coherence! As a result, we are publishing a series of articles that will help you use it effectively.

This initial series of articles provides a 30,000-foot overview of the major Coherence features, and explains how they work together to simplify the development of distributed applications. We will follow up with a series of deep-dive articles, drilling into the details of various product areas that require more in-depth coverage. The combination of the two will allow you to quickly get up to speed and understand which features to use and when.

Let’s tackle the obvious questions you should be asking first: Why do I care? What does it give me? What can I build with it?

Why do I care?

As an architect of a large, mission-critical web site or enterprise application, you need to address at least three major non-functional requirements: performance, scalability and availability.

Performance is defined as the amount of time an operation takes to complete. Performance is extremely important, as it is the main factor that determines application responsiveness — experience has shown us that no matter how great and full-featured an application is, if it is slow and unresponsive, the users will hate it.

Scalability is the ability of the system to maintain acceptable performance as the load increases. While it is relatively simple to make an application perform well in a single-user environment, it is significantly more difficult to maintain that level of performance as the number of simultaneous users increases to thousands, or in the case of very large public web sites, to tens or even hundreds of thousands. The bottom line is, if your application doesn’t scale well, its performance will degrade as the load increases, and the users will hate it.

Finally, availability is measured as the percentage of time an application is available to its users. While some applications can crash several times a day without causing major inconvenience to the user, most mission-critical applications simply cannot afford that luxury and need to be available 24 hours a day, every day. If your application is mission critical, you need to ensure that it is highly available, or the users will hate it. To make things even worse, if you build an e-commerce site that crashes in the run-up to Christmas, your investors will hate you as well.

The moral of the story is that in order to keep your users happy and avoid all that hatred, you as an architect need to ensure that your application is fast, remains fast even under heavy load, and stays up and running even when the hardware or software components that it depends on fail. Unfortunately, while it is relatively easy to satisfy any one of these three requirements individually, and not too difficult to comply with any two of them, it is considerably more difficult to fulfill all three at the same time.

Fortunately, Coherence can help.

What does it give me?

Coherence is a fast, scalable, fault tolerant data store. It provides automatic discovery of cluster members, automatic data sharding, highly redundant data storage, built-in messaging, events for anything that happens to the data or the cluster itself, simple to use APIs, and above all else — a “coherent” system (it is literally in the name).

Coherence is fast. Typical data access operations usually complete in the low single-digit millisecond range, and basic key-based operations can even be sub-millisecond. Of course, the actual performance is very much determined by the network between the cluster members — the better the network, the faster all Coherence operations are going to be.

An actual trace from a sample application. The REST API call to retrieve user information took 2.15ms, out of which only 0.24ms was spent inside Coherence, fetching the data from the storage node and deserializing it into an in-memory Java object graph.

More importantly, Coherence is incredibly scalable. While some data stores hit their scalability limits at a dozen cluster members or even less, Coherence can easily scale to hundreds of members. As a matter of fact, we regularly run “large cluster” tests of 1,000+ members in our QA environment.

A sample Helidon Sock Shop microservices application running in a 1,000-member cluster with 1TB of allocated heap. We could’ve easily pushed it to more nodes, and we could’ve configured the nodes to use much larger heaps than the 1GB/node that we used, but at this point we were pushing our 6-node Kubernetes cluster to its memory limits

There are two ways to scale: vertically or horizontally. Vertical scaling, or scaling up, requires that you make individual application components bigger, typically by adding more memory and CPUs, to allow them to handle additional load. Horizontal scaling, or scaling out, on the other hand, allows you to add more of each component, which tends to be a much simpler, cheaper and more flexible scaling model — you can always add more servers, but once you’ve bought the biggest available (and most expensive) server on the market, how do you continue to scale up?

Unfortunately, not all types of workloads are equally suitable for both scaling models. While stateless workloads, such as web and application servers, are trivial to scale either way, stateful workloads, such as databases, are generally much easier to scale up than to scale out.

The problem is that most applications require some stateful components, which can ultimately become performance and scalability bottlenecks for the application as a whole — once your stateful components are at the limit of their scale, scaling the stateless components in front of them will not help. It will actually likely make things worse, by placing even more load on stateful components that are already bursting at the seams. To make things even worse, the performance of stateful components typically degrades once they reach the limits of their scale.

Coherence is a stateful system which can be scaled both vertically and horizontally. It is easy to scale horizontally, by starting additional cluster members or by shutting down excess ones. It is also capable of scaling vertically, by being reconfigured to use more (or fewer) CPUs and more (or less) RAM and storage.

Coherence allows you to scale stateful workloads horizontally as if they were stateless

Coherence processes/members will automatically form a new cluster, or join one if it already exists, and start the distributed data services your application defines.

A cluster can host one or many data services of various types. The most common data service partitions data across processes (members) that are willing to own data. This partitioning, also referred to as sharding, is entirely automatic.
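The core idea behind automatic partitioning can be sketched in a few lines of plain Java. This is an illustration only: the modulo-based owner assignment is a simplification of what Coherence actually does, although 257 is in fact the default partition count.

```java
public class PartitionSketch {
    static final int PARTITION_COUNT = 257; // Coherence's default partition count

    // Map a key to a partition by hashing; every member computes the
    // same answer, so any member can route a request for a given key.
    static int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Map a partition to an owning member (simplified); with N storage
    // members, each member ends up owning roughly 257 / N partitions.
    static int ownerOf(int partition, int memberCount) {
        return partition % memberCount;
    }

    public static void main(String[] args) {
        int p = partitionFor("user-42");
        System.out.println("partition " + p + " is owned by member "
                + ownerOf(p, 4));
    }
}
```

Because the mapping is deterministic, no central directory lookup is needed: any member can compute where any key lives.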

The first member to start creates a new cluster. Other members using the same cluster configuration join the existing cluster and receive a fair share of the data partitions.

As highlighted in the video above, Coherence is elastic — when new members join the cluster, partitions storing the data are automatically rebalanced. Similarly, when members leave the cluster, whether intentionally or unintentionally, Coherence will restore the affected partitions from backups, making the data once again available, and then create new backups. This re-balancing, including backup promotion and re-creation, is completely transparent to the application, allowing it to continue submitting requests.

This elasticity also makes Coherence highly available. After all, failures are an inevitable part of distributed systems, and simply need to be embraced — from a practical perspective there is really no difference between intentional loss of a cluster member due to scale-in, and unintentional loss due to failure.

Coherence will place backups as far away as possible from the primary data, which allows it to tolerate member, machine, rack and even site/data center failure when deployed across multiple data centers or cloud availability zones/domains

Coherence attempts to place primary and backup copies of partitions as far away as possible from one another, based on its understanding of the underlying network topology. The goal is to place primaries and backups across known fault lines, such that when faults do occur there is a redundant copy of the data somewhere.

Practically, Coherence will attempt to store a partition’s backup on a different site (data center/AZ/AD), rack or machine (in order of preference), and therefore the loss of any one of these can be tolerated. Additionally, the simultaneous loss of multiple machines, racks or even sites can be tolerated by increasing the backup count (the recommended default, and what most customers use in production, is a single backup for each primary partition).
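The placement preference described above can be sketched as a simple scoring function. The `MemberId` record and the numeric scores below are hypothetical; Coherence’s real distribution algorithm weighs many more factors, but the ordering (site, then rack, then machine) is the same.

```java
import java.util.Comparator;
import java.util.List;

public class BackupPlacement {
    // Hypothetical identity of a member within the network topology.
    record MemberId(String site, String rack, String machine) {}

    // How "far" two members are across fault boundaries: a higher score
    // means a wider fault line separates them.
    static int distance(MemberId a, MemberId b) {
        if (!a.site().equals(b.site()))       return 3; // different site
        if (!a.rack().equals(b.rack()))       return 2; // different rack
        if (!a.machine().equals(b.machine())) return 1; // different machine
        return 0;                                       // same machine
    }

    // Prefer the backup owner farthest from the primary.
    static MemberId chooseBackup(MemberId primary, List<MemberId> candidates) {
        return candidates.stream()
                .max(Comparator.comparingInt((MemberId m) -> distance(primary, m)))
                .orElseThrow();
    }

    public static void main(String[] args) {
        MemberId primary = new MemberId("dc1", "r1", "m1");
        List<MemberId> members = List.of(
                new MemberId("dc1", "r1", "m2"),   // same rack, other machine
                new MemberId("dc1", "r2", "m3"),   // other rack
                new MemberId("dc2", "r1", "m4"));  // other data center
        System.out.println(chooseBackup(primary, members)); // the dc2 member
    }
}
```

With this ordering, losing any single machine, rack or site still leaves a surviving copy of the partition somewhere.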

This ability to handle the addition and removal of cluster members completely transparently, while remaining fully available to service application requests, allows Coherence not only to handle the unplanned failures and to support scaling out and in, but also to support rolling upgrades of the application, and even of Coherence itself!

Finally, Coherence also provides a disk-based persistence mode in which it will store data to disk prior to acknowledging the write. This allows you to quickly restore the state of the cluster even in the case of complete cluster loss, regardless of whether such loss was planned or unplanned.

Coherence Federation (commercial feature)

Coherence Grid Edition also supports federation, which allows you to have active-active or active-passive deployments across multiple geographically distributed data centers.

This adds yet another layer of redundancy to the system and provides low-latency access for geographically distributed clients.

What can I build with it?

Anything, and everything! Coherence is a foundational technology that is used as the underpinnings of some incredibly high scale trading, gaming, and logistics platforms, as well as many e-commerce sites. One thing all these applications have in common is that they require fast, reliable access to data, regardless of the load.

Whether you’re building “traditional” monoliths, or “new age” microservices based applications, or even “serverless” functions, you are likely to need a fast, scalable store for your application’s (undeniable) state.

It does what it says on the tin, unless told otherwise

When discussing the properties of a distributed system it is often useful to describe them in terms of the CAP theorem. It is also worth noting that different parts of the same system can choose whether they prefer to be CP or AP.

By default, Coherence chooses Consistency and Partition-tolerance. All requests (reads & writes) are serviced by the primary owner of a partition, with each partition’s backup(s) receiving changes applied at the primary synchronously. Backups stand ready to take on ownership if a primary leaves or becomes unresponsive. This provides a predictable, strongly consistent model that you can rely on.

However, there are ways to trade consistency for availability and/or better performance. Features such as near caches and map/cache views can be employed to introduce a local cache of the primary storage that is updated asynchronously, thereby increasing availability and improving performance at the cost of consistency. Additionally, asynchronous backups and persistence allow for faster writes, as clients are notified after writing to the primary rather than to both the primary and the backup (and potentially disk), but this does introduce the potential for the loss of acknowledged writes.
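The near-cache trade-off can be illustrated with a toy read-through wrapper. This is a sketch only, assuming a generic "remote lookup" function; Coherence’s real near cache also subscribes to invalidation events from the cluster rather than relying on manual calls.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class NearCacheSketch<K, V> {
    private final Map<K, V> local = new HashMap<>(); // fast, possibly stale
    private final Function<K, V> remote;             // authoritative, slower

    NearCacheSketch(Function<K, V> remote) {
        this.remote = remote;
    }

    // Reads hit the local copy first; only misses pay the network cost.
    // Between invalidations the local copy may lag the primary, which is
    // exactly the consistency that is traded away for speed.
    V get(K key) {
        return local.computeIfAbsent(key, remote);
    }

    // Called when the cluster reports a change to this key.
    void invalidate(K key) {
        local.remove(key);
    }

    public static void main(String[] args) {
        NearCacheSketch<String, Integer> cache =
                new NearCacheSketch<>(key -> key.length()); // "remote" lookup
        System.out.println(cache.get("coherence")); // first read goes remote
        System.out.println(cache.get("coherence")); // second read is local
    }
}
```

The design choice is the same one the article describes: the second read is nearly free, but until `invalidate` runs, it may return stale data.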

As is often the case with software architecture, the decision on how to configure and use Coherence ultimately rests on the application architect’s shoulders, and the right choice very much depends on the types of guarantees the application requires.

What is important to understand from all this is that Coherence is not necessarily an “all or nothing” solution — you could opt for strong consistency for some parts of the application and choose to weaken the guarantees for others.

This fluid, dynamic system described in the previous sections is surfaced through a trivial API. NamedMap<K,V>, one of the most commonly used data structures/APIs in Coherence, builds on top of the familiar Map<K,V> interface. As a matter of fact, it can be considered a natural extension of the existing Map data structures that is better suited for distributed computing:
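As an illustration of that point, the helper below is written purely against the `Map<K,V>` interface, so the same code works with a local `HashMap` and, in principle, with a `NamedMap` obtained from a Coherence session. The commented-out line sketches that swap; treat the exact bootstrap call as an assumption, not a definitive API reference.

```java
import java.util.HashMap;
import java.util.Map;

public class AgeLookup {
    // Written against Map<K,V>, so any implementation will do, e.g.:
    // NamedMap<String, Integer> ages = session.getMap("ages"); // distributed
    static int ageOf(Map<String, Integer> ages, String name) {
        return ages.getOrDefault(name, -1);
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>(); // local stand-in
        ages.put("ada", 36);
        System.out.println(ageOf(ages, "ada"));      // prints 36
        System.out.println(ageOf(ages, "missing"));  // prints -1
    }
}
```

This is what "a natural extension of the existing Map data structures" means in practice: code written against `Map` carries over, while the distributed implementation adds partitioning, redundancy and events underneath.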

Each entry stored in a NamedMap maps to a partition, and that partition is owned by one of the storage members, as shown in the video below:

Each cluster member is aware of the partition assignment/ownership across the cluster, which makes all key-based operations constant cost, O(1) operations (network latency and Java GC notwithstanding)

As the video above shows, while Coherence APIs allow you to write your application as if you were using plain Java objects (POJOs), Coherence doesn’t actually store them as such. Instead, it converts them into a binary format, which is used for both data storage and network transfer.

Some data stores require you to munge your data model ahead of time into the format that the particular store requires. This inherently implies a translation layer from the application’s in-memory model to the store’s optimized model, and then into bytes to be sent to the server and/or disk.

Coherence allows you to skip that unnecessary translation layer, storing your objects your way. Java objects are serialized using a pluggable serializer, and native clients support both direct pass-through (if the storage format and the format used by the client match), or automatic translation between the formats (if they don’t).

Coherence natively supports an optimized Java serialization format, its own Portable Object Format (POF), which we will describe in much more detail in a follow up article, and JSON serialization. You can also implement support for other serialization formats.
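At its core, a pluggable serializer is just a pair of object-to-bytes functions. The sketch below round-trips a POJO through standard Java serialization purely to illustrate the binary storage model; POF and the JSON serializer are more compact and cross-language, and their APIs differ from this toy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {
    // A plain, serializable application object.
    record Person(String name, int age) implements Serializable {}

    // What a storage node effectively holds: bytes, not live objects.
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(o);
        }
        return bos.toByteArray();
    }

    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Person p = new Person("Ada", 36);
        byte[] stored = toBytes(p);              // what travels over the wire
        Person copy = (Person) fromBytes(stored);
        System.out.println(copy.equals(p));      // prints "true"
    }
}
```

Swapping the serializer changes only the byte layout; the application keeps working with its own object model on both sides of the wire.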

Simplify your Architecture

In this article we have discussed in some depth why Coherence is a great way to improve performance, scalability and availability of your applications, and have barely scratched the surface of the NamedMap API. However, building real systems requires a lot more than what we have covered so far. After all, real applications have functional requirements as well!

Fear not — there are many features in Coherence that will not only allow you to build complex distributed applications, but will allow you to do it in a way that is probably much easier and more enjoyable than what you may be used to.

In the remaining articles in this series we will cover the following topics:

Many applications need a data store, a compute platform, an event fabric, a caching layer, and a messaging layer. Each of these typically has to be scalable, performant, and highly available.

Rather than choosing a different technology for each of these components, and having to understand the peculiarities of how it scales, how to manage it and the limits of its scale, what if you could simply choose one product that does it all?

It may be time to give Coherence a try!
