Boost the performance and scalability of your Spring Boot application with Hazelcast on Kubernetes

Sofia Gourgari
Julius Baer Engineering
5 min read · Jan 29, 2024

Would you like to improve the performance and scalability of your Spring Boot application? Would you like to access and process your data in memory across multiple nodes? If your answer to any of these questions is ‘yes’, then you should consider using Hazelcast with Spring Boot.

Hazelcast is a distributed in-memory data grid platform for Java. Its architecture supports high scalability and data distribution in a clustered environment: it can handle millions of operations per second with sub-millisecond latency and scale up or down dynamically without downtime or data loss.

Hazelcast can be deployed using two different topologies:

· embedded cache topology and

· client-server topology.

In this article, we will focus on the embedded one, where each Spring Boot application instance will also run a Hazelcast member inside its JVM. This way, the application can access the Hazelcast data structures directly without any network overhead. The embedded Hazelcast members will form a cluster automatically and share the data among themselves.
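To make the embedded topology concrete, here is a minimal plain-Java sketch of what each application instance effectively does when it starts an embedded member. The instance name and map name anticipate the configuration shown later; it assumes Hazelcast 4.x or later on the classpath.

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

public class EmbeddedMemberSketch {

    public static void main(String[] args) {
        // Starting an instance inside the JVM turns this process into a cluster member
        Config config = new Config();
        config.setInstanceName("spring-boot-hazelcast");
        HazelcastInstance member = Hazelcast.newHazelcastInstance(config);

        // The distributed map is accessed in-process; its entries are shared
        // with every other member that joins the cluster
        IMap<Long, String> users = member.getMap("users");
        users.put(1L, "example");
        System.out.println("Cluster size: " + member.getCluster().getMembers().size());
    }
}

In the Spring Boot application below you never call Hazelcast.newHazelcastInstance yourself; the framework starts the embedded member for you based on the configuration bean.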

To start using Hazelcast, add the following dependency to your pom.xml file (specify a version if it is not managed by your parent POM):

<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast-all</artifactId>
</dependency>

Next, you need to create a Hazelcast configuration bean in your application. This bean defines the properties and settings of your Hazelcast instance, such as the data structures and serialisation options. Spring Boot detects this Config bean and uses it to start the embedded Hazelcast instance automatically.

@Configuration
public class HazelcastConfig {

    @Bean
    public Config hazelcast() {
        Config config = new Config();
        config.setInstanceName("spring-boot-hazelcast");

        // Disable the default multicast discovery and use Kubernetes discovery instead
        config.getNetworkConfig().getJoin().getMulticastConfig()
                .setEnabled(false);
        config.getNetworkConfig().getJoin().getKubernetesConfig()
                .setEnabled(true);

        // Per-map configuration for the "users" map
        MapConfig usersConfig = new MapConfig("users");
        usersConfig.setTimeToLiveSeconds(300);
        usersConfig.setStatisticsEnabled(true);
        usersConfig.setEvictionConfig(new EvictionConfig()
                .setSize(5000)
                .setEvictionPolicy(EvictionPolicy.LRU)
                .setMaxSizePolicy(MaxSizePolicy.PER_NODE));
        usersConfig.setNearCacheConfig(null); // no near cache for this map
        config.addMapConfig(usersConfig);

        // Register a custom serialiser for the User entity
        config.getSerializationConfig()
                .addSerializerConfig(
                        new SerializerConfig()
                                .setTypeClass(User.class)
                                .setImplementation(new UserSerializer()));
        return config;
    }
}

This configuration bean defines a Hazelcast instance named “spring-boot-hazelcast” and a map named “users” with a time-to-live of 300 seconds. It also registers a custom serialiser for the User class, which is the entity class used by the Spring Data JPA repository.

With this configuration, the map named “users” will hold at most 5000 entries per node; when the limit is reached, the least recently used (LRU) entries are evicted. The near cache configuration is null, which means no near cache is enabled for this map. A near cache is a local cache that stores a copy of the data on the client side for faster access. Every map or other data structure in the cluster can be configured separately in the same way.
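The UserSerializer used above is a custom class. A minimal sketch of what it could look like, assuming a User entity with a Long id and a String name (and Hazelcast 4.1 or later for readString/writeString), is an implementation of Hazelcast’s StreamSerializer:

import java.io.IOException;

import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.StreamSerializer;

public class UserSerializer implements StreamSerializer<User> {

    @Override
    public int getTypeId() {
        // Any positive id that is unique among the custom serialisers in the cluster
        return 1000;
    }

    @Override
    public void write(ObjectDataOutput out, User user) throws IOException {
        // Assumes the User entity exposes getId() and getName()
        out.writeLong(user.getId());
        out.writeString(user.getName());
    }

    @Override
    public User read(ObjectDataInput in) throws IOException {
        // Assumes a no-args constructor plus setId() and setName()
        User user = new User();
        user.setId(in.readLong());
        user.setName(in.readString());
        return user;
    }
}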

Hazelcast provides several mechanisms for discovering the members. If we don’t configure any discovery mechanism, the default one is used, in which Hazelcast tries to find other members in the same network using multicast. The above configuration will disable the multicast discovery and enable the Kubernetes discovery. The Kubernetes discovery mechanism is provided by the hazelcast-kubernetes module, which is included in the hazelcast-all dependency. This module will use the Kubernetes API to discover the other Hazelcast members running in the same namespace. To use this module, you need to grant some permissions to your application pod, such as the ability to list and watch pods and services. You can do this by creating a service account, a role and a role binding for your application.

For example, you can create these resources using the following YAML file:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: hazelcast-service-account
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hazelcast-role
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints", "pods", "services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: hazelcast-role-binding
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: hazelcast-role
subjects:
  - kind: ServiceAccount
    name: hazelcast-service-account
    namespace: default

Running the command kubectl apply -f <filename.yaml> will create the service account, role and role binding in the default namespace.

Next, you need to create a service for your Hazelcast cluster that exposes the Hazelcast port (5701) and allows the members to discover and communicate with each other. For example, using the following YAML file:

apiVersion: v1
kind: Service
metadata:
  name: hazelcast-service
  namespace: default
spec:
  type: ClusterIP
  selector:
    app: hazelcast
  ports:
    - name: hazelcast
      port: 5701

Running the command kubectl apply -f <filename.yaml> will create the service in the default namespace. Now you have successfully enabled the Kubernetes discovery mechanism for your Hazelcast cluster. You can find more details about this mechanism in the official documentation.
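For the role binding and the service above to take effect, the application pods must run under the service account and carry the label that the service selector expects. As a sketch (the image name and replica count are placeholders), the relevant part of the application Deployment could look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-boot-hazelcast
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hazelcast
  template:
    metadata:
      labels:
        app: hazelcast                                # matches the selector of hazelcast-service
    spec:
      serviceAccountName: hazelcast-service-account   # grants the RBAC permissions defined above
      containers:
        - name: app
          image: registry.example.com/spring-boot-hazelcast:latest   # placeholder image
          ports:
            - containerPort: 5701   # Hazelcast member port
            - containerPort: 8080   # Spring Boot HTTP port

If you want to scope discovery explicitly to this service, the Kubernetes join configuration also accepts a service-name property, for example config.getNetworkConfig().getJoin().getKubernetesConfig().setProperty("service-name", "hazelcast-service").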

Let’s assume that your Spring Boot application contains a UserController class that exposes some operations on the User repository and uses the Hazelcast instance to cache the results. You can use the @Cacheable annotation to mark the methods whose results should be cached by Hazelcast (caching must be enabled, for example with @EnableCaching, so that Spring’s cache abstraction is backed by the Hazelcast instance).

@RestController
@RequestMapping("/users")
public class UserController {

    @Autowired
    private UserRepository userRepository;

    @Autowired
    private HazelcastInstance hazelcastInstance;

    @GetMapping("/{id}")
    @Cacheable(value = "users", key = "#id")
    public User getUserById(@PathVariable Long id) {
        return userRepository.findById(id).orElseThrow(() ->
                new UserNotFoundException(id));
    }

    @GetMapping
    @Cacheable(value = "users", key = "'all'")
    public List<User> getAllUsers() {
        return userRepository.findAll();
    }

    @PostMapping
    public User createUser(@RequestBody User user) {
        User savedUser = userRepository.save(user);
        hazelcastInstance.getMap("users").put(savedUser.getId(), savedUser);
        return savedUser;
    }
}

The getUserById and getAllUsers methods will be cached by Hazelcast using the @Cacheable annotation. The createUser method will update the Hazelcast map accordingly. You can also use the @CacheEvict and @CachePut annotations to control the cache behaviour.
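As a sketch of that last point, createUser could use these annotations instead of the manual map put, so that the new user is cached under its id and the cached ‘all’ list is evicted and rebuilt on the next read (this is an alternative, not part of the controller above):

@PostMapping
@CachePut(value = "users", key = "#result.id")
@CacheEvict(value = "users", key = "'all'")
public User createUser(@RequestBody User user) {
    // The saved entity is both returned to the caller and written to the cache;
    // evicting the 'all' entry forces getAllUsers to query the database once more
    return userRepository.save(user);
}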

That’s it. The application now has a distributed cache. When we scale the application, every new instance starts a new Hazelcast member, which joins the cluster automatically.

To demonstrate the benefits of the distributed cache, here is a more concrete example. Let’s assume we run two instances of the application, A and B. If we call the createUser endpoint on instance A to add a new user and then getAllUsers, the resulting data is stored in the distributed “users” map, which is shared by every member of the cluster. As a result, if getAllUsers is then called on instance B, the up-to-date list of users is served from the cache without querying the database again, because Hazelcast handles the synchronisation and consistency of the data across the cluster.

Using Hazelcast in embedded mode has two main advantages:

· the cluster is easy to set up because no separate cache infrastructure needs to be deployed, and

· data access is very fast because reads do not have to travel over the network to a separate cache cluster.

However, if a system required one hundred instances of the application, we would end up with one hundred cache members, which we don’t necessarily need and which would consume a lot of memory; replication and synchronisation would also become expensive. In general, the embedded cache topology is a good fit when we need high-performance computation on the cached data, while the client-server topology is the better choice when the application has to scale to far more instances than the cache cluster itself requires.
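For comparison, this is roughly what the client-server topology looks like from the application side. It is only a sketch and assumes a separate Hazelcast cluster is already running in the same namespace, reachable through a service named hazelcast-service (a placeholder) and using the default cluster name:

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class ClientTopologySketch {

    public static void main(String[] args) {
        // The application is only a client: it holds no data and runs no member
        ClientConfig clientConfig = new ClientConfig();
        clientConfig.setClusterName("dev"); // must match the server cluster name
        clientConfig.getNetworkConfig()
                .addAddress("hazelcast-service:5701"); // placeholder service address

        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);

        // The API is the same as in embedded mode, but every call crosses the network
        client.getMap("users").size();
    }
}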

In conclusion, Hazelcast is a powerful and flexible platform that provides fast and distributed in-memory data access and storage for your applications. It also supports various discovery mechanisms, configuration options, and best practices for running on Kubernetes. By using Hazelcast on Kubernetes, you can benefit from the scalability, availability and resilience of both platforms.
