Hazelcast In-Memory Caching
In this tutorial, we will learn what Hazelcast is and when to use it. Hazelcast IMDG is an open-source in-memory data grid written in Java. It is a clustered data grid, backed by physical or virtual machines. This tutorial covers the Hazelcast basics; in the next tutorial, we will implement a Spring Boot + Hazelcast application.
What is Hazelcast?
Let us take the example of a simple banking application that performs CRUD operations on user bank accounts. Initially, we develop this application using a monolithic approach. The banking application is deployed on a single server and runs in a single JVM, and the account details are saved in a database. Whenever a user requests data, the application running in the JVM retrieves it from the database and returns it.
The major drawback of this design is that every operation, whether it is a Create, Read, Update, or Delete, results in a call to the database. Each of these is a network call, which makes the application much slower.
We can make the application faster by using a cache. A cache is a simple in-memory data structure, like a map, where the key is the account number and the value is the account object. The account details for a user are then stored both in this map and in the database.
This is the new architecture of the application: we now have a cache alongside the database. When the data in the database is too large to fit in memory, the cache holds only the most relevant data, as decided by a cache eviction algorithm such as least recently used (LRU). Whenever the user needs the account details for a particular account number, we first look in the cache and return the entry if it is there, instead of calling the database. This saves a network call to the database and improves the performance of our application.
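To make this lookup flow concrete, here is a minimal Java sketch of a cache sitting in front of a database. It is only an illustration under a few assumptions: the names `Account`, `Database`, `LruCache`, and `AccountService` are made up for this example, the "database" is just an in-memory map that counts its reads, and least-recently-used eviction is one possible cache algorithm, not necessarily the one your application would choose.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: all class names here are hypothetical, and the
// "database" is an in-memory map standing in for a real one.
class AccountCacheDemo {

    static class Account {
        final String accountNumber;
        final String owner;
        final long balance;

        Account(String accountNumber, String owner, long balance) {
            this.accountNumber = accountNumber;
            this.owner = owner;
            this.balance = balance;
        }
    }

    // Stand-in for the real database; counts reads so cache hits are visible.
    static class Database {
        private final Map<String, Account> rows = new HashMap<>();
        private int reads = 0;

        void save(Account account) {
            rows.put(account.accountNumber, account);
        }

        Account load(String accountNumber) {
            reads++; // in a real system this would be the network call
            return rows.get(accountNumber);
        }

        int readCount() {
            return reads;
        }
    }

    // Bounded map that drops the least recently used entry once it is full.
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // accessOrder = true keeps LRU ordering
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    static class AccountService {
        private final Database database;
        private final Map<String, Account> cache;

        AccountService(Database database, int cacheCapacity) {
            this.database = database;
            this.cache = new LruCache<>(cacheCapacity);
        }

        // Writes go to both the database and the cache.
        void save(Account account) {
            database.save(account);
            cache.put(account.accountNumber, account);
        }

        // Reads try the cache first and fall back to the database on a miss.
        Account find(String accountNumber) {
            Account cached = cache.get(accountNumber);
            if (cached != null) {
                return cached;
            }
            Account loaded = database.load(accountNumber);
            if (loaded != null) {
                cache.put(accountNumber, loaded);
            }
            return loaded;
        }
    }

    public static void main(String[] args) {
        Database database = new Database();
        AccountService service = new AccountService(database, 2);
        service.save(new Account("ACC-1", "Alice", 100));
        service.save(new Account("ACC-2", "Bob", 200));

        service.find("ACC-1"); // served from the cache, no database read
        System.out.println("database reads: " + database.readCount());
    }
}
```

With a capacity of 2, saving a third account evicts the least recently used entry, so a later lookup for that account falls back to the database.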
There are disadvantages to the above approach.
- Suppose the application load increases sharply as more and more users perform operations such as adding new accounts. The cache we created lives in JVM memory, so its size is limited by the heap available to that single JVM.
- To solve this problem (and many other problems of a monolithic application), we move to a distributed microservices architecture in which we start multiple instances of our banking application for better load balancing.
Here we have several application instances, each running on its own server. All of these instances read from and write to the same database. Since each instance runs on a different server, each has its own JVM, and each JVM creates its own separate cache.
Suppose a request reaches banking application1 to store the bank details for a particular account number. Application1 stores the data in cache1, its own cache, as well as in the database. But how will banking application2 and application3 know about this cache update? They are not aware of this transaction. Suppose another user then calls banking application3 to retrieve the newly inserted bank details. The entry is not in cache3, so application3 has to fetch it from the database, which means a network call is still required for the new details.
Also, suppose banking application1 has updated or deleted existing data. Application2 and application3 do not see this change, so cache2 and cache3 still contain stale or invalid data. This can be a major issue for the application.
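The stale-data problem is easy to reproduce in a few lines of Java. The sketch below is a simplified model, not real application code: `BankingApp` is a hypothetical class, both "instances" live in one JVM, balances are plain `Long` values, and a shared `HashMap` plays the role of the database.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: BankingApp is a hypothetical class and the
// shared HashMap stands in for the real database.
class StaleCacheDemo {

    // One application instance with its own private, per-JVM cache.
    static class BankingApp {
        private final Map<String, Long> localCache = new HashMap<>();
        private final Map<String, Long> database; // shared by all instances

        BankingApp(Map<String, Long> database) {
            this.database = database;
        }

        void updateBalance(String accountNumber, long balance) {
            database.put(accountNumber, balance);
            localCache.put(accountNumber, balance); // only this instance's cache changes
        }

        Long getBalance(String accountNumber) {
            Long cached = localCache.get(accountNumber);
            if (cached != null) {
                return cached;
            }
            Long loaded = database.get(accountNumber);
            if (loaded != null) {
                localCache.put(accountNumber, loaded);
            }
            return loaded;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> database = new HashMap<>();
        BankingApp app1 = new BankingApp(database);
        BankingApp app2 = new BankingApp(database);

        app1.updateBalance("ACC-1", 100L);
        // app2 misses its own cache, loads the value from the database, and caches it.
        System.out.println("app2 first read:  " + app2.getBalance("ACC-1"));

        app1.updateBalance("ACC-1", 250L);
        // The database has the new value, but app2 still answers from its old cache entry.
        System.out.println("database value:   " + database.get("ACC-1"));
        System.out.println("app2 second read: " + app2.getBalance("ACC-1"));
    }
}
```

Nothing in this design tells the second instance that its cached entry is out of date, which is exactly the gap a distributed cache is meant to close.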
To resolve these problems, we can use a distributed in-memory cache such as Hazelcast.
Hazelcast is an in-memory distributed cache. Suppose banking application1 updates the cache. This change is then also visible to banking application2 and banking application3. So the cached data stays consistent across all the application instances using Hazelcast.
The goal of an In-Memory Data Grid (IMDG) is to provide extremely high availability of data by keeping it in memory and in a highly distributed fashion. The advantages of Hazelcast are:
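The effect of sharing one cache can be sketched with plain Java. Note the assumptions: this is a single-JVM stand-in, `BankingApp` is a hypothetical class, and a `ConcurrentHashMap` plays the role of the shared cache. In a real deployment the instances would run in separate JVMs and the shared map would be a Hazelcast distributed map (for example an `IMap` obtained from `HazelcastInstance.getMap(...)`), which we will set up in the next tutorial.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: a ConcurrentHashMap stands in for the distributed
// map, and BankingApp is a hypothetical class.
class SharedCacheDemo {

    static class BankingApp {
        private final Map<String, Long> sharedCache; // same map for every instance
        private final Map<String, Long> database;

        BankingApp(Map<String, Long> sharedCache, Map<String, Long> database) {
            this.sharedCache = sharedCache;
            this.database = database;
        }

        void updateBalance(String accountNumber, long balance) {
            database.put(accountNumber, balance);
            sharedCache.put(accountNumber, balance); // visible to all instances
        }

        void deleteAccount(String accountNumber) {
            database.remove(accountNumber);
            sharedCache.remove(accountNumber);
        }

        Long getBalance(String accountNumber) {
            Long cached = sharedCache.get(accountNumber);
            if (cached != null) {
                return cached;
            }
            Long loaded = database.get(accountNumber);
            if (loaded != null) {
                sharedCache.put(accountNumber, loaded);
            }
            return loaded;
        }
    }

    public static void main(String[] args) {
        Map<String, Long> database = new HashMap<>();
        Map<String, Long> sharedCache = new ConcurrentHashMap<>();
        BankingApp app1 = new BankingApp(sharedCache, database);
        BankingApp app3 = new BankingApp(sharedCache, database);

        app1.updateBalance("ACC-1", 100L);
        System.out.println("app3 reads: " + app3.getBalance("ACC-1"));

        app1.updateBalance("ACC-1", 250L);
        // app3 sees the update because both instances use the same cache.
        System.out.println("app3 reads: " + app3.getBalance("ACC-1"));
    }
}
```

Because every instance reads and writes the same map, an update or delete made through one instance is what the others see on their next lookup.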
- Clustering: Hazelcast has a clustered set of nodes that work in unison.
- Distributed: The data in Hazelcast is distributed across all the nodes.
- Fault-tolerant: Hazelcast maintains replicated (backup) copies of the data on other nodes.
Suppose JVM1 goes down. All the data that was present in JVM1 is still available on other nodes, because Hazelcast maintains replicated copies there. So the failure of a single node does not cause data loss when you use Hazelcast with backups enabled.
- Application scaling: Hazelcast can be scaled horizontally and is elastic in nature. New nodes can be added to the cluster, and the data is automatically redistributed across all the nodes.
Suppose we add one more instance, banking application4. All the data that was previously spread over 3 nodes is automatically redistributed across the 4 nodes.
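The ideas of distribution, backup copies, and rebalancing can be modelled with a small Java sketch. It rests on simplifying assumptions: keys are hashed into 271 partitions (Hazelcast's default partition count), but the round-robin placement of primaries and backups below is invented for illustration and is not Hazelcast's actual assignment or migration algorithm. The class and method names are hypothetical as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: the placement rules are a simplification and do
// not reproduce Hazelcast's real partition assignment.
class PartitionDemo {

    static final int PARTITION_COUNT = 271; // Hazelcast's default partition count

    static int partitionOf(String key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Assumed placement: the primary copy goes round-robin over the nodes...
    static String primaryOwner(int partition, List<String> nodes) {
        return nodes.get(partition % nodes.size());
    }

    // ...and the single backup copy goes to the next node in the list.
    static String backupOwner(int partition, List<String> nodes) {
        return nodes.get((partition + 1) % nodes.size());
    }

    // Counts how many of the given keys each node holds as primary owner.
    static Map<String, Integer> primaryCounts(List<String> keys, List<String> nodes) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String node : nodes) {
            counts.put(node, 0);
        }
        for (String key : keys) {
            counts.merge(primaryOwner(partitionOf(key), nodes), 1, Integer::sum);
        }
        return counts;
    }

    // True if every key keeps at least one copy on a node other than the failed one.
    static boolean survivesFailure(List<String> keys, List<String> nodes, String failedNode) {
        for (String key : keys) {
            int partition = partitionOf(key);
            boolean primaryLost = primaryOwner(partition, nodes).equals(failedNode);
            boolean backupLost = backupOwner(partition, nodes).equals(failedNode);
            if (primaryLost && backupLost) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add("ACC-" + i);
        }

        List<String> threeNodes = List.of("node1", "node2", "node3");
        System.out.println("3 nodes: " + primaryCounts(keys, threeNodes));
        System.out.println("survives loss of node1: "
                + survivesFailure(keys, threeNodes, "node1"));

        // Adding a fourth node spreads the same keys over four owners.
        List<String> fourNodes = List.of("node1", "node2", "node3", "node4");
        System.out.println("4 nodes: " + primaryCounts(keys, fourNodes));
    }
}
```

In this model the primary and the backup of a partition never land on the same node as long as the cluster has at least two nodes, which is why losing one node leaves a copy of every entry available.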
By now, you should have a good understanding of what Hazelcast is and when to use it. In the next tutorial, we will implement a Spring Boot + Hazelcast application.