Apache Ignite with persistence!

Umesh S
Apr 30, 2020

For the last couple of years, I have been working with an amazing product from Apache called Ignite, an in-memory computing platform that also provides distributed caching. The beauty of this product compared to other options, in my experience, is that it is fairly simple to use and configure for basic needs. Of course, if you want more advanced usage, you really need to dig into it! But that is the case for any product.

In this article, I would like to focus on two of Ignite's modes of distributed caching: pure in-memory and native persistence. Let's start with a brief introduction.

Definition

Ignite™ is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads delivering in-memory speeds at petabyte scale.

Looking at this official definition, Ignite is meant for really high-speed processing, since operations are carried out in memory. Ignite can be used as a distributed cache (or distributed database), or as a distributed compute engine, where it divides a given task among the member nodes and, after receiving their answers, recombines them to produce the final result. This mechanism is generally known as MapReduce.

The distributed caching feature of Ignite comes in broadly three flavors:

  1. Pure in-memory cache
  2. Native persistence
  3. Third-party persistence

This article focuses on options 1 and 2, so let's dive deeper into them with examples. In both cases, we will observe the behavior in the following scenarios:

Testing scenarios

a. Distributed cache: nodes can access each other's data.

b. One node goes down: the other can still fetch the data created by the first node.

c. Both nodes are restarted: there is no data loss (applicable ONLY to native persistence).

Pure in-memory cache

As the name implies, this is a scenario in which the data is stored purely in memory. It is therefore a volatile option and should be used only if you don't mind losing data when the servers shut down. We will take a simple Spring Boot application that performs create and read (CR) operations for an employee system using Ignite, and run two instances of the same application on different ports (in this case 8080 and 8090). These two applications together form an Ignite cluster. To know more about the Ignite cluster, please read here.
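To make the setup a bit more concrete, here is a minimal sketch of what an ignite-config file for the pure in-memory mode could look like. It is an assumption for illustration and not the actual file from the sample project: the bean id and the local address range are hypothetical choices for running two instances on one machine.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- Bean id is a hypothetical name chosen for this sketch -->
    <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- No dataStorageConfiguration here: persistence is disabled by default,
             so every entry lives only in RAM. -->

        <!-- Static discovery so that two instances on the same machine find
             each other. The address range is an assumption for a local test. -->
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                        <property name="addresses">
                            <list>
                                <value>127.0.0.1:47500..47509</value>
                            </list>
                        </property>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>
```

Both application instances would load the same file. Since there is no data storage section, nothing is written to disk.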

We start by creating an employee named Stephen using the POST endpoint:

a. Distributed cache

Now that it is created, let's try accessing it from both applications:

Voila! Both applications can access the data.

b. One node goes down

Now let's stop one instance of the application, say the one running on port 8090:

Let's see what happens to the GET request on the second instance:

As expected, the other instance can still fetch the data created by the first one. That wraps up our discussion of the in-memory mode.
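Whether the surviving node can still serve the entry depends on how the cache is configured, which is not shown here. Ignite caches are PARTITIONED with zero backups by default, so in pure in-memory mode the entries owned by a stopped node would be gone. A minimal sketch of a cache definition that keeps one extra copy of every entry on another node could look like this (the cache name `employeeCache` is a hypothetical name, and the fragment is assumed to sit inside the `IgniteConfiguration` bean):

```xml
<property name="cacheConfiguration">
    <list>
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
            <!-- Hypothetical cache name used only for this sketch -->
            <property name="name" value="employeeCache"/>
            <!-- Data is split into partitions spread across the nodes -->
            <property name="cacheMode" value="PARTITIONED"/>
            <!-- Keep one backup copy of each partition on a different node -->
            <property name="backups" value="1"/>
        </bean>
    </list>
</property>
```

Setting `cacheMode` to `REPLICATED` instead would keep a full copy of the cache on every node, which also lets a two-node cluster survive the loss of one node.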

Cache empowered with native persistence

This is a cache backed by Ignite's own persistence mechanism. In this case the data is stored on disk, and the most frequently used data is kept in memory for faster access. Since the data is on disk, it is safe from loss after a server restart. Here are more details about it. The main difference in the ignite-config file is the addition of the following section:

<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <!-- Turn on native persistence for the default data region -->
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="persistenceEnabled" value="true"/>
            </bean>
        </property>

        <!-- Write-ahead log mode and the directory where data files are kept -->
        <property name="walMode" value="LOG_ONLY"/>
        <property name="storagePath" value="${storagePath}"/>
    </bean>
</property>
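How much of the data stays in memory is governed by the size of the data region. As a hedged sketch, the same section could cap the default region's RAM usage like this. The 512 MB figure and the bean id are assumptions for illustration and are not taken from the sample project:

```xml
<!-- Bean id is a hypothetical name chosen for this sketch -->
<bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <property name="persistenceEnabled" value="true"/>
                    <!-- Assumed cap: keep at most 512 MB of this region in RAM;
                         the full data set still lives on disk -->
                    <property name="maxSize" value="#{512L * 1024 * 1024}"/>
                </bean>
            </property>

            <property name="walMode" value="LOG_ONLY"/>
            <property name="storagePath" value="${storagePath}"/>
        </bean>
    </property>
</bean>
```

One thing to keep in mind: in Ignite 2.x, a cluster with native persistence starts in an inactive state on its first launch and has to be activated once (for example with `control.sh --activate`) before the caches can be used.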

Let's see it in action. We start by creating an employee named Stephen using the POST endpoint:

a. Distributed cache

Let's access it from both applications:

As expected, both applications can access the data, just as in the in-memory scenario.

b. One node goes down

Let's stop one instance of the application, say the one running on port 8090:

Fire a GET request from the second instance:

Yes, it can still fetch the data created by the first one.

c. Both nodes are restarted

Now comes the differentiating factor: restarting both nodes. So, let's stop both nodes:

Now both servers are down, so neither endpoint is accessible anymore. Let's start them again and try to access good old employee Stephen.

First node:

Second node:

Voila! As you can see, the employee management system recovered from the server crash and was able to fetch the employee created before it died!

That wraps up our discussion of Ignite's native persistence mode. The source code for this example is available here.

Summary

To conclude, we saw that the in-memory mode is a good fit if you can afford to lose the data on server restarts. If you cannot, use Ignite with the native persistence option. That's all for this article, folks. Hope you liked it!
