Miere Liniel Teixeira
3 min readNov 9, 2016

Thanks for asking, Dennis.

In my opinion, it is just a matter of “the right tool for the right job” here: knowing the downsides of each technology and using it where it fits. I believe Hazelcast is a good fit for volatile data, such as a cache layer or a store for precomputed data, while Aerospike would be a good primary persistence layer, since it can handle huge datasets at lower hardware cost.

TL;DR

Aerospike is an outstanding NoSQL solution whose approach to storing data is quite similar to Cassandra’s column families, with a row identifier that makes it fast to retrieve data from huge tables. And, although it is widely used for data analysis because of its good performance on huge datasets (~100 TB), I do think it would be a nice cache alternative to Hazelcast, especially if you intend to use Hazelcast as an external standalone database.

But, to be fair to Hazelcast in this case, we need to pay attention to a small detail of the cache design we chose for clustering user sessions on Kikaha. Here, our JVM application has a Hazelcast instance embedded in it, acting as a cache layer. Thanks to its gossip protocol, once one or more Hazelcast nodes are able to contact our application, they can share sessions in a decentralized fashion, with no single point of failure or bottleneck.

Kikaha’s cluster architecture itself would look much like the picture below. It allows us to retrieve the stored information with almost no latency, especially if the information we need is already stored on the local JVM.
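To make the latency point concrete, here is a minimal sketch of a local-first session lookup in plain Java. It is not Kikaha’s or Hazelcast’s actual implementation: the class `LocalFirstSessions`, its `find` method, and the `remote` function (a stand-in for any network call to another node or an external store) are all hypothetical names chosen for illustration. It only shows why a session already held in the local JVM can be served without a network round trip.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical two-tier lookup: local JVM memory first, remote store second. */
class LocalFirstSessions {

    // Sessions already held in this JVM (the "embedded" side of the design).
    private final Map<String, String> local = new ConcurrentHashMap<>();
    // Assumption: stands in for a network call to another node or external store.
    private final Function<String, Optional<String>> remote;
    // Counts how often we had to leave the JVM.
    private final AtomicInteger remoteCalls = new AtomicInteger();

    LocalFirstSessions(Function<String, Optional<String>> remote) {
        this.remote = remote;
    }

    /** Returns the session, calling the remote store only on a local miss. */
    Optional<String> find(String sessionId) {
        String cached = local.get(sessionId);
        if (cached != null) {
            return Optional.of(cached); // served from local memory, no network hop
        }
        remoteCalls.incrementAndGet();
        Optional<String> fetched = remote.apply(sessionId);
        fetched.ifPresent(value -> local.put(sessionId, value)); // keep a local copy
        return fetched;
    }

    int remoteCalls() {
        return remoteCalls.get();
    }

    public static void main(String[] args) {
        Map<String, String> external = new HashMap<>();
        external.put("abc", "user=dennis");

        LocalFirstSessions sessions =
            new LocalFirstSessions(id -> Optional.ofNullable(external.get(id)));

        sessions.find("abc"); // local miss: goes to the remote stand-in
        sessions.find("abc"); // local hit: no remote call
        System.out.println("remote calls: " + sessions.remoteCalls());
    }
}
```

In a real embedded data grid, whether a given key is local depends on how the data is partitioned and replicated across nodes, so treat this only as an illustration of the best case described above.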

With Aerospike we would have to retrieve data from outside our JVM, which isn’t exactly an issue, since most Aerospike operations take only a couple of milliseconds to send a reply back.

On the other hand, as a primary persistence layer, I would probably choose Aerospike over Hazelcast. The first reason is that Hazelcast is more sensitive to network latency than Aerospike: in my tests on AWS, I found that standalone Hazelcast nodes would often become unreliable on networks that frequently stutter. This is a well-known issue and can be a big deal if you don’t follow the Hazelcast guidelines for running a cluster on AWS.

Something that struck me while I was testing Aerospike was that it could handle huge datasets on more modest hardware than Cassandra, Neo4j and Hazelcast itself. I would encourage you to take a look at the Jepsen test run against Aerospike. It has a lot of detail about Aerospike’s internals and how it behaves in a heavily stressed environment, and it also compares the synchronization algorithm chosen by the Aerospike team with those of Cassandra and Riak. I’ve learned a lot about Aerospike there.

Regards!

Miere Liniel Teixeira

Developer, Entrepreneur, Punk Rocker, Friend, Husband and Dad!