Digging for reliability? Meet Neo4j

The Ksquare Group
4 min read · Apr 25, 2019

Authors: Jaqueline Caamal & Mike Uc

Relational databases (RDBMS) have been the traditional choice for storing data since the ’80s. A very long time, right? They work well when you have predictable data that fits neatly into tables and rows, BUT in the real world they can run into scalability issues as data volume increases. And that’s a big deal. We know it’s hard to replace your existing database infrastructure with another one, and you are only going to do it if the improvement is huge.

Well, graph databases hit the mark! That’s because, in contrast to the relational model, they use nodes, edges, and properties as their primary elements. This type of database is designed for highly interconnected data, where any item can have an undetermined number of relationships with others, so performance stays consistent even as your volume of data increases.
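To make the node, edge, and property vocabulary concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the data model: the class names, the `FRIEND` relationship type, and the sample users are all hypothetical, and it says nothing about how Neo4j actually stores data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A vertex with a label and arbitrary key/value properties."""
    node_id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, typed relationship that can carry its own properties."""
    source: int
    target: int
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy property graph (illustrative only, not Neo4j's storage engine)."""

    def __init__(self):
        self.nodes = {}     # node_id -> Node
        self.outgoing = {}  # node_id -> list of Edge objects leaving that node

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)
        self.outgoing.setdefault(node_id, [])

    def add_edge(self, source, target, rel_type, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.outgoing[source].append(Edge(source, target, rel_type, properties))

    def related(self, node_id, rel_type):
        """Follow edges of one type straight from a node, without scanning a join table."""
        return [
            self.nodes[edge.target]
            for edge in self.outgoing[node_id]
            if edge.rel_type == rel_type
        ]


# Hypothetical sample data: three users and two friendships.
graph = PropertyGraph()
graph.add_node(1, "User", name="Ana")
graph.add_node(2, "User", name="Luis")
graph.add_node(3, "User", name="Eva")
graph.add_edge(1, 2, "FRIEND", since=2015)
graph.add_edge(1, 3, "FRIEND", since=2018)

friend_names = [node.properties["name"] for node in graph.related(1, "FRIEND")]
print(friend_names)  # ['Luis', 'Eva']
```

The point of the sketch is that a relationship is a first-class record attached to its start node, so finding Ana's friends means reading her own edge list rather than joining tables.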

The real world is fully interconnected, and graph databases are highly useful in these kinds of applications: social networks, public transport, road maps, and network topologies, among others.

In this article, we’re going to talk about a specific graph database: Neo4j.

Give a good node (or several billion) to your connection

Neo4j was initially released in 2007. It is a robust, scalable, high-performance graph-oriented database that lets you manage a graph of billions of nodes and relationships on a single server. Amazing! But… what about the reliability of the data? Well, Neo4j supports ACID (Atomicity, Consistency, Isolation, and Durability) transactions, which protect the integrity of the data.

According to the DB-Engines ranking, which measures popularity, it is by far the most popular of all graph-oriented database management systems.

Today, Neo4j is used by international companies such as Adobe, eBay, Walmart, Microsoft, and Airbnb. Need more?

Now, how does it behave on social media?

For our evaluation we used real data from Pokec, the most popular online social network in Slovakia.

The purpose of this test is to examine the performance of Neo4j in terms of execution time with four different workloads:

  • Massive Insertion Workload (MIW).
  • Query Workload:
    — FindNeighbor: Finds the neighbors of all nodes.
    — FindAdjacent: Finds the adjacent nodes of all edges.
    — FindShortestPath: Finds the shortest path between the first node and 100 randomly picked nodes.
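As a rough illustration of what the three query workloads compute, here is a Python sketch over a tiny in-memory graph. Everything in it is an assumption made for the example: the five-node graph, the function names, and the use of breadth-first search for the shortest path. It is not the benchmark code used for the evaluation, and it does not talk to Neo4j.

```python
import random
from collections import deque

# Hypothetical toy friendship graph: node id -> list of neighbor ids (undirected).
adjacency = {
    1: [2, 3],
    2: [1, 4],
    3: [1, 4],
    4: [2, 3, 5],
    5: [4],
}
# Each undirected edge listed once as a (low id, high id) pair.
edges = sorted({tuple(sorted((a, b))) for a, nbrs in adjacency.items() for b in nbrs})


def find_neighbors(adj):
    """FindNeighbor: visit every node and collect its neighbors."""
    return {node: list(nbrs) for node, nbrs in adj.items()}


def find_adjacent(edge_list):
    """FindAdjacent: visit every edge and return its two end nodes."""
    return [(a, b) for a, b in edge_list]


def find_shortest_path(adj, start, goal):
    """FindShortestPath: breadth-first search on an unweighted graph."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None  # goal is not reachable from start


first = min(adjacency)
rng = random.Random(42)
# The article picks 100 random nodes; this toy graph only has room for 3.
targets = rng.sample(sorted(adjacency), k=3)
paths = {target: find_shortest_path(adjacency, first, target) for target in targets}

print(find_neighbors(adjacency)[4])         # [2, 3, 5]
print(len(find_adjacent(edges)))            # 5
print(find_shortest_path(adjacency, 1, 5))  # [1, 2, 4, 5]
```

Note how FindAdjacent only reads the two ends of each edge, while FindNeighbor and FindShortestPath have to walk from node to node.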

To show how Neo4j can work better than an RDBMS, we took a few samples of the Pokec dataset: first we inserted 1k nodes, then 5k, and finally 10k.

For massive insertion, we observe that Neo4j handles the data more efficiently as the sample size increases: the difference in insertion time between samples is minimal, even though the number of inserted nodes grows considerably. So, if we think about it, the more data you insert, the lower the cost per node!
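One way to make "more efficient" precise is to look at the time per inserted node: if total time rises more slowly than the node count, the per-node cost falls. The Python sketch below shows the shape of such a measurement for the 1k, 5k, and 10k sample sizes. It is only a stand-in: it times insertion into a plain dictionary, the function names are hypothetical, and its numbers say nothing about Neo4j itself.

```python
import time


def insert_sample(node_count):
    """Insert node_count nodes, chained by edges, into a dict-based stand-in graph."""
    graph = {}
    for node_id in range(node_count):
        graph[node_id] = []
        if node_id > 0:
            graph[node_id - 1].append(node_id)  # link the previous node to this one
    return graph


def time_insertions(sample_sizes):
    """Measure total and per-node insertion time for each sample size."""
    results = []
    for size in sample_sizes:
        start = time.perf_counter()
        graph = insert_sample(size)
        elapsed = time.perf_counter() - start
        results.append(
            {
                "nodes": len(graph),
                "seconds": elapsed,
                "seconds_per_node": elapsed / size,
            }
        )
    return results


# Same sample sizes as in the article: 1k, 5k, and 10k nodes.
timings = time_insertions([1_000, 5_000, 10_000])
for row in timings:
    print(
        f"{row['nodes']:>6} nodes: {row['seconds']:.6f} s total, "
        f"{row['seconds_per_node']:.2e} s per node"
    )
```

Against a real database, `insert_sample` would be replaced by the actual bulk-insert call, and each size would be run several times to smooth out noise.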

Massive Insertion (MIW)

For the second workload, we measured the execution time of three types of queries: FindNeighbor (FN), FindAdjacent (FA), and FindShortestPath (FS).

Query Workload

As we can see, FA has the fastest times because the query only has to look up the adjacent nodes of each edge, making direct jumps, and across the samples its difference in time is minimal, in contrast to FN and FS. That is basically because FN and FS have to go through other nodes, which makes the search slower. Now, if we compare FN and FS, we can conclude that FS performs the search faster than FN. This is because only 100 random nodes are being taken, so if the number of random nodes were increased, we would expect the difference in time to be greater.

In terms of efficiency and, above all, speed, migrating to an improved graph database platform like Neo4j can make a significant difference.

Run your database at the speed of a tortoise, or the Flash? Which one would you prefer?

The Ksquare Group

We are technology innovators. Creating, designing, and building digital solutions.