Introduction to Graph Databases

Kelsey Whitehead
Jan 30, 2017


This is the first post in a series on graph databases. Later posts will use Neo4j as an example to demonstrate how a graph database works and how it is used. This post gives a general overview of graph databases, including their structure and the potential benefits of using one.

What is a graph database?

We’ll start with a very basic introduction to graphs. A graph is an abstract data type typically used to model data that consists of objects (referred to as nodes or vertices) and some sort of relationship between those objects (lines or edges).

Graph

A graph database uses its nodes to contain data. Typically a node represents an entity and is comparable to a row in a relational database. Each node also holds the properties of that entity, which can be anything the business needs to keep track of (ID numbers, location, etc.). These properties correspond to the columns in a relational database. Graph databases also contain edges between nodes. These are used to describe the relationships between the objects in the database, and in most cases they can be compared to a foreign key column of a row in an RDBMS.
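As a rough illustration of the structure described above, here is a minimal in-memory sketch in Python. The class names (`Node`, `Edge`), the `WORKS_IN` relationship label, and the sample data are all hypothetical, chosen only for this example. A real graph database stores and indexes its data very differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity, comparable to a row in a relational table."""
    label: str                                      # kind of entity, e.g. "Employee"
    properties: dict = field(default_factory=dict)  # comparable to column values


@dataclass
class Edge:
    """A named, directed relationship between two nodes."""
    source: Node
    target: Node
    rel_type: str                                   # hypothetical label, e.g. "WORKS_IN"


# Hypothetical sample data: one employee, one department, one relationship.
sarah = Node("Employee", {"id": 1, "name": "Sarah"})
sales = Node("Department", {"id": 10, "name": "Sales"})
works_in = Edge(sarah, sales, "WORKS_IN")

print(f"{works_in.source.properties['name']} -[{works_in.rel_type}]-> "
      f"{works_in.target.properties['name']}")
```

The relationship here is a first-class object that points at both nodes, rather than an ID value that has to be matched against another table.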

Graph database structure showing nodes and edges

Why use a graph database?

The ability to state the relationship between two objects directly gives a graph database a strong advantage over relational database systems for certain datasets, namely those that contain a large number of relationships within the data.

Graph databases may also offer increased efficiency for searches in some cases. The best example of this is when a business needs to perform a search that is more than one level deep. As an example, let’s consider a database storing corporate structure information.

What sort of query might we need to perform on this data? Suppose we wanted to find all employees who work in a department with a specific employee, Sarah (that is, all of her direct co-workers). We would first need to find which department(s) she currently works in by searching the ‘Department_Employee’ table, then search that table again for all other rows matching the retrieved Department_id values to get the Employee_id numbers of her co-workers, and finally search the ‘Employee’ table to retrieve their names. That’s a lot of searching! Using a graph database, the process becomes somewhat simpler.
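To make the relational version concrete, here is a small sketch using Python's built-in `sqlite3` module. The table names and the `Employee_id` and `Department_id` columns follow the description above. The `Name` column and all of the sample rows are assumptions made up for this example.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Employee_id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Department_Employee (Department_id INTEGER, Employee_id INTEGER);
    INSERT INTO Employee VALUES (1, 'Sarah'), (2, 'Tom'), (3, 'Priya'), (4, 'Lee');
    INSERT INTO Department_Employee VALUES (10, 1), (10, 2), (20, 1), (20, 3), (30, 4);
""")

# Sarah's row -> her departments -> other members of those departments -> names.
# Note that Department_Employee is joined twice, once in each direction.
coworkers_sql = """
    SELECT DISTINCT other.Name
    FROM Employee AS sarah
    JOIN Department_Employee AS mine   ON mine.Employee_id = sarah.Employee_id
    JOIN Department_Employee AS theirs ON theirs.Department_id = mine.Department_id
    JOIN Employee AS other             ON other.Employee_id = theirs.Employee_id
    WHERE sarah.Name = 'Sarah'
      AND other.Employee_id <> sarah.Employee_id
    ORDER BY other.Name
"""
coworkers = [row[0] for row in conn.execute(coworkers_sql)]
print(coworkers)
```

Each `JOIN` corresponds to one of the table searches described above, which is why a query that goes more than one level deep tends to grow quickly.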

In this system, we would first search for Sarah and see which departments she belongs to. We would then simply follow the links into each department backwards to discover the other members. For searches such as this, a graph database can be considerably more efficient, because each step follows a stored link instead of searching a table again.
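The traversal described above can be sketched in plain Python. This is only an illustration of the idea of following links, not how any particular graph database implements it. The `works_in` edge list, the `outgoing` and `incoming` lookups, and the `find_coworkers` function are hypothetical names with made-up sample data.

```python
from collections import defaultdict

# Hypothetical edges: (employee, department) means "employee works in department".
works_in = [
    ("Sarah", "Sales"),
    ("Tom", "Sales"),
    ("Sarah", "Marketing"),
    ("Priya", "Marketing"),
    ("Lee", "IT"),
]

# Store each edge in both directions so it can be followed either way.
outgoing = defaultdict(list)  # employee -> departments
incoming = defaultdict(list)  # department -> employees
for employee, department in works_in:
    outgoing[employee].append(department)
    incoming[department].append(employee)


def find_coworkers(name):
    """Follow links out to each department, then backwards to its members."""
    found = set()
    for department in outgoing[name]:
        for member in incoming[department]:
            if member != name:
                found.add(member)
    return sorted(found)


print(find_coworkers("Sarah"))
```

Starting from Sarah, the code only ever touches the departments and people directly linked to her, rather than scanning every row of a table.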

Graph databases may not be the optimal choice for every business, but they have their advantages in certain situations. If you’d like to find out more, make sure to check back for the rest of the series, where I will demonstrate the use of an actual graph database, Neo4j. We’ll look at creating a fresh database, converting an existing relational database into a graph, and using Cypher to write queries.
