Neo4j — get off the ground in 30min or less!

As with any new technology, the first step into the unknown is uncertain and sometimes scary. Questions on where to start or how to learn can sometimes overwhelm the motivation to dive in. The toughest part of any new thing is starting. Today, I hope to ease the uncertainty and get you past the initial obstacles and pass along some resources for further development.

What is a graph database?

Graph technology has been around for a long time, but graph databases are a relatively new entry to the technology market. Like the relational model, the graph model is a way to store and retrieve data. The difference between the two is how the data is stored in the database. A relational database organizes data into tables and columns and forces data that falls outside the norm into that format, while a graph database accepts the data as it exists and creates a model that fits the data.

From this explanation, it may seem that a graph would be very unstructured, but that is not the case. A property graph model consists of entities (nodes) and connections between the entities (relationships). Both nodes and relationships can carry properties that hold additional data about them. The image below compares the two models.

Image credit: William Lyon (https://www.slideshare.net/lyonwj/natural-language-processing-with-graph-databases-and-neo4j)
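To make the property graph model a little more concrete, here is a minimal sketch in plain Python. It is only an illustration of nodes, relationships, and properties, not how Neo4j stores data internally. The class names, the second person (Emil), and the `since` property are hypothetical and chosen just for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Node:
    """An entity in the graph, for example a person."""
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed connection that points from one node to another."""
    start: Node
    type: str
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data: two people and one connection between them.
andreas = Node("Person", {"name": "Andreas"})
emil = Node("Person", {"name": "Emil"})
knows = Relationship(andreas, "KNOWS", emil, {"since": 2010})

# Prints: Andreas -[KNOWS]-> Emil
print(f"{knows.start.properties['name']} -[{knows.type}]-> {knows.end.properties['name']}")
```

Notice that the relationship holds direct references to its two nodes and has properties of its own, which is the part a join table cannot express without extra columns.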

Instead of creating an intermediary Person-Friend table, as the relational model does, and using it as the reference point between the Person and Friend tables, a graph stores each relational row as a separate entity and connects the entities directly. This eliminates the reference/lookup IDs that a relational structure relies upon. It also allows for exceptionally fast lookups of connected data, because the database is not traversing 1m+ rows in the Person table to find a particular row.

In the example above, the relational model would traverse the Person table until it finds Andreas’s row (assuming only one row exists for Andreas), then search the Person-Friend table for any connections to Andreas that reference his id, then look up those reference rows to find the details about each friend.

In contrast, a graph model simply finds Andreas’s node (which is stored in a particular place in memory, so that the query does not need to search the entire graph), then looks for all of the relationships with type `KNOWS` to find all of the nodes that know Andreas.
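The two lookups described above can be sketched side by side in plain Python. This is a toy comparison under my own assumptions, not Neo4j's storage engine or a real relational database (which would normally use indexes rather than full scans). The sample rows and the names `friends_relational`, `friends_graph`, and `PersonNode` are hypothetical.

```python
# Relational-style storage: person rows plus a join table of id pairs.
person_rows = [
    {"id": 1, "name": "Andreas"},
    {"id": 2, "name": "Emil"},
    {"id": 3, "name": "Peter"},
]
person_friend_rows = [
    {"person_id": 1, "friend_id": 2},
    {"person_id": 1, "friend_id": 3},
]


def friends_relational(name):
    """Find the person row, scan the join table, then look up each friend row."""
    person = next(row for row in person_rows if row["name"] == name)
    friend_ids = [r["friend_id"] for r in person_friend_rows
                  if r["person_id"] == person["id"]]
    return [row["name"] for row in person_rows if row["id"] in friend_ids]


# Graph-style storage: each node keeps direct references to its relationships.
class PersonNode:
    def __init__(self, name):
        self.name = name
        self.relationships = []  # list of (type, other_node) pairs

    def connect(self, rel_type, other):
        # Record the relationship on both nodes so it can be followed either way.
        self.relationships.append((rel_type, other))
        other.relationships.append((rel_type, self))


def friends_graph(node):
    """Follow the node's own KNOWS relationships; no table scan is involved."""
    return [other.name for rel_type, other in node.relationships
            if rel_type == "KNOWS"]


andreas = PersonNode("Andreas")
emil = PersonNode("Emil")
peter = PersonNode("Peter")
andreas.connect("KNOWS", emil)
andreas.connect("KNOWS", peter)

print(friends_relational("Andreas"))  # ['Emil', 'Peter']
print(friends_graph(andreas))         # ['Emil', 'Peter']
```

Both functions return the same answer. The difference is that the graph version only touches the relationships attached to Andreas's node, so its cost depends on how many connections he has rather than on the size of the whole data set.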

As you can imagine, the more direct graph approach results in much faster traversals, so results get back to the user sooner. While the relational model is perfectly suited to structured, tabular data, the graph model’s strength lies in highly connected data.

These last few paragraphs are a very high-level overview of the structure. For additional information and depth, I have included resources at the bottom of this post that more thoroughly explain graph databases and their models.

How to download Neo4j

Neo4j’s main database product follows the property graph model and provides ways for users to store and retrieve their data in a graph database. There are both open source and commercially supported versions. Neo4j for desktop includes a free license to Neo4j enterprise edition, which provides development and POC capabilities that are normally part of a commercial license. Developers can download and install Neo4j for desktop to learn how to use it and create smaller-scale or proof-of-concept projects without any fees or strings attached.

Neo4j also provides a basic way to interact with the graph database without any downloads via the Neo4j sandboxes. Several use cases and tutorials have been created in a web-based interface for developers to interact with Neo4j without adding any software to devices.

For this post, we will walk through downloading and using the Neo4j for desktop application. This sets developers up to create applications and interact with a Neo4j database or run a small cluster of local Neo4j servers for testing.

To install Neo4j for desktop, follow the link and click `Download`.

This will take you to a page that downloads the software and shows the steps for installing Neo4j on your device (Windows, Mac, or Linux instructions, as applicable). Step 1 on the page covers the installation itself. Step 2 walks you through how to open and use the application, which we will also cover here.

Start and navigate Neo4j for desktop

When first loading the application, there is a brief info form to create an account for yourself in the application. This allows different users to access and work with their own projects without interference from others. You can sign in with an existing Google, GitHub, LinkedIn, or Twitter account, or you can create a separate account for Neo4j.

Once signed in, the main screen for the application is presented (screenshot below). Along the left-hand side of the window are some high-level icons that show the list of projects for the user (folder icon), allow users to adjust application settings (gear icon), show the user profile (person icon), and give basic info for Neo4j as a company (Neo4j logo icon).

Under the project folder icon is a list of all projects for that user. If you are just starting with Neo4j, the list contains only one default project, called `My Project`. You can create a new project with `+ New` at the top, or work in the default project, which we will do here.

In the right pane, you can create, modify, run, and delete databases, as well as install plugins to use. Plugins allow developers to unlock additional capabilities, depending on the use case they are trying to solve.

`APOC` (Awesome Procedures on Cypher) is a group of procedures and functions that give developers extra functionality that may not be included in the product package. This blog post introduces APOC in more detail, and more resources are linked at the end of this post.

The `Graph Algorithms` plugin includes a set of algorithms to analyze a data set for patterns and data science purposes. More information can be found on the Neo4j developer pages.

`GraphQL` allows users to interact with a GraphQL endpoint to specify and return the exact data needed. For more information on the GraphQL plugin, there is an excellent blog post by William Lyon to get started with it.

Creating and working with a database

To create a database, you can click the `New Graph` section under the `My Project` header and choose `Create a Local Graph`.

Create a graph database
Create a local graph

Then, you can fill out a name and password for your database (yours can be more creative than mine below), choose a version, and click the `Create` button. Note that it may take a few minutes to download that version of Neo4j and create the database.

Graph database details
Graph created!

To start the database, click `Start` on your new database. To view details and interact with the database, click `Manage`. This opens a new view with a top panel and a bottom panel.

The top panel shows some basic functions you can execute, such as `Start`, `Stop`, and `Restart`. It also has two additional buttons: one opens the directory where the application holds all its files (`Open Folder`), and the other opens a Neo4j Browser window to interact with any data in the database (`Open Browser`).

The bottom panel has several more options. In the `Details` section, it shows the version of Neo4j running and the status of the database. It also shows the ports that you can use to interact with the database via HTTP, HTTPS, and Bolt protocols.
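If you want a quick check from outside the application that those ports are reachable, a small script like the one below can help. It is a minimal sketch using only the Python standard library. The port numbers are the usual Neo4j defaults (7474 for HTTP, 7473 for HTTPS, 7687 for Bolt) and are an assumption here; use whatever values the `Details` section shows for your database. The function names are my own.

```python
import socket

# Assumed Neo4j default ports; the Details section shows the real values.
DEFAULT_PORTS = {"http": 7474, "https": 7473, "bolt": 7687}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host="localhost", ports=DEFAULT_PORTS):
    """Map each protocol name to whether its port accepts TCP connections."""
    return {name: port_is_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_open in check_ports().items():
        print(f"{name}: {'open' if is_open else 'closed'}")
```

An open port only tells you that something is listening there. It does not confirm that the listener is Neo4j or that your credentials work.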

The `Logs` tab shows the streaming log output from the database. The `Terminal` tab allows you to interact with the database via the command line (for example, to retrieve data from an exposed endpoint).

`Settings` allows you to adjust the configuration for the database, as needed. You can also search this tab using `Ctrl+F`/`Cmd+F`. The `Plugins` tab shows the available/installed plugins on that particular database.

The `Upgrade` tab contains a list of Neo4j database versions, giving you the opportunity to adjust the version or upgrade from here. The final tab for `Administration` lets users update the password for the database.

Interacting with the database

There are a couple of different ways to interact with your newly created Neo4j database. We have already mentioned opening the Neo4j Browser through the `Open Browser` button within Neo4j for desktop. You can also open a new window in your preferred web browser and enter http://localhost:7474 in the address bar. To connect, enter the password you set for your database and click `Connect`.

From here, you can run tutorials (for more info, see our browser guide tutorial). You can also insert your own data and run queries from the browser to interact with the data.

Another way to interact with the database is by creating an application in your preferred programming language, then interacting via the application. If any endpoints are exposed in the application, you can also interact using the Neo4j Browser. To learn more, check out our language guides for your particular language.
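As a rough illustration of talking to the database from your own code, the sketch below builds a request for Neo4j's transactional HTTP endpoint using only the Python standard library. Several things here are assumptions to check against your setup: the base URL and port, the default `neo4j` user name, and especially the endpoint path, which varies by version (`/db/data/transaction/commit` in 3.x, `/db/<database name>/tx/commit` in 4.x and later). The function names, the sample query, and the `secret` password are hypothetical. For real applications, the official language drivers are the better route.

```python
import base64
import json
import urllib.request


def build_cypher_request(query, parameters=None, *, password,
                         user="neo4j", base_url="http://localhost:7474",
                         tx_path="/db/neo4j/tx/commit"):
    """Build (but do not send) a POST request that runs one Cypher statement.

    tx_path is version dependent: Neo4j 3.x used "/db/data/transaction/commit".
    """
    payload = {"statements": [{"statement": query,
                               "parameters": parameters or {}}]}
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        base_url + tx_path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def run_cypher(query, parameters=None, **options):
    """Send the request to a running database and return the decoded JSON reply."""
    request = build_cypher_request(query, parameters, **options)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical query: names of everyone connected to Andreas by KNOWS.
FRIENDS_QUERY = (
    "MATCH (p:Person {name: $name})-[:KNOWS]-(friend) "
    "RETURN friend.name AS name"
)

# "secret" is a placeholder; use the password you set for your database.
request = build_cypher_request(FRIENDS_QUERY, {"name": "Andreas"}, password="secret")
print(request.get_method(), request.full_url)
```

With the database started, calling `run_cypher(FRIENDS_QUERY, {"name": "Andreas"}, password="...")` should return a JSON document with `results` and `errors` keys. Passing the name as a parameter rather than pasting it into the query string keeps the query reusable and avoids quoting problems.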

Recap

In this post, we have laid a foundation for getting started with Neo4j and provided a few additional resources and ideas for deeper learning. The Neo4j for desktop application allows you to easily start an instance of Neo4j and interact with data loaded into the graph database in a user-friendly manner.

From here, there are endless possibilities to load your own data, create simple and useful applications to interact with the database, or apply your own proof-of-concept and test out a solution to a valuable business problem. Whatever your need, I hope that this post has helped and given you the stepping stone to doing incredible things with Neo4j. Happy learning!

Resources