Learn geography using Neo4j

Jimmy Crequer
Nov 30, 2019 · 6 min read

In a previous story, I implemented a small app to learn Japanese characters using Neo4j. Lately, I have spent time trying to memorize all the countries in the world (I guess I have too much free time…), and I figured I could use a graph to help me on this journey too.

In this post, I will build a small graph of European countries and write a short CLI application that interacts with the graph to help me learn those countries.

The graph

Build the graph

Each country in the source JSON file is described by an entry like this one:

{
  "name": "France",
  "population": 67348000,
  "area": 643427,
  "capital": "Paris",
  "neighbors": ["Andorra", "Belgium", ..., "Switzerland"]
}

We will create the following entities:

  • Country nodes, with a name, a population and an area
  • City nodes, with a name
  • Relationships between Country and City nodes to represent the capitals
  • Relationships between two Country nodes when they share a border
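To make the mapping concrete, here is a minimal JavaScript sketch of how one JSON entry translates into those entities. The function name and the object shapes (label, type, from, to) are my own illustration, not a Neo4j API, and the neighbor list is shortened for the example.

```javascript
// Sketch: derive the graph entities for one JSON entry.
// The object shapes below are illustrative only, not a Neo4j API.
function entitiesFromRecord(record) {
  const country = {
    label: 'Country',
    name: record.name,
    population: record.population,
    area: record.area,
  };
  const city = { label: 'City', name: record.capital };
  const relationships = [
    // One relationship from the capital city to its country...
    { type: 'IS_CAPITAL_OF', from: record.capital, to: record.name },
    // ...and one per bordering country.
    ...record.neighbors.map((neighbor) => ({
      type: 'IS_NEIGHBOR_OF',
      from: record.name,
      to: neighbor,
    })),
  ];
  return { country, city, relationships };
}

// Neighbor list shortened for the example.
const france = {
  name: 'France',
  population: 67348000,
  area: 643427,
  capital: 'Paris',
  neighbors: ['Andorra', 'Belgium', 'Switzerland'],
};
const entities = entitiesFromRecord(france);
console.log(entities.relationships.length); // 1 capital + 3 neighbors
```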

Neo4j’s APOC library provides a very convenient way to import JSON files. We only need a few lines of Cypher to build our graph:

WITH "https://gist.githubusercontent.com/jimmycrequer/7aa867900d0cf0b9588d4354f09cb286/raw/countries.json" AS url
CALL apoc.load.json(url) YIELD value AS v
// MERGE, because the country may already exist as another country's neighbor
MERGE (c:Country {name: v.name})
SET c.population = v.population, c.area = v.area
CREATE (capital:City {name: v.capital})
CREATE (c)<-[:IS_CAPITAL_OF]-(capital)
FOREACH (n IN v.neighbors |
  MERGE (neighbor:Country {name: n})
  MERGE (c)-[:IS_NEIGHBOR_OF]-(neighbor)
)
Figure: Our graph

Explore the graph

Let’s start by ranking the countries by area.

MATCH (c:Country)
RETURN c.name AS country, apoc.number.format(c.area) AS area
ORDER BY c.area DESC

Note: “apoc.number.format()” returns a String, so to get the correct sorting we need to “ORDER BY” the numerical value (c.area) rather than the formatted one.
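The same pitfall is easy to reproduce outside of Cypher. Here is a small JavaScript sketch of it; France's area comes from the dataset excerpt, while the two other countries and their areas are made up for the illustration.

```javascript
// Areas in km². France's value comes from the dataset excerpt;
// the two other entries are made-up values for illustration only.
const countries = [
  { name: 'France', area: 643427 },
  { name: 'Smallland', area: 9500 },
  { name: 'Bigland', area: 1200000 },
];

// Same idea as apoc.number.format(): a human-readable String.
const format = (n) => n.toLocaleString('en-US');

// Sorting on the formatted String compares characters, not magnitudes.
const byString = [...countries]
  .sort((a, b) => (format(a.area) < format(b.area) ? 1 : -1))
  .map((c) => c.name);

// Sorting on the raw number gives the intended order.
const byNumber = [...countries]
  .sort((a, b) => b.area - a.area)
  .map((c) => c.name);

console.log(byString); // [ 'Smallland', 'France', 'Bigland' ]
console.log(byNumber); // [ 'Bigland', 'France', 'Smallland' ]
```

The formatted value "9,500" ends up first in descending order simply because the character "9" sorts after "6" and "1".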


To be honest, I would have thought that Ukraine was bigger than France. Moreover, it seems my data counts Greenland as part of Denmark, which explains why Denmark appears in the top 3.


We can also calculate the population density of each country.

MATCH (c:Country)
RETURN c.name AS name,
       apoc.number.format(c.area) AS area,
       apoc.number.format(c.population) AS population,
       // toFloat() avoids integer division, which would truncate the result
       toFloat(c.population) / c.area AS density
ORDER BY density ASC

It is interesting to see the Scandinavian countries, and especially the Baltic states, among the least densely populated countries, even though the Baltic states are relatively small.

Let’s now have a look at the relationships between countries.

MATCH (c:Country)-[:IS_NEIGHBOR_OF]-(c2:Country)
WITH c, collect(c2.name) AS neighbors
RETURN c.name, neighbors
ORDER BY size(neighbors) DESC

No big surprise here. Germany and France share borders with many small countries (Belgium, Switzerland, Luxembourg) and are located at the center of Europe. Note that this dataset doesn’t include Asian countries, so Russia and other countries like Kazakhstan actually have more bordering countries than shown here.

You can also render a “map” of Europe using nothing but the neighbor relationships and the force-directed layout of the graph visualization.

MATCH (c1:Country)-[nb:IS_NEIGHBOR_OF]-(c2:Country)
RETURN c1,nb,c2
Figure: “Map” of Europe

Lastly, we can use Neo4j’s “shortestPath()” function to find out how many countries need to be crossed to travel between two given countries. Here is an example with France and Greece.

MATCH (france:Country {name: "France"}),
      (greece:Country {name: "Greece"}),
      p = shortestPath((france)-[*]-(greece))
RETURN p
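To give an intuition of what a shortest path on unweighted relationships means, here is a small breadth-first search written in JavaScript. This is only a sketch of the general technique, not how Neo4j implements “shortestPath()”, and the border list is a hand-picked subset whose country names may differ from the ones in the dataset.

```javascript
// Undirected borders for a hand-picked subset of countries (not the full dataset).
const borders = [
  ['France', 'Italy'],
  ['France', 'Germany'],
  ['Italy', 'Slovenia'],
  ['Slovenia', 'Croatia'],
  ['Croatia', 'Serbia'],
  ['Serbia', 'North Macedonia'],
  ['North Macedonia', 'Greece'],
];

// Build a Map from each country to the list of its neighbors.
function buildAdjacency(edges) {
  const adjacency = new Map();
  for (const [a, b] of edges) {
    if (!adjacency.has(a)) adjacency.set(a, []);
    if (!adjacency.has(b)) adjacency.set(b, []);
    adjacency.get(a).push(b);
    adjacency.get(b).push(a);
  }
  return adjacency;
}

// Breadth-first search: the first time we reach `goal`, the path found
// uses the fewest possible borders.
function bfsShortestPath(adjacency, start, goal) {
  const previous = new Map([[start, null]]);
  const queue = [start];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === goal) {
      // Walk back from the goal to the start to rebuild the path.
      const path = [];
      for (let node = goal; node !== null; node = previous.get(node)) {
        path.unshift(node);
      }
      return path;
    }
    for (const next of adjacency.get(current) || []) {
      if (!previous.has(next)) {
        previous.set(next, current);
        queue.push(next);
      }
    }
  }
  return null; // no route between the two countries
}

const path = bfsShortestPath(buildAdjacency(borders), 'France', 'Greece');
console.log(path.join(' -> '));
```

In this subset, the path goes through five intermediate countries; the real graph may offer other routes of the same length.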

Now that the graph is ready, let’s create a small CLI app to play with it!

Build the app

Main function

First, I connect to the Neo4j instance and create a new session. Then I create the main loop of the application, which sends the user to the game they choose to play.
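A main loop of this kind could look like the sketch below. It is not the original code: the menu keys, the shape of the games object and the ask helper are assumptions of mine, and the Neo4j wiring is only shown in comments, with a placeholder URI and credentials, so that the loop itself can run without a database.

```javascript
// Hypothetical menu: keys and labels are placeholders.
function buildMenu(games) {
  return Object.entries(games)
    .map(([key, game]) => `${key}) ${game.label}`)
    .concat('q) Quit')
    .join('\n');
}

// Main loop: keep asking until the user quits, and dispatch to the chosen game.
// `session` is expected to be a neo4j-driver session, and `ask` a function that
// prints a prompt and resolves with the user's input.
async function mainLoop(session, ask, games, log = console.log) {
  while (true) {
    const choice = (await ask(`${buildMenu(games)}\n> `)).trim();
    if (choice === 'q') return;
    const game = games[choice];
    if (game) {
      await game.play(session, ask, log);
    } else {
      log('Unknown choice, please try again.');
    }
  }
}

// Possible wiring (assumes the neo4j-driver package is installed;
// URI and credentials are placeholders):
//   const neo4j = require('neo4j-driver');
//   const readline = require('readline');
//   const driver = neo4j.driver('bolt://localhost:7687',
//                               neo4j.auth.basic('neo4j', 'password'));
//   const session = driver.session();
//   const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
//   const ask = (question) => new Promise((resolve) => rl.question(question, resolve));
//   mainLoop(session, ask, games).finally(async () => {
//     rl.close();
//     await session.close();
//     await driver.close();
//   });
```

Passing the session and the prompt function as parameters keeps the loop independent from the terminal and the database, which makes it easy to try with fake inputs.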

Let’s dive into other functions.

GuessCountryFromCapital function

The code is pretty straightforward. I use the following Cypher query to return a pair of Country and City at random.

MATCH p = (:Country)<-[:IS_CAPITAL_OF]-(:City)
RETURN apoc.coll.randomItem(collect(p)) AS p

Then the user is prompted with the question. Finally, we display a message telling them whether their answer was correct, using simple text coloring.

console.log('\x1b[32m%s\x1b[0m', 'Correct!')

This line will print “Correct!” in green.

console.log('\x1b[33m%s\x1b[0m', `Wrong! The answer is ${countryName}.`)

This line will print the message in yellow.
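Putting these pieces together, the function could look like the following sketch. It is a reconstruction under assumptions, not the original code: the helper names, the wording of the question and the case-insensitive comparison are mine, and I assume the path returned by the driver starts at the Country node and ends at the City node, following the order of the pattern in the query.

```javascript
const CAPITAL_QUERY = `
  MATCH p = (:Country)<-[:IS_CAPITAL_OF]-(:City)
  RETURN apoc.coll.randomItem(collect(p)) AS p`;

// ANSI escape codes: 32 = green, 33 = yellow, 0 = reset.
const green = (text) => `\x1b[32m${text}\x1b[0m`;
const yellow = (text) => `\x1b[33m${text}\x1b[0m`;

// Assumption: answers are compared ignoring case and surrounding spaces.
function isCorrect(answer, expected) {
  return answer.trim().toLowerCase() === expected.toLowerCase();
}

function feedback(answer, countryName) {
  return isCorrect(answer, countryName)
    ? green('Correct!')
    : yellow(`Wrong! The answer is ${countryName}.`);
}

// `session` is a neo4j-driver session; `ask` resolves with the user's input.
async function guessCountryFromCapital(session, ask, log = console.log) {
  const result = await session.run(CAPITAL_QUERY);
  const path = result.records[0].get('p');
  // Assumed: the path starts at the Country node and ends at the City node.
  const countryName = path.start.properties.name;
  const capitalName = path.end.properties.name;
  const answer = await ask(`Which country has ${capitalName} as its capital? `);
  log(feedback(answer, countryName));
}
```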

GuessCountryFromNeighbors function

This time, the Cypher query fetches all the countries with their neighbors, and I randomly pick one entry from the result’s “records” property in JavaScript.
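Here is a sketch of that approach. The query, the helper names and the wording of the question are hypothetical; only the idea of picking a random entry from the records on the JavaScript side comes from the description above.

```javascript
// Hypothetical query: one row per country with the list of its neighbors.
// Countries without any neighbor (islands) do not appear in the result.
const NEIGHBORS_QUERY = `
  MATCH (c:Country)-[:IS_NEIGHBOR_OF]-(n:Country)
  RETURN c.name AS country, collect(n.name) AS neighbors`;

// Pick one element at random; `random` can be replaced to get a predictable pick.
function pickRandom(items, random = Math.random) {
  return items[Math.floor(random() * items.length)];
}

// `session` is a neo4j-driver session; `ask` resolves with the user's input.
async function guessCountryFromNeighbors(session, ask, log = console.log) {
  const result = await session.run(NEIGHBORS_QUERY);
  const record = pickRandom(result.records);
  const countryName = record.get('country');
  const neighbors = record.get('neighbors');
  const answer = await ask(`Which country borders ${neighbors.join(', ')}? `);
  const correct = answer.trim().toLowerCase() === countryName.toLowerCase();
  log(correct
    ? '\x1b[32mCorrect!\x1b[0m'
    : `\x1b[33mWrong! The answer is ${countryName}.\x1b[0m`);
}
```

Picking the record in JavaScript rather than in Cypher means the whole list is transferred on each question, which is fine for a graph of this size.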

And that’s it. Run the app and you can start playing!

Sample execution

Image for post
Image for post


I firmly believe graph databases are useful for learning, because we remember new knowledge more easily when we form associations with what we already know. The very efficient method of loci, which consists of remembering an ordered list of things by visualizing them in familiar locations, demonstrates this. You can associate each item of your list with:

  • A room of your house
  • A shop in your favorite street
  • A street in your childhood city
  • A station on your commuter train line

Any place that is familiar to you can help you remember something. It’s all about connecting things together!

To remember where Albania is, I could learn that its coordinates are 41.1533° N, 20.1683° E, but it is far easier and more efficient to learn that it is the westernmost country on Greece’s northern border. Of course, to make it work I need to know where Greece is, but once we have a solid base of knowledge, it is really easy to connect new things to it!

While this example is still simple, Neo4j’s nature makes it really easy to add more nodes from additional data sources to densify the graph and extend its learning potential. In a future post, I will try to add some data sources and provide new questions like:

  • Seas: “Which countries border the Mediterranean Sea?”, “Which European countries do not border any sea?”, …
  • Mountains: “Which countries are the Alps in?”, …
  • Rivers: “Which rivers flow through France?”, …

Happy learning!

Neo4j Developer Blog

Developer Content around Graph Databases, Neo4j, Cypher…
