What’s New in Neo4j 3.4: Spatial Features

Neo4j 3.4.0 was recently released, and it has some big new features. While there are good roundups of what’s in this release overall, I wanted to focus on one of my favorites, the spatial features, and go a bit deeper with some technical examples.

In this article we’ll cover how to create points with Neo4j 3.4 and how to compute distances between points on the globe, and then go one step further. By themselves those are good features; combined with other things you could already do with Neo4j, they get even better.

First though, we’ll need some data with latitude and longitude so we can place some points on a map.

Getting Started With Data

We’ll use a CSV file of world cities from simplemaps, with one row per city. The load query below reads these columns: city, city_ascii, lat, lng, pop, country, iso2, iso3, and province.

And here’s the Cypher to load it. If you’re new to Neo4j, the Neo4j documentation explains how LOAD CSV works.

This loads some simple data on around 7,000 world cities, complete with latitude and longitude, into a simple graph of cities, countries, and provinces. The spatial part of this load process is simply the “location” property of the city. Note that we create a spatial point by passing a map with latitude and longitude to the “point” function, taking care that the values we pass in are floats and not strings.

// Load one row per city. LOAD CSV reads every field as a string.
LOAD CSV WITH HEADERS FROM 'https://simplemaps.com/static/data/world-cities/basic/simplemaps-worldcities-basic.csv' AS line
CREATE (c:City {
  name: coalesce(line.city, ''),
  name_ascii: coalesce(line.city_ascii, ''),
  // Build the spatial point; the coordinates must be floats, not strings.
  location: point({ latitude: toFloat(line.lat),
                    longitude: toFloat(line.lng) }),
  // Parse the population so the property is always numeric (-1 when missing).
  population: coalesce(toFloat(line.pop), -1.0)
})
// MERGE reuses an existing country or province node instead of duplicating it.
MERGE (country:Country {
  name: coalesce(line.country, ''),
  iso2: coalesce(line.iso2, ''),
  iso3: coalesce(line.iso3, '')
})
MERGE (province:Province {
  name: coalesce(line.province, '')
})
CREATE (c)-[:IN]->(province)
CREATE (c)-[:IN]->(country)
MERGE (province)-[:IN]->(country)
RETURN count(c);
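To make the string-to-float step concrete, here is a minimal Python sketch of what happens to the name and coordinate fields of each row before they become a City node. It is not Neo4j code: `to_float`, `coalesce`, and `city_properties` are hypothetical helpers that only mimic Cypher’s toFloat and coalesce, and the sample row is made up.

```python
def to_float(value):
    """Mimic Cypher's toFloat: None when the value is missing or not numeric."""
    if value is None:
        return None
    try:
        return float(value)
    except ValueError:
        return None


def coalesce(*values):
    """Mimic Cypher's coalesce: return the first value that is not None."""
    for value in values:
        if value is not None:
            return value
    return None


def city_properties(line):
    """Build name and location properties from one CSV row (a dict of strings)."""
    return {
        "name": coalesce(line.get("city"), ""),
        "name_ascii": coalesce(line.get("city_ascii"), ""),
        # Coordinates arrive as strings and must be converted to floats.
        "location": {
            "latitude": to_float(line.get("lat")),
            "longitude": to_float(line.get("lng")),
        },
    }


# Made-up sample row: every field is a string, and "city_ascii" is missing.
row = {"city": "Leeds", "lat": "53.8", "lng": "-1.55"}
props = city_properties(row)
print(props)
```

The missing `city_ascii` field falls back to an empty string, and the coordinates come out as numbers rather than text, which is what the point function needs.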

Here’s a picture of what a resulting snippet of the graph looks like, connecting cities, provinces, and countries. In this case, we’re just looking at West Yorkshire in the UK.

Cities, provinces, and countries

Point Data

Now we have a totally new datatype in our graph: the “location” property on City nodes, which holds a spatial point.

They look like this:

This contains about what you’d expect: an x and a y coordinate on the globe (x is the longitude and y is the latitude). The “srid” field is a spatial reference identifier, which says which coordinate reference system the coordinates belong to. If you haven’t worked with geospatial data before, there are many different “coordinate reference systems” (CRSs). The one most people are used to is called WGS 84 (SRID 4326), and that’s what you get by default when you pass latitude and longitude to the point function.

Coordinate Reference Systems

If your data uses a different CRS, the only difference in the load would be passing a crs attribute to the point function, such as in this example:

point({x: 2.3, y: 4.5, crs: 'cartesian'})

For supported CRSs and further examples, consult the documentation.
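As a rough illustration of how the keys you pass select the CRS, the Python sketch below mimics the rule from the Neo4j documentation for 2D points: latitude and longitude imply WGS 84, x and y imply cartesian, and an explicit crs is used as given. The function `infer_crs` is hypothetical, it ignores the 3D systems, and its lowercasing of the crs name is a choice made for this sketch. The SRIDs are the ones Neo4j reports for these two systems.

```python
# SRIDs for the two 2D coordinate reference systems.
SRID_BY_CRS = {"wgs-84": 4326, "cartesian": 7203}


def infer_crs(coords):
    """Return (crs_name, srid) for a map of point arguments (2D only)."""
    if "crs" in coords:
        # An explicit crs is used as given (normalized to lowercase here).
        crs = coords["crs"].lower()
    elif "latitude" in coords and "longitude" in coords:
        crs = "wgs-84"
    elif "x" in coords and "y" in coords:
        crs = "cartesian"
    else:
        raise ValueError("expected latitude/longitude or x/y")
    if crs not in SRID_BY_CRS:
        raise ValueError("CRS not covered by this sketch: " + crs)
    return crs, SRID_BY_CRS[crs]


print(infer_crs({"latitude": 51.5, "longitude": -0.12}))
print(infer_crs({"x": 2.3, "y": 4.5, "crs": "cartesian"}))
```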

Distance Between Points

Now that we have points on the globe, we can compute distance between them. Let’s use our city data to find out which cities in the UK are closest to London.

MATCH (c1:City { name: "London" })-[:IN]->(:Country { name: "United Kingdom" })<-[:IN]-(c2:City) 
WHERE c2.name <> "London"
WITH c1, distance(c1.location, c2.location) as dist, c2
RETURN c1.name, dist, c2.name ORDER BY dist ASC LIMIT 10;

In the WITH clause, we compute the distance by simply calling the distance function on two spatial point values. The results are in meters. They tell us that Luton is the closest city to London, about 47km away. These distances are computed “as the crow flies”. Google Maps has Luton about 54km from London, but you can’t drive straight there, and the latitude and longitude may be marked from slightly different points, so generally this result checks out.

Technical details about how distance is computed, what the result data type is under the various CRSs, and other important details can be found in the documentation.
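For intuition about what an “as the crow flies” figure means, here is a small Python sketch of the haversine formula, which the Neo4j documentation names as the basis for distances between 2D WGS 84 points. The Earth radius and the London and Luton coordinates are approximate values assumed for illustration. They are not taken from the dataset or from Neo4j itself, so the result will only roughly match the query output.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Approximate city-center coordinates (assumed, not from the CSV).
london = (51.5074, -0.1278)
luton = (51.8787, -0.4200)

distance_m = haversine_m(*london, *luton)
print(round(distance_m / 1000), "km")
```

With these assumed inputs the figure lands in the same range as the query result, which is the point of the check: a straight-line distance over a sphere, not a driving distance.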

Using Distances as Weights

Let’s now compute all-pairs distances within the UK by going through all the cities, computing their distances, and adding edges.

MATCH (c1:City)-[:IN]->(:Country { name: "United Kingdom" })<-[:IN]-(c2:City)
WHERE id(c1) < id(c2)
CREATE (c1)-[r:PATH { distance: distance(c1.location, c2.location) }]->(c2)
RETURN count(r);
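The `id(c1) < id(c2)` filter keeps each unordered pair of cities exactly once, so n cities produce n(n-1)/2 relationships instead of one per ordered pair. A short Python sketch of the same idea, using a handful of illustrative city names and made-up ids:

```python
from itertools import combinations

# Illustrative sample; the ids stand in for Neo4j's internal node ids.
cities = ["Leeds", "Bradford", "Wakefield", "Huddersfield"]
ids = {name: i for i, name in enumerate(cities)}

# Same filter as the query: keep (a, b) only when id(a) < id(b).
pairs = [(a, b) for a in cities for b in cities if ids[a] < ids[b]]

# Each unordered pair appears once: n * (n - 1) / 2 of them.
n = len(cities)
print(len(pairs), n * (n - 1) // 2)  # prints "6 6"
```

The count grows quadratically, so for a few hundred cities this is cheap, but it is worth keeping in mind before running the same pattern over a whole continent.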

This query creates something like a distance atlas in our “graph map” like the picture below.

This can now be used as a great jumping-off point for weighted paths through the graph, for example with the APOC graph algorithms.
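To show what a weighted path means here, the Python sketch below runs Dijkstra’s algorithm over a tiny hand-made graph whose edge weights stand in for the distance property on PATH relationships. It is not APOC, and the node names and distances are invented. In a fully connected distance graph the direct edge between two cities is already the shortest route, so the sketch leaves some edges out to give the search something to do.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra over {node: {neighbor: weight}}; returns (cost, path) or None."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None  # goal not reachable from start


# Invented distances in meters; there is no direct A-C edge.
graph = {
    "A": {"B": 10_000, "E": 12_000},
    "B": {"A": 10_000, "C": 15_000},
    "E": {"A": 12_000, "C": 20_000},
    "C": {"B": 15_000, "E": 20_000},
}

cost, path = shortest_path(graph, "A", "C")
print(cost, path)  # 25000.0 ['A', 'B', 'C']
```

The route through B (25 km) beats the route through E (32 km), which is the kind of answer a weighted shortest-path procedure gives when it reads the distance property as the cost.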

What you do with the new features is up to you! But let us know how you’re coming along: tweet to @neo4j and show us what you’re doing with it!