Announcing the Neo4j Crime Investigation Sandbox

Joe Depeau
Neo4j Developer Blog
Dec 12, 2018

Personally, I am always most excited by technology when it can have a positive impact on society and the lives of ordinary people. Curing deadly diseases. Advancing our understanding of our world and our universe. Helping people communicate regardless of distance or language barriers. Preventing and solving crimes.

See examples of the analysis that can be done using the POLE Neo4j Sandbox dataset in this video.

That’s why I’m so excited to announce the availability of a new Neo4j sandbox which we’ve created to show how the power of Neo4j’s graph platform can be used in intelligence-led crime investigations and social services case management.

You can find the sandbox after logging in at: https://neo4j.com/sandbox

The sandbox comes pre-loaded with sample data and a step-by-step guide with queries and explanations — everything you need to get going!

The data for the crime investigation sandbox is organised based on the POLE data model, commonly used in policing and other security-related use cases. POLE stands for Persons, Objects, Locations, and Events. The diagram below shows an overview of the POLE model:

Using the sandbox demo data, our representation of the POLE data model as a graph looks like this:

The node labels that represent the four elements of the POLE model are:

  • Persons: Person nodes and Officer nodes
  • Objects: Object nodes, Vehicle nodes, Email nodes, and Phone nodes
  • Locations: Location nodes (also grouped together into PostCode and Area nodes)
  • Events: Crime nodes and PhoneCall nodes
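As a rough illustration of this grouping, the label-to-element mapping can be written down as a small lookup table. The Python below is only a sketch: the names `POLE_LABELS` and `pole_element` are made up for this example and are not part of the sandbox or of Neo4j itself.

```python
# Node labels grouped by POLE element, as listed in this post.
# POLE_LABELS and pole_element are hypothetical names used for illustration.
POLE_LABELS = {
    "Persons": {"Person", "Officer"},
    "Objects": {"Object", "Vehicle", "Email", "Phone"},
    "Locations": {"Location", "PostCode", "Area"},
    "Events": {"Crime", "PhoneCall"},
}


def pole_element(label):
    """Return the POLE element a node label belongs to, or None if unknown."""
    for element, labels in POLE_LABELS.items():
        if label in labels:
            return element
    return None


if __name__ == "__main__":
    for label in ("Officer", "Vehicle", "PostCode", "PhoneCall"):
        print(label, "->", pole_element(label))
```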

While this demo implementation of the POLE data model is simpler than a real-world graph might be, it still provides enough complexity and variety to allow us to simulate a number of different scenarios, including:

  • crime investigations,
  • finding vulnerable people in the graph,
  • using geospatial search to find crimes within a certain distance from a location,
  • and even looking for important persons and potential criminal gangs using two different graph algorithms.
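To make the geospatial scenario a little more concrete, here is a minimal sketch of the underlying idea: compute the great-circle distance from a centre point to each crime location and keep the ones inside a radius. It is plain Python rather than the Cypher used in the sandbox guide, and the function names, the `lat`/`lon` keys, and the sample coordinates are all assumptions made up for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def crimes_within(crimes, centre, radius_m):
    """Return the crimes whose location lies within radius_m of centre."""
    lat0, lon0 = centre
    return [
        c for c in crimes
        if haversine_m(lat0, lon0, c["lat"], c["lon"]) <= radius_m
    ]


if __name__ == "__main__":
    # Invented sample records, not taken from the sandbox data.
    sample = [
        {"id": "c1", "lat": 53.481, "lon": -2.241},
        {"id": "c2", "lat": 53.500, "lon": -2.240},
    ]
    nearby = crimes_within(sample, centre=(53.480, -2.240), radius_m=500)
    print([c["id"] for c in nearby])
```

In the sandbox itself this kind of filter is expressed as a Cypher query against the Location nodes; the sketch only shows the distance test that such a search relies on.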

The sandbox database includes over 61,000 nodes and more than 105,000 relationships for you to work with, and the guide includes 21 Cypher queries to run against the sample data. These queries range from simple examples like finding the top 15 locations in the graph for crimes:

MATCH (l:Location)<-[:OCCURRED_AT]-(:Crime)
RETURN l.address AS address, l.postcode AS postcode,
count(l) AS total
ORDER BY total DESC
LIMIT 15

To more complex examples which use graph algorithms to identify potential communities of criminals:

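// Node query: every person who is party to at least one crime.
// Relationship query: KNOWS links between people, in either direction.
CALL algo.triangleCount.stream(
  'MATCH (p:Person)-[:PARTY_TO]->(c:Crime) RETURN id(p) AS id',
  'MATCH (p1:Person)-[:KNOWS]-(p2:Person)
   RETURN id(p1) AS source, id(p2) AS target',
  {concurrency:4, graph:'cypher'}) YIELD nodeId, triangles
WHERE triangles > 0
// Look up each person by internal node id and rank by triangle count.
MATCH (p:Person)
WHERE id(p) = nodeId
RETURN p.name AS name, p.surname AS surname, p.nhs_no AS id,
       triangles
ORDER BY triangles DESC
LIMIT 10;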
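For readers new to triangle counting, the idea behind the query above is simple: a triangle is three people who all know each other, and a person who sits in many triangles is part of a tightly connected group. The Python below is a toy sketch of that counting on a hand-made graph. It is not how the Neo4j algorithm is implemented, and the function name and the sample edges are invented for illustration.

```python
from itertools import combinations


def triangle_counts(edges):
    """Count, for every node, the triangles it takes part in.

    Edges are treated as undirected, mirroring the undirected KNOWS match.
    """
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    counts = {}
    for node, nbrs in neighbours.items():
        # A triangle exists for each pair of this node's neighbours
        # that are also connected to each other.
        counts[node] = sum(
            1 for u, v in combinations(sorted(nbrs), 2) if v in neighbours[u]
        )
    return counts


if __name__ == "__main__":
    # Invented "knows" pairs: A-B-C and C-D-E each form a triangle.
    knows = [
        ("A", "B"), ("B", "C"), ("A", "C"),
        ("C", "D"), ("D", "E"), ("C", "E"),
        ("E", "F"),
    ]
    for person, n in sorted(triangle_counts(knows).items()):
        print(person, n)
```

In this toy graph, C belongs to both triangles and so ranks highest, which is the same signal the sandbox query uses to surface people at the centre of potential groups.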

I hope you find this graph use case as interesting and exciting as I do, and that the sandbox gives you plenty of food for thought on how graph database technology can be used to prevent and solve crimes as well as protect vulnerable people.

Please let us know how you get on with this new Neo4j sandbox, and tell us about any exciting new queries and outcomes you find!
