Updated: Efficient Neo4j Data Import Using Cypher-Scripts

How the new Cypher parser in Neo4j 4.2 made imports 10x faster

What’s New in Neo4j 4.2?

Small Recap

No Optimization

CREATE (:Foo:`UNIQUE IMPORT LABEL` {name:"foo", `UNIQUE IMPORT ID`:0});
CREATE (:Foo:`UNIQUE IMPORT LABEL` {name:"bar", `UNIQUE IMPORT ID`:1});
...

Unwind Batch

UNWIND [{_id:3, properties:{age:12}}] AS row
CREATE (n:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`: row._id}) SET n += row.properties SET n:Bar;

Unwind Batch Parameters

:param rows => [{_id:4, properties:{age:12}}, {_id:5, properties:{age:4}}]
UNWIND $rows AS row
CREATE (n:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`: row._id}) SET n += row.properties SET n:Bar;
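
To make the shape of the parameterized variant concrete, here is a minimal Python sketch that emits statements in the same form as the snippet above: one `:param rows => [...]` line per batch, followed by a fixed `UNWIND $rows` statement. It is only an illustration, not how APOC produces its export. The helper names `cypher_literal` and `unwind_param_batches` are hypothetical, and the sketch assumes the label is a plain identifier such as `Bar`.

```python
import json
import re
from typing import Iterable, Iterator

# Map keys that look like plain identifiers can stay unquoted in Cypher.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def _key(name: str) -> str:
    """Backtick-quote a map key unless it is a plain identifier."""
    return name if _IDENT.match(name) else "`" + name.replace("`", "``") + "`"


def cypher_literal(value) -> str:
    """Render a Python value as a Cypher literal (hypothetical helper)."""
    if isinstance(value, dict):
        items = ", ".join(f"{_key(k)}:{cypher_literal(v)}" for k, v in value.items())
        return "{" + items + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(cypher_literal(v) for v in value) + "]"
    if isinstance(value, bool):  # must be checked before numbers: bool is an int
        return "true" if value else "false"
    if value is None:
        return "null"
    if isinstance(value, str):
        return json.dumps(value)  # double-quoted with backslash escapes
    return repr(value)  # ints and floats


def unwind_param_batches(
    rows: Iterable[dict], label: str, batch_size: int = 100
) -> Iterator[str]:
    """Yield one ':param' line plus one UNWIND statement per batch of rows."""
    rows = list(rows)
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        yield (
            f":param rows => {cypher_literal(batch)}\n"
            "UNWIND $rows AS row\n"
            "CREATE (n:`UNIQUE IMPORT LABEL`{`UNIQUE IMPORT ID`: row._id}) "
            f"SET n += row.properties SET n:{label};"
        )


if __name__ == "__main__":
    sample = [
        {"_id": 4, "properties": {"age": 12}},
        {"_id": 5, "properties": {"age": 4}},
    ]
    for statement in unwind_param_batches(sample, "Bar"):
        print(statement)
```

The point of this form is that the query text after the `:param` line is identical for every batch; only the parameter value changes from one batch to the next.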

Neo4j 3.5

$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('3.5_exportDataCypherShellNoOptimizations.cypher',{format:'cypher-shell', useOptimizations: {type: 'none'}, batchSize:100})"
real 0m44.871s
user 0m1.354s
sys 0m0.178s
$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('3.5_exportDataCypherShellUnwindBatch.cypher',{format:'cypher-shell', useOptimizations: {type: 'unwind_batch', unwindBatchSize: 20}, batchSize:100})"
real 0m29.257s
user 0m1.397s
sys 0m0.181s
$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('3.5_exportDataCypherShellUnwindBatchParams.cypher',{format:'cypher-shell', useOptimizations: {type: 'unwind_batch_params', unwindBatchSize:100}})"
real 0m25.333s
user 0m1.393s
sys 0m0.182s
$ time cypher-shell -u neo4j -p davide < "import/3.5_exportDataCypherShellNoOptimizations.cypher"
real 100m24.805s
user 5m39.444s
sys 4m7.330s
$ time cypher-shell -u neo4j -p davide < "import/3.5_exportDataCypherShellUnwindBatch.cypher"
real 31m33.870s
user 1m12.383s
sys 0m30.247s
$ time cypher-shell -u neo4j -p davide < "import/3.5_exportDataCypherShellUnwindBatchParams.cypher"
real 10m28.723s
user 8m4.257s
sys 0m5.748s

Neo4j 4.1

$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('4.1_exportDataCypherShellNoOptimizations.cypher',{format:'cypher-shell', useOptimizations: {type: 'none'}, batchSize:100})"
real 0m42.675s
user 0m1.437s
sys 0m0.218s
$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('4.1_exportDataCypherShellUnwindBatch.cypher',{format:'cypher-shell', useOptimizations: {type: 'unwind_batch', unwindBatchSize: 20}, batchSize:100})"
real 0m30.574s
user 0m1.399s
sys 0m0.214s
$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('4.1_exportDataCypherShellUnwindBatchParams.cypher',{format:'cypher-shell', useOptimizations: {type: 'unwind_batch_params', unwindBatchSize:100}})"
real 0m25.393s
user 0m1.376s
sys 0m0.221s
$ time cypher-shell -u neo4j -p davide < "import/4.1_exportDataCypherShellNoOptimizations.cypher"
real 135m37.920s
user 4m32.836s
sys 3m43.420s
$ time cypher-shell -u neo4j -p davide < "import/4.1_exportDataCypherShellUnwindBatch.cypher"
real 44m13.016s
user 0m53.779s
sys 0m28.362s
$ time cypher-shell -u neo4j -p davide < "import/4.1_exportDataCypherShellUnwindBatchParams.cypher"
real 10m8.991s
user 8m39.109s
sys 0m5.342s

Neo4j 4.2

$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('4.2_exportDataCypherShellNoOptimizations.cypher',{format:'cypher-shell', useOptimizations: {type: 'none'}, batchSize:100})"
real 0m42.951s
user 0m1.379s
sys 0m0.207s
$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('4.2_exportDataCypherShellUnwindBatch.cypher',{format:'cypher-shell', useOptimizations: {type: 'unwind_batch', unwindBatchSize: 20}, batchSize:100})"
real 0m29.523s
user 0m1.392s
sys 0m0.213s
$ time cypher-shell -u neo4j -p davide "call apoc.export.cypher.all('4.2_exportDataCypherShellUnwindBatchParams.cypher',{format:'cypher-shell', useOptimizations: {type: 'unwind_batch_params', unwindBatchSize:100}})"
real 0m25.900s
user 0m1.381s
sys 0m0.203s
$ time cypher-shell -u neo4j -p davide < "import/4.2_exportDataCypherShellNoOptimizations.cypher"
real 122m23.241s
user 4m28.974s
sys 3m40.094s
$ time cypher-shell -u neo4j -p davide < "import/4.2_exportDataCypherShellUnwindBatch.cypher"
real 36m51.066s
user 0m51.777s
sys 0m27.773s
$ time cypher-shell -u neo4j -p davide < "import/4.2_exportDataCypherShellUnwindBatchParams.cypher"
real 2m21.473s
user 0m42.900s
sys 0m3.190s
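
To compare the import runs, the `real` times reported above can be converted to seconds and divided. The following Python sketch does that using only the figures from the three sections; the names `to_seconds`, `speedup` and `IMPORT_REAL` are illustrative, and the 4.1 `unwind_batch` entry assumes that the second 4.1 import run corresponds to the unwind-batch export.

```python
import re


def to_seconds(real: str) -> float:
    """Convert a `time` value such as '2m21.473s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if match is None:
        raise ValueError(f"unexpected time format: {real!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def speedup(slow: str, fast: str) -> float:
    """How many times faster the `fast` run is than the `slow` run."""
    return to_seconds(slow) / to_seconds(fast)


# `real` import times copied from the timing output above.
IMPORT_REAL = {
    "3.5": {"none": "100m24.805s", "unwind_batch": "31m33.870s",
            "unwind_batch_params": "10m28.723s"},
    "4.1": {"none": "135m37.920s", "unwind_batch": "44m13.016s",
            "unwind_batch_params": "10m8.991s"},
    "4.2": {"none": "122m23.241s", "unwind_batch": "36m51.066s",
            "unwind_batch_params": "2m21.473s"},
}

if __name__ == "__main__":
    fastest = IMPORT_REAL["4.2"]["unwind_batch_params"]
    for version, runs in IMPORT_REAL.items():
        for kind, real in runs.items():
            print(f"{version:>3} {kind:<20} {to_seconds(real):9.1f}s "
                  f"{speedup(real, fastest):6.1f}x vs 4.2 params")
```

With these figures, the parameterized import on 4.2 comes out roughly 4.3 times faster than the same kind of script on 4.1, and roughly 52 times faster than the unoptimized import on 4.2.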

Conclusions

Davide Fantuzzi

Backend Developer @ Switcho. Big fan of music and oxygen, for different reasons but both help me live.