Testing your Neo4j-based Java application

Published in Neo4j Developer Blog, Jan 17, 2019

In this post I present different approaches to testing your Neo4j-based application. As you might know, Neo4j can be accessed from a variety of languages, Go, Python and JavaScript being some of them, but this post focuses on Java-based applications. I’m an engineer on Neo4j’s Spring Data team, so this is where my focus usually is.

This post has been reviewed in detail by my colleagues Gerrit Meier and Michael Hunger. Thanks a lot!

There are several aspects that need to be considered when deciding for or against a certain test setup:

Are you developing a stored procedure for the database itself?
Are you using the official Java-driver to issue Cypher-queries directly over Bolt?
Are you using an Object-Graph-Mapping library like Neo4j-OGM to build an application-side domain model?
Did you add Spring Data Neo4j (SDN) to the mix to take advantage of the Spring ecosystem?

I’ll cover those four scenarios, which allow for a good comparison of the options we have at hand for testing queries or a domain model against the Neo4j database.

I’m going to use JUnit 5 in all scenarios. At the beginning of 2019, there’s hardly any reason not to use JUnit 5 in new projects. All Neo4j-specific techniques demonstrated in this post can be applied to JUnit 4 as well, with some adaptation.

The examples in this article all handle spatial values and work with the functions defined on them. The spatial datatype point is new in Neo4j 3.4; Neo4j-OGM will support it out of the box in Neo4j-OGM 3.2 and Spring Data Neo4j 5.2.

Neo4j test-harness

The full example of how to use the test-harness for custom Neo4j extensions is on GitHub: using-the-test-harness.

Testing custom Neo4j extensions

Neo4j can be extended with custom procedures and functions. One can also add unmanaged server extensions to Neo4j that expose arbitrary JAX-RS endpoints from the database. In all three cases, one interacts directly with the database Java API, for operations that need direct access to the database engine for the highest degree of performance or flexibility.

This is where the Neo4j test-harness comes in. The test-harness is a special variant of an embedded Neo4j server instance with hooks for providing test fixtures and adding your custom procedures and extensions.

Given the following user-defined procedure that converts spatial attributes:

public class LocationConversion {
    @Context
    public GraphDatabaseService db;

    @Procedure(
        name = "examples.convertLegacyLocation", 
        mode = Mode.WRITE)
    public Stream<NodeWrapper> apply(
                                @Name("nodes") List<Node> nodes) {

        return nodes.stream()
            .filter(LocationConversion::hasRequiredProperties)
            .map(LocationConversion::convertPropertiesToLocation)
            .map(NodeWrapper::new);
    }

    static boolean hasRequiredProperties(Node node) {
        return node.hasProperty(PROPERTY_LONGITUDE) &&
               node.hasProperty(PROPERTY_LATITUDE);
    }

    static Node convertPropertiesToLocation(Node node) {

        Map<String, Object> latLong =
            node.getProperties(
            PROPERTY_LATITUDE, PROPERTY_LONGITUDE);
        PointValue location = Values.pointValue(
            CoordinateReferenceSystem.WGS84,
            (double) latLong.get(PROPERTY_LONGITUDE),
            (double) latLong.get(PROPERTY_LATITUDE)
        );

        node.removeProperty(PROPERTY_LATITUDE);
        node.removeProperty(PROPERTY_LONGITUDE);
        node.setProperty(PROPERTY_LOCATION, location);

        return node;
    }
}

The LocationConversion operates directly on the graph database nodes for optimal performance. It is meant to be executed from Cypher with a call like this: CALL examples.convertLegacyLocation(nodes). If you followed the instructions on how to package your stored procedures, you would end up with a JAR file containing the executable code. Do you want to package it, stop your server and upload it every time you want to test it? Probably not.

Enter the test-harness:

<dependency>
    <groupId>org.neo4j.test</groupId>
    <artifactId>neo4j-harness</artifactId>
    <version>${neo4j.version}</version>
    <scope>test</scope>
</dependency>

There’s a variant neo4j-harness-enterprise that matches the commercial enterprise version of Neo4j, too.

With JUnit 5 you don’t need a @Rule to start it as JUnit 5 supports non-static initialization methods for tests when the lifecycle of the test is set to PER_CLASS.

@TestInstance(TestInstance.Lifecycle.PER_CLASS) // <1>
class GeometryToolboxTest {
    private ServerControls embeddedDatabaseServer; // <2>

    @BeforeAll // <3>
    void initializeNeo4j() {

        this.embeddedDatabaseServer = TestServerBuilders
            .newInProcessBuilder()
            .withProcedure(LocationConversion.class) // <4>
            .withFunction(GetGeometry.class)
            .withFixture("" // <5>
                + " CREATE (:Place {name: 'Malmö', longitude: 12.995098, latitude: 55.611730})"
                + " CREATE (:Place {name: 'Aachen', longitude: 6.083736, latitude: 50.776381})"
                + " CREATE (:Place {name: 'Lost place'})"
            )
            .newServer(); 
    }
}

<1> The lifecycle of this test is PER_CLASS, so that initializeNeo4j(), annotated with @BeforeAll, runs exactly once.
<2> The variable embeddedDatabaseServer holds a reference to the server during all tests.
<3> initializeNeo4j() runs before all tests and uses a builder to create a test server. The builder provides interfaces for registering
<4> custom procedures and functions (here LocationConversion and GetGeometry) as well as
<5> fixtures, either through Cypher statements as shown here, files, or even functions.
Finally, newServer() starts the embedded server.

Now it’s really easy to use the server provided by the harness. I have added the Java driver as a test dependency to the project and open up a connection just as I would against a standalone server instance or cluster:

@Test
void shouldConvertLocations() {
    try (Driver driver = GraphDatabase.driver(
           embeddedDatabaseServer.boltURI(),driverConfig);
        Session session = driver.session()
    ) {
        StatementResult result = session.run(""
          + " MATCH (n:Place) WITH collect(n) AS nodes"
          + " CALL examples.convertLegacyLocation(nodes) YIELD node"
          + " RETURN node ORDER BY node.name");

        assertThat(result.stream())
            .hasSize(2)
            .extracting(r -> {
                Value node = r.get("node");
                return node.get("location").asPoint();
            })
            .containsExactly(
                Values.point(4326, 6.083736, 50.776381).asPoint(),
                Values.point(4326, 12.995098, 55.611730).asPoint()
            );
    }
}
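
The fixture creates three places, yet the test expects only two results. The reason is the filter step in the procedure: 'Lost place' has neither longitude nor latitude and is dropped before the conversion. The following plain-Java sketch replays that pipeline without a database. It is an illustration only: maps stand in for nodes, a double[] stands in for the point value, and the class ConversionPipelineSketch with its helpers is made up for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Comparator;
import java.util.stream.Collectors;

final class ConversionPipelineSketch {

    /** Builds a map that stands in for a :Place node; null coordinates are simply left out. */
    static Map<String, Object> place(String name, Double longitude, Double latitude) {
        Map<String, Object> node = new HashMap<>();
        node.put("name", name);
        if (longitude != null) node.put("longitude", longitude);
        if (latitude != null) node.put("latitude", latitude);
        return node;
    }

    static boolean hasRequiredProperties(Map<String, Object> node) {
        return node.containsKey("longitude") && node.containsKey("latitude");
    }

    /** Replaces the two legacy properties with one location of the form {x = longitude, y = latitude}. */
    static Map<String, Object> convertPropertiesToLocation(Map<String, Object> node) {
        double longitude = (double) node.remove("longitude");
        double latitude = (double) node.remove("latitude");
        node.put("location", new double[] {longitude, latitude});
        return node;
    }

    /** Same filter and map steps as the procedure; the sort mimics ORDER BY node.name in the query. */
    static List<Map<String, Object>> convertAll(List<Map<String, Object>> nodes) {
        return nodes.stream()
            .filter(ConversionPipelineSketch::hasRequiredProperties)
            .map(ConversionPipelineSketch::convertPropertiesToLocation)
            .sorted(Comparator.comparing((Map<String, Object> node) -> (String) node.get("name")))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> converted = convertAll(List.of(
            place("Malm\u00f6", 12.995098, 55.611730), // \u00f6 is the letter ö
            place("Aachen", 6.083736, 50.776381),
            place("Lost place", null, null)));
        // Two entries remain, Aachen first
        converted.forEach(node -> System.out.println(node.get("name")));
    }
}
```

Note the argument order: the location is built from longitude first and latitude second, which is also the order used in the expected points of the test.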

Using the test-harness for application level tests

Technically, the test-harness and the embedded server, reachable through ServerControls, can be used for application-level testing. Besides the Bolt URI, it exposes the HTTP URI as well as the embedded graph database instance itself. Both URIs use random, free ports and thus allow tests to run in parallel. ServerControls is an AutoCloseable resource and, as it starts relatively quickly, it can be fired up multiple times.
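
How the harness picks its ports isn’t shown in this post. A common way to get a random, free port on the JVM is to bind to port 0 and let the operating system choose. The sketch below uses a hypothetical ThrowawayServer class (not part of the harness) to illustrate both points of the paragraph above: two instances never collide on a port, and try-with-resources shuts them down again.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.URI;

/** Hypothetical stand-in for a test server; it only reserves a port. */
final class ThrowawayServer implements AutoCloseable {

    private final ServerSocket socket;

    ThrowawayServer() throws IOException {
        // Port 0 asks the operating system for any currently free port.
        this.socket = new ServerSocket(0);
    }

    /** The address a client would connect to; the scheme is for illustration only. */
    URI uri() {
        return URI.create("bolt://localhost:" + socket.getLocalPort());
    }

    boolean isClosed() {
        return socket.isClosed();
    }

    @Override
    public void close() throws IOException {
        socket.close();
    }

    public static void main(String[] args) throws IOException {
        // Both servers are open at the same time, so their ports differ.
        try (ThrowawayServer first = new ThrowawayServer();
             ThrowawayServer second = new ThrowawayServer()) {
            System.out.println(first.uri());
            System.out.println(second.uri());
        } // close() is called on both servers here, in reverse order
    }
}
```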

It comes at a price, however: in the end it is a full-blown Neo4j server instance with all its dependencies. You might not want those dependencies in your application, not even in test scope. For instance, compatibility with Scala versions can be an issue. The other disadvantage is that you’re running the test database inside the same JVM as your application. Most of the time this is not what production looks like. While the same JVM is the correct place for stored procedures, it is not for applications. Using an embedded database for testing your application code might lull you into a false sense of safety.

Neo4j Testcontainer

The full example of how to use Testcontainers with Spring Data Neo4j based applications is on GitHub: using-testcontainers.

What are Testcontainers?

Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.

František Hartman from GraphAware wrote a very detailed article about integration testing with the Docker Neo4j image and Testcontainers. František already covers a lot there, and you should check it out.

In the meantime, our pull request to add Neo4j support has landed in a recent Testcontainers release. The Neo4j container page of the Testcontainers documentation describes the basic usage of the official container.

General setup of Testcontainers with JUnit 5

As stated earlier, I have become a big fan of JUnit 5. Nice assertions are one reason; package-private test methods, better lifecycle management and extensions are others.

Testcontainers comes with support for JUnit 5. The following listing shows all dependencies:

<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter-engine</artifactId>
    <version>${junit-jupiter.version}</version>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>neo4j</artifactId>
    <version>${testcontainers.version}</version>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>junit-jupiter</artifactId>
    <version>${testcontainers.version}</version>
    <scope>test</scope>
</dependency>

At the time of writing, testcontainers.version is 1.10.5 and junit-jupiter.version is 5.3.2.

I recommend the following setup for an integration test with Testcontainers:

@Testcontainers // <1>
public class PlainOGMTest {

    @Container // <2>
    private static final Neo4jContainer databaseServer 
      = new Neo4jContainer();  // <3>
}

<1> Extend this test class with the @Testcontainers extension.
<2> The extension looks for all attributes marked as @Container and starts and stops them according to their lifecycle. Here, databaseServer points to our Neo4j Testcontainer.
<3> Create (but don’t start) a Neo4jContainer.

As JUnit 5 tests have a default lifecycle of PER_METHOD, shared state needs to be defined as static attributes of the test. Hence the definition of the Testcontainer as a static attribute. This way, the container is started before all tests and closed afterwards. If the container is defined as an instance attribute, it’s restarted before each individual test.

While it is possible to change the lifecycle of the test class to PER_CLASS instead of PER_METHOD, it’s a bit harder later on to configure it for Spring Boot Test-Slices.

The way to provide a test fixture is the same for both plain Neo4j-OGM and SDN tests. It can be done in a @BeforeAll method like this:

static final String TEST_DATA = ""
    + " MERGE (:Thing {name: 'Thing'  })"
    + " MERGE (:Thing {name: 'Thing 2'})"
    + " MERGE (:Thing {name: 'Thing 3'})"
    + " CREATE (:Thing {name: 'A box', geometry: ["
    + "   point({x:  0, y:  0}), "
    + "   point({x: 10, y:  0}), "
    + "   point({x: 10, y: 10}), "
    + "   point({x:  0, y: 10}), "
    + "   point({x:  0, y:  0})] }"
    + ")";

@BeforeAll
static void prepareTestdata() {
    String password = databaseServer.getAdminPassword(); // <1>

    AuthToken auth = AuthTokens.basic("neo4j", password);
    try (
        var driver = GraphDatabase.driver(
            databaseServer.getBoltUrl(), auth); 
        var session = driver.session()
    ) {
        session.writeTransaction(work -> work.run(TEST_DATA));
    }
}

databaseServer is the container we defined and started above. It provides access to the database password (1). The container also provides an accessor to the Bolt-URI which contains a random port.

The @BeforeAll method is invoked once before all tests. I provide the test data over Bolt, so I have the Neo4j Java driver on the classpath. Having a static string here is one option, but you can read in your test data any way you want.
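
One such alternative is to keep the Cypher in a file next to the tests. The following is a minimal sketch of a loader, assuming a UTF-8 encoded script; the class TestDataLoaderSketch and the file name are hypothetical and not part of the example project.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TestDataLoaderSketch {

    /** Reads a whole Cypher script into a string and fails loudly if the file cannot be read. */
    static String loadCypher(Path script) {
        try {
            return new String(Files.readAllBytes(script), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read " + script, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for something like src/test/resources/test-data.cypher
        Path script = Files.createTempFile("test-data", ".cypher");
        Files.write(script, "MERGE (:Thing {name: 'Thing'})".getBytes(StandardCharsets.UTF_8));

        System.out.println(loadCypher(script));
        Files.delete(script);
    }
}
```

The returned string can then be passed to work.run(...) in place of the TEST_DATA constant.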

Using with Neo4j-OGM

The only thing you need to test your business logic based on Neo4j-OGM and queries is a Neo4j-OGM SessionFactory. I recommend defining it as a static variable through a second @BeforeAll method in the test as well:

private static SessionFactory sessionFactory;

@BeforeAll
static void prepareSessionFactory() {

    var ogmConfiguration = new Configuration.Builder()
        .uri(databaseServer.getBoltUrl())
        .credentials("neo4j", databaseServer.getAdminPassword())
        .build();

    sessionFactory = new SessionFactory(
        ogmConfiguration,
        "org.neo4j.tips.testing.using_testcontainers.domain");
}

Again: No hardcoded password, no hardcoded Bolt-URI. The Neo4j-Testcontainer provides this. One possible test with the above data could be this:

@Test
void someQueryShouldWork() {

    var query = "MATCH (t:Thing) WHERE t.name =~ $name RETURN t";
    var result = sessionFactory.openSession()
        .query(query, Map.of("name", "Thing \\d"));

    assertThat(result).hasSize(2);
}

This test runs over the network against a "real", server-mode Neo4j-instance. Just as your application hopefully will.
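
As for why the test expects exactly two things: Cypher’s =~ operator uses Java regular expression syntax and has to match the whole string, so the pattern Thing \d matches 'Thing 2' and 'Thing 3', but neither 'Thing' nor 'A box'. The plain-Java sketch below reproduces that check without a database; the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class RegexFixtureSketch {

    // The names created by TEST_DATA
    static final List<String> FIXTURE_NAMES = List.of("Thing", "Thing 2", "Thing 3", "A box");

    static List<String> namesMatching(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return FIXTURE_NAMES.stream()
            // matches() requires the whole name to match, like =~ does in Cypher
            .filter(name -> pattern.matcher(name).matches())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Same Java string literal as in the test: the pattern itself is Thing \d
        System.out.println(namesMatching("Thing \\d")); // prints [Thing 2, Thing 3]
    }
}
```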

Using with Neo4j-OGM and SDN

For me, there are no good reasons to start new Spring projects without Spring Boot. Among other nice things, Spring Boot brings you autoconfigured tests and, more importantly, test slices. Test slices deal specifically with certain technical layers of your application, be it the database layer, the service layer or just the web frontend.

For the database layer, such a slice is an integration test very much focused on the interaction with the database.

The Neo4j test slice is called @DataNeo4jTest:

@Testcontainers
@DataNeo4jTest // <1>
public class SDNTest {

    @TestConfiguration // <2>
    static class Config {

        @Bean // <3>
        public org.neo4j.ogm.config.Configuration configuration() {
            return new Configuration.Builder()
                .uri(databaseServer.getBoltUrl())
                .credentials(
                    "neo4j", databaseServer.getAdminPassword())
                .build();
        }
    }

    private final ThingRepository thingRepository;

    @Autowired // <4>
    public SDNTest(ThingRepository thingRepository) {
        this.thingRepository = thingRepository;
    }
}

<1> This activates Spring Data’s repository layer and also provides Spring Boot’s JUnit 5 extensions.
<2> A @TestConfiguration adds to Spring Boot’s config but doesn’t prevent autoconfiguration.
<3> A bean of Neo4j-OGM’s configuration is created to configure the SessionFactory of Spring Data Neo4j.
<4> JUnit 5 together with Spring Boot’s extension allows constructor-based injection, even in tests.

Now you can test against the ThingRepository as shown below:

public interface ThingRepository 
   extends Neo4jRepository<Thing, Long> {

    List<Thing> findThingByNameMatchesRegex(String regexForName);

    @Query(value
        = " MATCH (t:Thing) WHERE t.name = $name"
        + " RETURN t.name AS name, examples.getGeometry(t) AS wkt")
    ThingWithGeometry findThingWithGeometry(String name);
}

A “boring” test would look like this:

@Test
void someQueryShouldWork() {

    var things = thingRepository
       .findThingByNameMatchesRegex("Thing \\d");
    assertThat(things).hasSize(2);
}

Why is this boring? Because we’re basically testing whether Spring Data’s query derivation works or not.

Testing findThingWithGeometry is much more interesting, as you may recognize examples.getGeometry(t) as our own custom function. How do we get it into the test container? It turns out the authors of Testcontainers thought of a way to copy files and mount paths into the container before it starts.

I packaged the custom stored procedures from the beginning of this article into a JAR file named geometry-toolbox.jar and added it to the test resources. With this, the Testcontainer can be created like this:

@Container
private static final Neo4jContainer databaseServer = 
    new Neo4jContainer<>()
    .withCopyFileToContainer(
        MountableFile.forClasspathResource("/geometry-toolbox.jar"),
        "/var/lib/neo4j/plugins/")
    .withClasspathResourceMapping(
        "/test-graph.db",
        "/data/databases/graph.db", BindMode.READ_WRITE)
    .withEnv(
        "NEO4J_dbms_security_procedures_unrestricted",
        "apoc.*,algo.*");

The plugin JAR gets copied into the right place inside the container and is recognized by Neo4j during startup. The test data for the second test isn’t hardcoded like in PlainOGMTest.java. I copied over the graph.db folder from my "production" instance to the test resources. Calling withClasspathResourceMapping() maps it into the container’s /data/ volume, where Neo4j expects the database. In a real-world test you probably have that data folder somewhere else and not in your project. In such cases, you would use withFileSystemBind() of the Testcontainer.

In the setup above, withEnv() is used to remove any security restrictions from the APOC and graph algorithms extensions by setting the environment variable NEO4J_dbms_security_procedures_unrestricted accordingly.
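
The official Neo4j Docker image derives configuration settings from environment variables that start with NEO4J_: a single underscore stands for a dot, and a double underscore stands for a literal underscore. The sketch below mimics that naming rule in plain Java to show which setting the variable above ends up as. The class Neo4jEnvNameSketch is a hypothetical helper written for this illustration and is not part of Testcontainers or Neo4j.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class Neo4jEnvNameSketch {

    private static final String PREFIX = "NEO4J_";

    /** Translates an environment variable name into the configuration setting it represents. */
    static String toSettingName(String envName) {
        if (!envName.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a Neo4j setting: " + envName);
        }
        String encoded = envName.substring(PREFIX.length());
        // Split on the escaped underscores first, then turn the remaining single ones into dots.
        return Arrays.stream(encoded.split("__", -1))
            .map(part -> part.replace('_', '.'))
            .collect(Collectors.joining("_"));
    }

    public static void main(String[] args) {
        System.out.println(toSettingName("NEO4J_dbms_security_procedures_unrestricted"));
        // prints dbms.security.procedures.unrestricted
    }
}
```

In other words, the withEnv() call has the same effect as a line dbms.security.procedures.unrestricted=apoc.*,algo.* in neo4j.conf.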

Anyway, given the same test data as before, a not so boring test now is green:

@Test
void customProjectionShouldWork() {

    var expectedWkt
        = "LINESTRING (0.000000 0.000000,"
        + "10.000000 0.000000,"
        + "10.000000 10.000000,"
        + "0.000000 10.000000,"
        + "0.000000 0.000000)";

    var thingWithGeometry = thingRepository
       .findThingWithGeometry("A box");
    assertThat(thingWithGeometry).isNotNull()
        .extracting(ThingWithGeometry::getWkt)
        .isEqualTo(expectedWkt);
}
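
The implementation of examples.getGeometry isn’t shown in this post. As a rough idea of how a string like expectedWkt can be derived from the geometry property of 'A box', here is a hypothetical plain-Java sketch. The class WktSketch, its method and the use of double[] pairs for points are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class WktSketch {

    /** Formats x/y pairs as a WKT LINESTRING with six decimal places per coordinate. */
    static String toLineString(List<double[]> points) {
        return points.stream()
            // Locale.ROOT keeps the decimal separator a dot on every machine
            .map(point -> String.format(Locale.ROOT, "%f %f", point[0], point[1]))
            .collect(Collectors.joining(",", "LINESTRING (", ")"));
    }

    public static void main(String[] args) {
        // The five points of 'A box' from the test data; first and last are equal, closing the ring
        List<double[]> box = List.of(
            new double[] {0, 0},
            new double[] {10, 0},
            new double[] {10, 10},
            new double[] {0, 10},
            new double[] {0, 0});
        System.out.println(toLineString(box));
    }
}
```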

And with that, we have just tested a Spring Data Neo4j based Neo4j application including custom plugins end-to-end, starting with the plugins and ending with an integration test.

Summary

When writing custom extensions, you want a quick feedback loop for all of your tests. You’re also very close to the server in all cases. The test-harness provides you with the fastest feedback loop possible and doesn’t expose your code to more than you actually need. Your code is right there at the server level. The test-harness and the embedded, customizable instance of Neo4j should be your first choice when testing custom Neo4j extensions. It is also a good choice for infrastructure code like Neo4j-OGM and Spring Data Neo4j itself. Neo4j-OGM runs against an embedded graph, over Bolt and HTTP, so it must be tested against all of those. The test-harness provides good support for that.

The main advantage of using a Testcontainer is that it resembles your later application setup best. While there are a few use cases for it, most applications should not run an embedded version of Neo4j. Think about it: in a microservices world, where you usually have more than one instance of an application running, should each instance bring its own database? You cannot run Neo4j in Causal Cluster mode in an embedded scenario, so you would have to synchronize those instances yourself. Furthermore, if your application goes down, so does your database.

The generic Testcontainer or the dedicated Neo4j-Testcontainer gives an easy way to bring up new, clean database instances for each test. Thus, your tests are independent of each other and you won’t have interference in your test data from concurrent tests.

So please keep the following in mind while you design your integration tests:

The topology of your test should resemble your target topology as much as possible
Try to use a dataset for integration tests that is comparable in size to your production dataset

Testcontainers help a lot to achieve the first item. Whether you can get your hands on a dataset that is similar to your production dataset probably depends on your surroundings and organization. If it is possible, however, you could create a custom Neo4j Docker image and use that one as a basis for the Testcontainer in your CI.

Images:

Harness: https://unsplash.com/photos/N_3CHNdliVs
Logo Testcontainers: https://testcontainers.org