Multi-Tenant GraphQL With Neo4j 4.0

A Look At Using Neo4j 4.0 Multidatabase With neo4j-graphql.js

Published in GRANDstack - GraphQL, React, Apollo, Neo4j Database, Feb 4, 2020

Neo4j 4.0 introduces support for multiple active databases, which can enable use cases like multitenancy where we have one database per tenant, ensuring each tenant’s data is kept separate.

In this post we take a look at using multiple databases in Neo4j 4.0, then show how to use multiple active databases with the Neo4j GraphQL integration neo4j-graphql.js, building a multi-tenant GraphQL API.

Creating Multiple Active Databases

There are several ways to approach multitenancy and different ways to use multiple databases in Neo4j. In this example, we’ll work with data about businesses and user reviews. We’ll define a “tenant” based on the city in which the business is located, storing information about businesses in different cities in different databases.

To follow along, be sure you are using Neo4j 4.0 or later, since multiple active database support is new in that release. First, we’ll create a new Neo4j 4.0 graph in Neo4j Desktop.

There are two databases created when you first start Neo4j: system and neo4j. The system database contains metadata about the database management system, and we use it for administrative tasks such as creating databases, managing users, and configuring security. When we open Neo4j Browser for the first time, we connect to the neo4j database by default. The :use command allows us to switch between databases.

Let’s connect to the system database and create two new databases, one named sanmateo and the other named missoula. First, we switch to the system database:

:use system

Then execute two CREATE DATABASE commands:

CREATE DATABASE sanmateo;
CREATE DATABASE missoula;
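With more than a handful of tenants, it can be easier to generate these statements than to type them. The following is a minimal sketch, not an official tool: createDatabaseStatements is a hypothetical helper, and the name check is a deliberately conservative rule chosen for this example (lowercase ASCII letters and digits only, so the name never needs quoting), not a restatement of Neo4j's own naming rules.

```javascript
// Hypothetical helper, not part of Neo4j or its drivers.
// Conservative rule for this sketch: a letter followed by letters or digits,
// 3 to 63 characters in total, and not starting with "system".
const SAFE_DATABASE_NAME = /^[a-z][a-z0-9]{2,62}$/;

function createDatabaseStatements(tenants) {
  return tenants.map((tenant) => {
    const name = tenant.toLowerCase();
    if (!SAFE_DATABASE_NAME.test(name) || name.startsWith('system')) {
      throw new Error(`Invalid database name: ${tenant}`);
    }
    return `CREATE DATABASE ${name}`;
  });
}

// Print one statement per tenant, ready to run against the system database.
console.log(createDatabaseStatements(['sanmateo', 'missoula']).join(';\n') + ';');
```

Rejecting anything outside the pattern also keeps tenant names from smuggling extra Cypher into the generated statement.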

We can verify our new databases have been created by running SHOW DATABASES against the system database:

We can see the new databases we’ve created, as well as the system and neo4j databases. In addition, we can see that neo4j is the default database. Let’s delete the neo4j database:

DROP DATABASE neo4j;

And now let’s set sanmateo as our default database by setting dbms.default_database=sanmateo in the neo4j.conf configuration file and restarting Neo4j so the change takes effect. We can verify our changes by running SHOW DATABASES again:

Loading Our Business Reviews Data

Now let’s load some data into our two databases. Our data contains businesses, categories, reviews, and the users who have written the reviews. We’ll load data from a CSV file, filtering by city to make sure we load the businesses in the appropriate database.

First, we’ll switch to the sanmateo database:

:use sanmateo

Then run a Cypher statement to read a CSV and create our data in the database, filtering for only businesses in San Mateo:

LOAD CSV WITH HEADERS FROM "https://cdn.neo4jlabs.com/data/grandstack_businesses.csv" AS row
WITH row WHERE row.businessCity = "San Mateo"
MERGE (b:Business {businessId: row.businessId})
  ON CREATE SET b.name     = row.businessName,
                b.city     = row.businessCity,
                b.state    = row.businessState,
                b.address  = row.businessAddress,
                b.location = point({latitude: toFloat(row.latitude), longitude: toFloat(row.longitude)})
MERGE (u:User {userId: row.userId})
  ON CREATE SET u.name = row.userName
MERGE (r:Review {reviewId: row.reviewId})
  ON CREATE SET r.text  = row.reviewText,
                r.stars = toFloat(row.reviewStars),
                r.date  = date(row.reviewDate)
MERGE (u)-[:WROTE]->(r)
MERGE (r)-[:REVIEWS]->(b)
WITH *
UNWIND split(row.categories, ",") AS cat
  MERGE (c:Category {name: cat})
  MERGE (c)<-[:IN_CATEGORY]-(b)

Next, we do the same for our Missoula businesses, first switching to the missoula database:

:use missoula

Then we run the same Cypher statement to load data for Missoula businesses, but we change the second line to filter for rows in the CSV file only for Missoula businesses:

LOAD CSV WITH HEADERS FROM "https://cdn.neo4jlabs.com/data/grandstack_businesses.csv" AS row
WITH row WHERE row.businessCity = "Missoula"
MERGE (b:Business {businessId: row.businessId})
  ON CREATE SET b.name     = row.businessName,
                b.city     = row.businessCity,
                b.state    = row.businessState,
                b.address  = row.businessAddress,
                b.location = point({latitude: toFloat(row.latitude), longitude: toFloat(row.longitude)})
MERGE (u:User {userId: row.userId})
  ON CREATE SET u.name = row.userName
MERGE (r:Review {reviewId: row.reviewId})
  ON CREATE SET r.text  = row.reviewText,
                r.stars = toFloat(row.reviewStars),
                r.date  = date(row.reviewDate)
MERGE (u)-[:WROTE]->(r)
MERGE (r)-[:REVIEWS]->(b)
WITH *
UNWIND split(row.categories, ",") AS cat
  MERGE (c:Category {name: cat})
  MERGE (c)<-[:IN_CATEGORY]-(b)
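Since the two import statements differ only in the city filter, the load could also be scripted with a single parameterized statement. The sketch below is one possible way to do that, under a few assumptions: TENANT_CITIES, importPlan, and loadAllTenants are hypothetical names, the city literal has been swapped for a $city parameter, and driver is expected to be a Neo4j JavaScript driver instance passed in by the caller.

```javascript
// Hypothetical mapping from database name to the city stored in it.
const TENANT_CITIES = { sanmateo: 'San Mateo', missoula: 'Missoula' };

// The import statement from the article, with the city literal replaced by $city.
const LOAD_STATEMENT = `
LOAD CSV WITH HEADERS FROM "https://cdn.neo4jlabs.com/data/grandstack_businesses.csv" AS row
WITH row WHERE row.businessCity = $city
MERGE (b:Business {businessId: row.businessId})
  ON CREATE SET b.name     = row.businessName,
                b.city     = row.businessCity,
                b.state    = row.businessState,
                b.address  = row.businessAddress,
                b.location = point({latitude: toFloat(row.latitude), longitude: toFloat(row.longitude)})
MERGE (u:User {userId: row.userId})
  ON CREATE SET u.name = row.userName
MERGE (r:Review {reviewId: row.reviewId})
  ON CREATE SET r.text  = row.reviewText,
                r.stars = toFloat(row.reviewStars),
                r.date  = date(row.reviewDate)
MERGE (u)-[:WROTE]->(r)
MERGE (r)-[:REVIEWS]->(b)
WITH *
UNWIND split(row.categories, ",") AS cat
  MERGE (c:Category {name: cat})
  MERGE (c)<-[:IN_CATEGORY]-(b)
`;

// Turn the mapping into one { database, params } entry per tenant.
function importPlan(tenantCities) {
  return Object.entries(tenantCities).map(([database, city]) => ({
    database,
    params: { city },
  }));
}

// Run the import once per tenant, each in a session bound to that tenant's database.
async function loadAllTenants(driver, tenantCities = TENANT_CITIES) {
  for (const { database, params } of importPlan(tenantCities)) {
    const session = driver.session({ database });
    try {
      await session.run(LOAD_STATEMENT, params);
    } finally {
      await session.close();
    }
  }
}
```

Keeping the tenant-to-city mapping in one place means adding a third city is a one-line change rather than another copy of the statement.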

Query With Drivers

Now we’re ready to query Neo4j. With multi-database in Neo4j we specify the database to use when we construct a session object from the Neo4j driver instance. We can then use this session object to run queries against the specified database. For example:

const neo4j = require('neo4j-driver');

const driver = neo4j.driver(
  'neo4j://localhost:7687',
  neo4j.auth.basic('neo4j', 'letmein')
);

// The database is chosen when the session is created
const session = driver.session({
  database: "sanmateo"
});

session.run('MATCH (b:Business) RETURN count(b) AS count')
    .then(res => console.log(res.records[0].get('count')))
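Because the database is bound to the session rather than to the driver, one driver instance can serve every tenant. As a rough illustration, the hypothetical helper below (countBusinesses is not a driver API) opens one session per database, runs the same count query in each, and closes the session afterwards. It assumes driver is a Neo4j JavaScript driver instance like the one created above.

```javascript
// Hypothetical helper: runs the same query against each named database.
async function countBusinesses(driver, databases) {
  const counts = {};
  for (const database of databases) {
    // A session is tied to exactly one database for its whole lifetime.
    const session = driver.session({ database });
    try {
      const result = await session.run(
        'MATCH (b:Business) RETURN count(b) AS count'
      );
      // The value is returned as-is; by default the driver uses its own
      // Integer type for Cypher integers.
      counts[database] = result.records[0].get('count');
    } finally {
      await session.close();
    }
  }
  return counts;
}
```

Calling countBusinesses(driver, ['sanmateo', 'missoula']) would then resolve to an object with one count per tenant database.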

Using Multidatabase With GraphQL

Next, we want to build a GraphQL API that exposes our data using neo4j-graphql.js, keeping our multitenancy use case in mind. We have a few options for exposing our multi-tenant graph via GraphQL, and we will take a look at each approach:

Just use the default database
Create separate GraphQL endpoints for each tenant
Create one GraphQL endpoint that can access both databases

Setup

First, creating our GraphQL schema and Neo4j driver instance is the same for each approach. We define our GraphQL type definitions and then use neo4j-graphql.js to generate a GraphQL API:

const { makeAugmentedSchema } = require('neo4j-graphql.js');
const { ApolloServer } = require('apollo-server');
const neo4j = require('neo4j-driver');

const typeDefs = `
type User {
  name: String!
  wrote: [Review] @relation(name: "WROTE", direction: "OUT")
}

type Review {
  date: Date!
  reviewId: String!
  stars: Float!
  text: String
  reviews: [Business] @relation(name: "REVIEWS", direction: "OUT")
  users: [User] @relation(name: "WROTE", direction: "IN")
}

type Category {
  name: String!
  business: [Business] @relation(name: "IN_CATEGORY", direction: "IN")
}

type Business {
  address: String!
  city: String!
  location: Point!
  name: String!
  state: String!
  in_category: [Category] @relation(name: "IN_CATEGORY", direction: "OUT")
  reviews: [Review] @relation(name: "REVIEWS", direction: "IN")
}
`;

const schema = makeAugmentedSchema({ typeDefs });

const driver = neo4j.driver(
  'neo4j://localhost:7687',
  neo4j.auth.basic('neo4j', 'letmein')
);

Multiple databases can be used with neo4j-graphql.js by specifying a value in the GraphQL resolver context. If no value is specified for context.neo4jDatabase then the default database is used.

Using The Default Database

If we don’t specify a database then the default database (as specified in neo4j.conf) will be used when we execute our GraphQL queries. This isn’t helpful for our current multi-tenant use case but ensures we have a predictable fallback if we don’t specify a database in the context object:

const server = new ApolloServer({ schema, context: { driver } });

server.listen(3003, '0.0.0.0').then(({ url }) => {
  console.log(`GraphQL API ready at ${url}`);
});

Create Two GraphQL Endpoints

The next option is to create separate GraphQL endpoints for each tenant. In this example, our client application would be responsible for knowing which GraphQL endpoint to connect to.

We use Apollo Server to serve the GraphQL schema generated by neo4j-graphql.js. The Neo4j database to be used is specified in the GraphQL resolver context object. The context object is passed to each resolver and neo4j-graphql.js at a minimum expects a Neo4j JavaScript driver instance under the driver key.

To specify the Neo4j database to be used, provide a value in the context object, under the key neo4jDatabase that evaluates to a string representing the desired database. If no value is provided then the default Neo4j database will be used.
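To make the fallback behavior concrete, here is a small sketch of how a context value could be turned into driver session options. This is not the neo4j-graphql.js source: sessionOptionsFor is a hypothetical function written only to illustrate the rule described above, where a missing neo4jDatabase means the default database is used.

```javascript
// Hypothetical illustration of the rule, not library code.
function sessionOptionsFor(context) {
  const database = context.neo4jDatabase;
  // With no database named, empty options let the driver use the default database.
  return database ? { database: String(database) } : {};
}

console.log(sessionOptionsFor({ neo4jDatabase: 'missoula' }));
console.log(sessionOptionsFor({}));
```

The resulting object has the same shape as the argument passed to driver.session in the driver example earlier in the post.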

Here we run a GraphQL endpoint for our sanmateo database on port 3003 and a separate GraphQL endpoint for the missoula database on port 3004:

const sanmateoServer = new ApolloServer({
  schema,
  context: { driver, neo4jDatabase: 'sanmateo' }
});

sanmateoServer.listen(3003, '0.0.0.0').then(({ url }) => {
  console.log(`San Mateo GraphQL API ready at ${url}`);
});

const missoulaServer = new ApolloServer({
  schema,
  context: { driver, neo4jDatabase: 'missoula' }
});

missoulaServer.listen(3004, '0.0.0.0').then(({ url }) => {
  console.log(`Missoula GraphQL API ready at ${url}`);
});

Here we execute the same query against our separate databases.
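If the number of tenants grows, repeating this block for every database gets tedious. One option, sketched below under the assumption that each tenant simply gets its own port, is to derive the server options from a small table. TENANT_PORTS and tenantEndpoints are hypothetical names, and schema and driver are the objects created in the setup step.

```javascript
// Hypothetical table of tenant databases and the port each endpoint listens on.
const TENANT_PORTS = { sanmateo: 3003, missoula: 3004 };

// Build one { port, serverOptions } entry per tenant database.
function tenantEndpoints(schema, driver, tenantPorts = TENANT_PORTS) {
  return Object.entries(tenantPorts).map(([database, port]) => ({
    port,
    serverOptions: { schema, context: { driver, neo4jDatabase: database } },
  }));
}
```

Each entry can then be started with new ApolloServer(serverOptions).listen(port, '0.0.0.0'), exactly as in the hand-written version.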

Create One GraphQL Endpoint

We can also use a function to define the context object. This allows us to use a value from the request header or some middleware process to specify the Neo4j database.

Here we use the value of the request header x-database for the Neo4j database:

const server = new ApolloServer({
  schema,
  context: ({ req }) => {
    return { driver, neo4jDatabase: req.headers['x-database'] };
  }
});

server.listen(3003, '0.0.0.0').then(({ url }) => {
  console.log(`GraphQL API ready at ${url}`);
});
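One thing to keep in mind with this approach is that the header value comes straight from the client, so a request could name any database, including system. A minimal sketch of guarding against that is to check the header against a list of known tenants before building the context. TENANT_DATABASES and databaseFromHeaders are hypothetical names introduced for this example, and a real application would more likely derive the tenant from an authenticated user than from a raw header.

```javascript
// Hypothetical allow-list of tenant databases that clients may query.
const TENANT_DATABASES = new Set(['sanmateo', 'missoula']);

// Return the requested database only if it is a known tenant; otherwise throw.
function databaseFromHeaders(headers, allowed = TENANT_DATABASES) {
  const requested = headers['x-database'];
  if (typeof requested !== 'string') {
    throw new Error('Missing x-database header');
  }
  const database = requested.toLowerCase();
  if (!allowed.has(database)) {
    throw new Error('Unknown tenant database');
  }
  return database;
}
```

The context function could then return { driver, neo4jDatabase: databaseFromHeaders(req.headers) }, so a request naming an unknown database fails instead of reaching Neo4j.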

By specifying the Neo4j database to be used in the context object we have the flexibility to implement multi-tenancy as we best see fit for our application, whether that is by creating multiple GraphQL endpoints, using request headers, or using middleware to determine the database at query time.

Resources

You can read more about using multi-database with Neo4j in the Neo4j manual. See the docs for more information about neo4j-graphql.js, including multi-database support.

The code for this GraphQL example can be found on GitHub.
