Multi-Tenant GraphQL With Neo4j 4.0
A Look At Using Neo4j 4.0 Multidatabase With neo4j-graphql.js
Neo4j 4.0 introduces support for multiple active databases, which can enable use cases like multitenancy where we have one database per tenant, ensuring each tenant’s data is kept separate.
In this post we take a look at using multiple databases in Neo4j 4.0, then show how to use multiple active databases with the Neo4j GraphQL integration neo4j-graphql.js, building a multi-tenant GraphQL API.
Creating Multiple Active Databases
There are several ways to approach multitenancy and different ways to use multiple databases in Neo4j. In this example, we’ll work with data about businesses and user reviews. We’ll define a “tenant” based on the city in which the business is located, storing information about businesses in different cities in different databases.
First, we’ll create a new Neo4j 4.0 graph in Neo4j Desktop. Multiple active database support is new in Neo4j 4.0 so be sure you have the latest version to try it out.
There are two databases created when you first start Neo4j: system and neo4j. The system database contains metadata about the database management system, and we use it to perform administrative functions like creating new databases and users and managing security configuration. When we open Neo4j Browser for the first time, we connect to the neo4j database by default. The :use command allows us to switch between databases.
Let’s connect to the system database and create two new databases, one named sanmateo and the other named missoula. First, we switch to the system database:
:use system
Then execute two CREATE DATABASE commands:
CREATE DATABASE sanmateo;
CREATE DATABASE missoula;
We can verify our new databases have been created by running SHOW DATABASES against the system database:
We can see the new databases we’ve created, as well as the system and neo4j databases. In addition, we can see that neo4j is the default database. Let’s delete the neo4j database:
DROP DATABASE neo4j;
And now let’s set sanmateo as our default database by setting dbms.default_database=sanmateo in the neo4j.conf configuration file. We can verify our changes by running SHOW DATABASES again.
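The database setup above can also be scripted through the JavaScript driver, which may be convenient when tenants are added over time. The following is a minimal sketch, not part of the original walkthrough: createTenantDatabases and the TENANT_NAME pattern are hypothetical names, the name check is deliberately stricter than Neo4j's own naming rules, and driver is assumed to be a neo4j-driver instance created elsewhere.

```javascript
// Hypothetical helper: create one database per tenant through the driver.
// `driver` is assumed to be a neo4j-driver instance (neo4j.driver(...)).

// Conservative name check (an assumption, stricter than Neo4j requires):
// 3 to 63 characters, a lowercase letter first, then lowercase letters or digits.
const TENANT_NAME = /^[a-z][a-z0-9]{2,62}$/;

async function createTenantDatabases(driver, names) {
  // Validate first, because each name is interpolated into the Cypher text.
  for (const name of names) {
    if (!TENANT_NAME.test(name)) {
      throw new Error(`Invalid tenant database name: ${name}`);
    }
  }

  // Administrative commands run against the system database.
  const session = driver.session({ database: 'system' });
  try {
    for (const name of names) {
      await session.run(`CREATE DATABASE ${name}`);
    }
    // Return the database names the server now reports.
    const result = await session.run('SHOW DATABASES');
    return result.records.map(record => record.get('name'));
  } finally {
    await session.close();
  }
}

// Usage (assuming a driver created as shown later in this post):
// createTenantDatabases(driver, ['sanmateo', 'missoula']).then(console.log);
```

Validating before interpolating keeps arbitrary input out of the administrative command text.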
Loading Our Business Reviews Data
Now let’s load some data into our two databases. Our data contains businesses, categories, reviews, and the users who have written the reviews. We’ll load data from a CSV file, filtering by city to make sure we load the businesses in the appropriate database.
First, we’ll switch to the sanmateo database:
:use sanmateo
Then run a Cypher statement to read a CSV and create our data in the database, filtering for only businesses in San Mateo:
LOAD CSV WITH HEADERS FROM "https://cdn.neo4jlabs.com/data/grandstack_businesses.csv" AS row
WITH row WHERE row.businessCity = "San Mateo"
MERGE (b:Business {businessId: row.businessId})
  ON CREATE SET b.name = row.businessName,
    b.city = row.businessCity,
    b.state = row.businessState,
    b.address = row.businessAddress,
    b.location = point({latitude: toFloat(row.latitude), longitude: toFloat(row.longitude)})
MERGE (u:User {userId: row.userId})
  ON CREATE SET u.name = row.userName
MERGE (r:Review {reviewId: row.reviewId})
  ON CREATE SET r.text = row.reviewText,
    r.stars = toFloat(row.reviewStars),
    r.date = date(row.reviewDate)
MERGE (u)-[:WROTE]->(r)
MERGE (r)-[:REVIEWS]->(b)
WITH *
UNWIND split(row.categories, ",") AS cat
MERGE (c:Category {name: cat})
MERGE (c)<-[:IN_CATEGORY]-(b)
Next, we do the same for our Missoula businesses, first switching to the missoula database:
:use missoula
Then we run the same Cypher statement to load data for Missoula businesses, changing only the second line so that it keeps just the CSV rows for businesses in Missoula:
LOAD CSV WITH HEADERS FROM "https://cdn.neo4jlabs.com/data/grandstack_businesses.csv" AS row
WITH row WHERE row.businessCity = "Missoula"
MERGE (b:Business {businessId: row.businessId})
  ON CREATE SET b.name = row.businessName,
    b.city = row.businessCity,
    b.state = row.businessState,
    b.address = row.businessAddress,
    b.location = point({latitude: toFloat(row.latitude), longitude: toFloat(row.longitude)})
MERGE (u:User {userId: row.userId})
  ON CREATE SET u.name = row.userName
MERGE (r:Review {reviewId: row.reviewId})
  ON CREATE SET r.text = row.reviewText,
    r.stars = toFloat(row.reviewStars),
    r.date = date(row.reviewDate)
MERGE (u)-[:WROTE]->(r)
MERGE (r)-[:REVIEWS]->(b)
WITH *
UNWIND split(row.categories, ",") AS cat
MERGE (c:Category {name: cat})
MERGE (c)<-[:IN_CATEGORY]-(b)
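Since the two import statements differ only in the city filter, the load could also be driven from code with the city passed as a Cypher parameter. This is a sketch under assumptions: loadCity and LOAD_BUSINESSES are hypothetical names, and driver is assumed to be a neo4j-driver instance created elsewhere. The statement itself is the one shown above, with the city literal replaced by $city.

```javascript
// The import statement from above, with the city literal replaced by $city.
const LOAD_BUSINESSES = `
LOAD CSV WITH HEADERS FROM "https://cdn.neo4jlabs.com/data/grandstack_businesses.csv" AS row
WITH row WHERE row.businessCity = $city
MERGE (b:Business {businessId: row.businessId})
  ON CREATE SET b.name = row.businessName,
    b.city = row.businessCity,
    b.state = row.businessState,
    b.address = row.businessAddress,
    b.location = point({latitude: toFloat(row.latitude), longitude: toFloat(row.longitude)})
MERGE (u:User {userId: row.userId})
  ON CREATE SET u.name = row.userName
MERGE (r:Review {reviewId: row.reviewId})
  ON CREATE SET r.text = row.reviewText,
    r.stars = toFloat(row.reviewStars),
    r.date = date(row.reviewDate)
MERGE (u)-[:WROTE]->(r)
MERGE (r)-[:REVIEWS]->(b)
WITH *
UNWIND split(row.categories, ",") AS cat
MERGE (c:Category {name: cat})
MERGE (c)<-[:IN_CATEGORY]-(b)
`;

// Hypothetical helper: load one city's businesses into one tenant database.
async function loadCity(driver, database, city) {
  // The target database is chosen when the session is created.
  const session = driver.session({ database });
  try {
    await session.run(LOAD_BUSINESSES, { city });
  } finally {
    await session.close();
  }
}

// Usage (assuming an existing driver instance):
// await loadCity(driver, 'sanmateo', 'San Mateo');
// await loadCity(driver, 'missoula', 'Missoula');
```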
Query With Drivers
Now we’re ready to query Neo4j. With multi-database in Neo4j we specify the database to use when we construct a session object from the Neo4j driver instance. We can then use this session object to run queries against the specified database. For example:
const neo4j = require('neo4j-driver');

const driver = neo4j.driver(
  'neo4j://localhost:7687',
  neo4j.auth.basic('neo4j', 'letmein')
);

// The database is chosen per session, not per driver
const session = driver.session({
  database: "sanmateo"
});

session.run('MATCH (b:Business) RETURN count(b) AS count')
  .then(res => console.log(res.records[0].get('count')))
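Because the database is selected per session, one driver instance can serve every tenant. The sketch below runs the same count query against several databases and closes each session when done. countBusinesses and countPerTenant are hypothetical helper names, and driver is assumed to be a neo4j-driver instance like the one above.

```javascript
// Hypothetical helper: count Business nodes in one tenant database.
async function countBusinesses(driver, database) {
  const session = driver.session({ database });
  try {
    const result = await session.run(
      'MATCH (b:Business) RETURN count(b) AS count'
    );
    const count = result.records[0].get('count');
    // By default the driver returns Cypher integers as its own Integer type,
    // so convert to a plain number when a toNumber method is available.
    return typeof count.toNumber === 'function' ? count.toNumber() : count;
  } finally {
    // Always release the session, even if the query fails.
    await session.close();
  }
}

// Hypothetical helper: run the count once per tenant, reusing one driver.
async function countPerTenant(driver, databases) {
  const counts = {};
  for (const database of databases) {
    counts[database] = await countBusinesses(driver, database);
  }
  return counts;
}

// Usage (assuming the driver created above):
// countPerTenant(driver, ['sanmateo', 'missoula']).then(console.log);
```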
Using Multidatabase With GraphQL
Next, we want to build a GraphQL API that exposes our data using neo4j-graphql.js, keeping our multitenancy use case in mind. There are a few ways to expose our multi-tenant graph via GraphQL, and we will look at each approach:
- Just use the default database
- Create separate GraphQL endpoints for each tenant
- Create one GraphQL endpoint that can access both databases
Setup
First, creating our GraphQL schema and Neo4j driver instance is the same for each approach. We define our GraphQL type definitions and then use neo4j-graphql.js to generate a GraphQL API:
const { makeAugmentedSchema } = require('neo4j-graphql.js');
const { ApolloServer } = require('apollo-server');
const neo4j = require('neo4j-driver');

const typeDefs = `
type User {
  name: String!
  wrote: [Review] @relation(name: "WROTE", direction: "OUT")
}

type Review {
  date: Date!
  reviewId: String!
  stars: Float!
  text: String
  reviews: [Business] @relation(name: "REVIEWS", direction: "OUT")
  users: [User] @relation(name: "WROTE", direction: "IN")
}

type Category {
  name: String!
  business: [Business] @relation(name: "IN_CATEGORY", direction: "IN")
}

type Business {
  address: String!
  city: String!
  location: Point!
  name: String!
  state: String!
  in_category: [Category] @relation(name: "IN_CATEGORY", direction: "OUT")
  reviews: [Review] @relation(name: "REVIEWS", direction: "IN")
}
`;

const schema = makeAugmentedSchema({ typeDefs });

const driver = neo4j.driver(
  'neo4j://localhost:7687',
  neo4j.auth.basic('neo4j', 'letmein')
);
Multiple databases can be used with neo4j-graphql.js by specifying a value in the GraphQL resolver context. If no value is specified for context.neo4jDatabase then the default database is used.
Using The Default Database
If we don’t specify a database then the default database (as specified in neo4j.conf) will be used when we execute our GraphQL queries. This isn’t helpful for our current multi-tenant use case but ensures we have a predictable fallback if we don’t specify a database in the context object:
const server = new ApolloServer({ schema, context: { driver } });

server.listen(3003, '0.0.0.0').then(({ url }) => {
  console.log(`GraphQL API ready at ${url}`);
});
Create Two GraphQL Endpoints
The next option is to create separate GraphQL endpoints for each tenant. In this example, our client application would be responsible for knowing which GraphQL endpoint to connect to.
We use Apollo Server to serve the GraphQL schema generated by neo4j-graphql.js. The Neo4j database to be used is specified in the GraphQL resolver context object. The context object is passed to each resolver and neo4j-graphql.js at a minimum expects a Neo4j JavaScript driver instance under the driver key.
To specify the Neo4j database to be used, provide a value in the context object, under the key neo4jDatabase that evaluates to a string representing the desired database. If no value is provided then the default Neo4j database will be used.
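A hand-written resolver that opens its own session could follow the same convention. The sketch below only illustrates the fallback rule described above; it is not the code neo4j-graphql.js uses internally, and openSession is a hypothetical helper name.

```javascript
// Hypothetical helper for custom resolvers: open a session that honours
// the neo4jDatabase key in the context object, if one is present.
function openSession(context) {
  const { driver, neo4jDatabase } = context;
  return neo4jDatabase
    ? driver.session({ database: neo4jDatabase }) // tenant database
    : driver.session();                           // server default database
}

// Usage inside a custom resolver (sketch):
// const session = openSession(context);
// try { /* run queries */ } finally { await session.close(); }
```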
Here we run a GraphQL endpoint for our sanmateo database on port 3003 and a separate GraphQL endpoint for the missoula database on port 3004:
const sanmateoServer = new ApolloServer({
  schema,
  context: { driver, neo4jDatabase: 'sanmateo' }
});

sanmateoServer.listen(3003, '0.0.0.0').then(({ url }) => {
  console.log(`San Mateo GraphQL API ready at ${url}`);
});

const missoulaServer = new ApolloServer({
  schema,
  context: { driver, neo4jDatabase: 'missoula' }
});

missoulaServer.listen(3004, '0.0.0.0').then(({ url }) => {
  console.log(`Missoula GraphQL API ready at ${url}`);
});
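With more than a couple of tenants, the per-tenant servers could be generated from a list rather than written out by hand. This is one possible sketch: TENANTS and tenantServerOptions are hypothetical names, and schema, driver and ApolloServer are assumed to come from the setup code above.

```javascript
// Hypothetical tenant list: one database and one port per tenant.
const TENANTS = [
  { database: 'sanmateo', port: 3003 },
  { database: 'missoula', port: 3004 }
];

// Build the ApolloServer constructor options for each tenant. Every tenant
// shares the schema and driver; only neo4jDatabase differs.
function tenantServerOptions(schema, driver, tenants) {
  return tenants.map(({ database, port }) => ({
    port,
    serverOptions: { schema, context: { driver, neo4jDatabase: database } }
  }));
}

// Usage (assuming ApolloServer, schema and driver from the setup above):
// for (const { port, serverOptions } of tenantServerOptions(schema, driver, TENANTS)) {
//   new ApolloServer(serverOptions)
//     .listen(port, '0.0.0.0')
//     .then(({ url }) => console.log(`GraphQL API ready at ${url}`));
// }
```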
Create One GraphQL Endpoint
We can also use a function to define the context object. This allows us to use a value from the request header or some middleware process to specify the Neo4j database.
Here we use the value of the request header x-database for the Neo4j database:
const server = new ApolloServer({
  schema,
  context: ({ req }) => {
    return { driver, neo4jDatabase: req.headers['x-database'] };
  }
});

server.listen(3003, '0.0.0.0').then(({ url }) => {
  console.log(`GraphQL API ready at ${url}`);
});
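Passing a header value straight through lets any client name any database, so in practice the value would likely be checked first. The sketch below wraps the context function with an allow-list. tenantContext is a hypothetical helper name, and the choice to fall back to the default database when the header is absent is an assumption that mirrors the fallback behavior described earlier.

```javascript
// Hypothetical helper: build an Apollo Server context function that only
// accepts database names from a known list of tenants.
function tenantContext(driver, allowedDatabases) {
  const allowed = new Set(allowedDatabases);
  return ({ req }) => {
    const requested = req.headers['x-database'];
    if (requested === undefined) {
      // No header: leave neo4jDatabase unset so the default database is used.
      return { driver };
    }
    if (!allowed.has(requested)) {
      // Unknown tenant: fail the request rather than query another database.
      throw new Error(`Unknown tenant database: ${requested}`);
    }
    return { driver, neo4jDatabase: requested };
  };
}

// Usage (assuming schema, driver and ApolloServer from the setup above):
// const server = new ApolloServer({
//   schema,
//   context: tenantContext(driver, ['sanmateo', 'missoula'])
// });
```

For stricter isolation, the missing-header case could throw as well instead of falling back to the default database.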
By specifying the Neo4j database to be used in the context object we have the flexibility to implement multi-tenancy as we best see fit for our application, whether that is by creating multiple GraphQL endpoints, using request headers, or using middleware to determine the database at query time.
Resources
You can read more about using multi-database with Neo4j in the Neo4j manual. See the docs for more information about neo4j-graphql.js, including multi-database support.
The code for this GraphQL example can be found on GitHub.