How to Automate Neo4j Deploys on Google Cloud Platform (GCP)

Published in Neo4j Developer Blog, Feb 21, 2019 (6 min read)

EDIT MAY 2020: The official Neo4j documentation now includes updated versions of these instructions.

— — — — — — — —

Neo4j already documents on its site how to deploy Neo4j to common clouds, including Google Cloud. In this article, I’ll provide sample shell scripts that can do this automatically for you.

These are useful when you want to integrate Neo4j into your CI/CD pipeline and be able to create/destroy instances temporarily, and also just to spin up a sample instance. But really, if you can automate Neo4j deployment, then any other piece of software can make a Neo4j instance whenever it needs, which is extremely handy.

If you have questions or feedback, or want to discuss, drop by this thread on the Neo4j Community site!

Neo4j and Google Cloud Deployment Manager

Requirements

Before we begin, you’ll need the gcloud command line interface program, which you can download and install with directions here. The gcloud CLI is the main way you can automate all things with GCP.

It will also be necessary to authenticate your gcloud CLI, to make sure it can interact with your GCP projects.
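
As a small pre-flight sketch, a deploy script can check that the tools it depends on are on the PATH before doing anything else. The helper name require_cli is hypothetical and is not part of gcloud.

```shell
#!/bin/bash
# require_cli is a hypothetical helper name, not part of gcloud.
# It succeeds if the named program is on the PATH; otherwise it
# prints a message to stderr and returns 1.
require_cli() {
  local tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    return 0
  fi
  echo "Required tool not found on PATH: $tool" >&2
  return 1
}
```

A script would typically start with `require_cli gcloud || exit 1`. For authentication, `gcloud auth login` starts the interactive login flow, and `gcloud config set project my-gcp-project-id` sets a default project for later commands.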

Google Cloud Deployment Manager

Neo4j provides Deployment Manager templates for Neo4j Causal Cluster (highly available clusters), and VM images for Neo4j Enterprise stand-alone. So first things first: pick which one you would like to deploy. We’ll cover both in this article.

Deployment Manager is really just a recipe for GCP that tells it how to deploy a whole set of interrelated resources. By deploying all of this as a stack we can keep all of our resources together, and delete just one thing when we’re done.

Approach

Deploying a cluster is simple: we submit a new Deployment Manager job, pointing to the right template URL to tell GCP what to deploy. We then provide various parameters to control how much hardware we’re using and so on.

We’ll need to specify several common parameters, which you’ll see in the scripts below. Here’s an explanation of what they are.

Machine Type: This is the GCP machine type you want to launch, which controls how much hardware you’re giving the database.
Boot Disk Size/Type: you can use these parameters to control whether Neo4j uses standard spinning magnetic platters (pd-standard) or SSD disks (pd-ssd) as well as how many GB of storage you want to allocate. Note that with some disk sizes, GCP will warn that the root partition type may need to be resized if the underlying OS does not support the disk size. This warning can be ignored, because the underlying OS will recognize any size disk you give it without trouble.
Zone: Where in the world you want to deploy Neo4j. The template supports any GCP zone, but in our example we’ll use us-east1-b.
Project: This simply specifies which project ID you want to deploy into within GCP.
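
Before handing these parameters to GCP, it can help to catch obvious typos locally. The sketch below is only a shape check under stated assumptions: validate_params is a hypothetical helper, it accepts only the two disk types mentioned above, it expects the disk size as a bare number of GB (the form used for Deployment Manager), and the zone pattern is a heuristic rather than an official rule.

```shell
#!/bin/bash
# validate_params is a hypothetical helper. It checks only the shape of
# the values, not whether GCP will actually accept them.
# Usage: validate_params MACHINE DISK_TYPE DISK_SIZE_GB ZONE
validate_params() {
  local machine="$1" disk_type="$2" disk_size="$3" zone="$4"
  # Heuristic assumption: zones look like us-east1-b.
  local zone_re='^[a-z]+-[a-z]+[0-9]+-[a-z]$'

  if [ -z "$machine" ]; then
    echo "Machine type must not be empty" >&2
    return 1
  fi

  # Assumption: only the two disk types discussed in this article.
  case "$disk_type" in
    pd-standard|pd-ssd) ;;
    *) echo "Unknown disk type: $disk_type" >&2; return 1 ;;
  esac

  # Disk size must consist of digits only and be greater than zero.
  case "$disk_size" in
    ''|*[!0-9]*) echo "Disk size must be a whole number of GB: $disk_size" >&2; return 1 ;;
  esac
  if [ "$disk_size" -le 0 ]; then
    echo "Disk size must be greater than zero" >&2
    return 1
  fi

  if ! [[ $zone =~ $zone_re ]]; then
    echo "Zone does not look like a GCP zone: $zone" >&2
    return 1
  fi
  return 0
}
```

For example, `validate_params n1-standard-2 pd-ssd 64 us-east1-b` succeeds, while passing `64GB` as the size is rejected.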

Let’s get started. Simply run any of these scripts, and it will deploy a Deployment Manager stack.

Deploying Neo4j Enterprise Causal Cluster

In addition to the parameters listed above, because this is a clustered deploy, take note that we’re also using a parameter for “Cores” and “Read Replicas” to control how many nodes are in our cluster.

#!/bin/bash
export NAME=neo4j-cluster
PROJECT=my-gcp-project-ID
MACHINE=n1-standard-2
DISK_TYPE=pd-ssd
DISK_SIZE=64
ZONE=us-east1-b
CORES=3
READ_REPLICAS=0
NEO4J_VERSION=3.5.3
TEMPLATE_URL=https://storage.googleapis.com/neo4j-deploy/$NEO4J_VERSION/causal-cluster/neo4j-causal-cluster.jinja

OUTPUT=$(gcloud deployment-manager deployments create $NAME \
   --project $PROJECT \
   --template "$TEMPLATE_URL" \
   --properties "zone:'$ZONE',clusterNodes:'$CORES',readReplicas:'$READ_REPLICAS',bootDiskSizeGb:$DISK_SIZE,bootDiskType:'$DISK_TYPE',machineType:'$MACHINE'")

echo $OUTPUT

PASSWORD=$(echo $OUTPUT | perl -ne 'm/password\s+([^\s]+)/; print $1;')
IP=$(echo $OUTPUT | perl -ne 'm/vm1URL\s+https:\/\/([^\s]+):/; print $1; ')
echo NEO4J_URI=bolt+routing://$IP
echo NEO4J_PASSWORD=$PASSWORD
echo STACK_NAME=$NAME

After all of the setup where we declare parameters of what we’re deploying, the heart of the entire script is just one simple call to gcloud deployment-manager deployments create which does all of the work.
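
The --properties value is the one fiddly part of that call, since it mixes quoted and unquoted values in a single string. A minimal sketch of assembling it separately, so it can be printed and reviewed before anything is deployed, might look like this. The function name build_properties is hypothetical; the property keys are the ones used in the script above.

```shell
#!/bin/bash
# build_properties is a hypothetical helper that assembles the value for
# --properties in the same key:value format as the cluster script.
# Usage: build_properties ZONE CORES READ_REPLICAS DISK_SIZE_GB DISK_TYPE MACHINE
build_properties() {
  local zone="$1" cores="$2" replicas="$3" disk_size="$4" disk_type="$5" machine="$6"
  printf "zone:'%s',clusterNodes:'%s',readReplicas:'%s',bootDiskSizeGb:%s,bootDiskType:'%s',machineType:'%s'\n" \
    "$zone" "$cores" "$replicas" "$disk_size" "$disk_type" "$machine"
}

# Dry run: print the command instead of executing it.
PROPS=$(build_properties us-east1-b 3 0 64 pd-ssd n1-standard-2)
echo "gcloud deployment-manager deployments create neo4j-cluster --properties \"$PROPS\""
```

Keeping the string construction in one place makes it easier to spot a missing quote or comma, which is otherwise only reported once the deployment request is sent.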

We capture the result in the variable OUTPUT, which contains a lot of text telling us about our deployment. We then process that with a little bit of perl to pull out the password and IP address of our new deployment, because it will have a strong randomly assigned password.
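
If perl is not available, the same extraction can be sketched with bash's built-in regex matching. This is an alternative to the perl one-liners, not what the script above does. The SAMPLE text is made up for illustration in the shape those patterns expect (a key, whitespace, then the value) and is not real gcloud output.

```shell
#!/bin/bash
# SAMPLE is made-up text in the shape the patterns expect.
# It is not real gcloud output; the address is from a documentation range.
SAMPLE='password s3cretExample vm1URL https://203.0.113.10:7473/'

# Same patterns as the perl version, written as POSIX extended regexes.
password_re='password[[:space:]]+([^[:space:]]+)'
url_re='vm1URL[[:space:]]+https://([^[:space:]]+):'

PASSWORD=""
IP=""
if [[ $SAMPLE =~ $password_re ]]; then
  PASSWORD="${BASH_REMATCH[1]}"   # first capture group: the password token
fi
if [[ $SAMPLE =~ $url_re ]]; then
  IP="${BASH_REMATCH[1]}"         # first capture group: host before the port
fi

echo "NEO4J_URI=bolt+routing://$IP"
echo "NEO4J_PASSWORD=$PASSWORD"
```

In a real script, SAMPLE would be replaced by the captured OUTPUT variable.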

What is nice about the Google deployment process is that this command will block and not succeed until the entire stack has been deployed and is ready. This means by the time you get that IP address back, you’re ready to go, and don’t need to check if Neo4j is up, because it is!

If you lose these stack outputs (IP, password and so on) they will also appear in your Deployment Manager window within the GCP console, so you can refer back to them.

To delete a deployment created in this way, you just need to take note of the STACK_NAME that we deployed. I use a short script to delete deployments like this:

#!/bin/bash
PROJECT=my-google-project-id

if [ -z "$1" ] ; then
   echo "Usage: call me with deployment name"
   exit 1
fi

gcloud -q deployment-manager deployments delete $1 --project $PROJECT

# OPTIONAL!  Destroy the disk
# gcloud --quiet compute disks delete $(gcloud compute disks list --project $PROJECT --filter="name~'$1'" --uri)

When you delete Neo4j stacks on GCP, they intentionally leave their GCP disks behind, to make it hard for you to accidentally destroy your valuable data. But because these disks are left behind, you may wish to uncomment that last line, which will clean up those disks if the deploy is truly temporary and the disks aren’t wanted.
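
Because disk deletion cannot be undone, one cautious option is to print the teardown commands first and run them by hand once they look right. The sketch below only prints text; teardown_plan is a hypothetical helper, and the gcloud commands it prints are the ones from the delete script above.

```shell
#!/bin/bash
# teardown_plan is a hypothetical helper. It only prints commands;
# nothing is deleted until you run them yourself.
# Usage: teardown_plan DEPLOYMENT_NAME PROJECT [yes|no]
teardown_plan() {
  local name="$1" project="$2" purge_disks="${3:-no}"
  echo "gcloud -q deployment-manager deployments delete $name --project $project"
  if [ "$purge_disks" = "yes" ]; then
    # Lists the leftover disks whose names match the deployment name,
    # so they can be reviewed before being passed to a delete command.
    echo "gcloud compute disks list --project $project --filter=\"name~'$name'\" --uri"
  fi
}

teardown_plan neo4j-cluster my-google-project-id yes
```

Reviewing the disk list before deleting is useful because the name filter is a pattern match and may pick up disks from similarly named deployments.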

Deploying Neo4j Enterprise (or Community) Stand Alone

This will create a single instance of Neo4j without high-availability failover capabilities, but it’s a very fast way to get started. For this deploy, we don’t use Deployment Manager but just create a simple VM and configure its firewall/security rules.

Because we’re not using Deployment Manager for this one, this also provides an example of polling and waiting until the VM service comes up, and then changing the Neo4j default password when it does. You’ll notice at the top of the script we choose a random password by running some random bytes through a hash.
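
One portability note on the password step: md5 is the macOS/BSD command name, while most Linux systems ship md5sum instead. A sketch that avoids both by hex-encoding random bytes with od and tr is shown below. The function name random_password is hypothetical, and this is an alternative to the hashing approach rather than what the script itself does.

```shell
#!/bin/bash
# random_password is a hypothetical helper. It reads 16 random bytes and
# prints them as 32 lowercase hex characters, using only head, od and tr.
random_password() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

PASSWORD=$(random_password)
echo "Generated a password of ${#PASSWORD} characters"
```

The result contains only the characters 0-9 and a-f, so it is safe to embed in the single-quoted Cypher string used when changing the password.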

The launcher-public project on GCP hosts Neo4j’s VM images for GCP. In this example we’re using neo4j-enterprise-1-3-5-3-apoc, but other versions are available too. By substituting a different image name here, you can use this same technique to run Neo4j Community.

#!/bin/bash
export PROJECT=my-gcp-project-id
export MACHINE=n1-standard-2
export DISK_TYPE=pd-ssd
export DISK_SIZE=64GB
export ZONE=us-east1-b
export NEO4J_VERSION=3.5.3
export PASSWORD=$(head -n 20 /dev/urandom | md5)
export STACK_NAME=neo4j-standalone
export IMAGE=neo4j-enterprise-1-3-5-3-apoc

# Setup firewalling.
echo "Creating firewall rules"
gcloud compute firewall-rules create "$STACK_NAME" \
    --allow tcp:7473,tcp:7687 \
    --source-ranges 0.0.0.0/0 \
    --target-tags neo4j \
    --project $PROJECT

if [ $? -ne 0 ] ; then
   echo "Firewall creation failed.  Bailing out"
   exit 1
fi

echo "Creating instance"

OUTPUT=$(gcloud compute instances create $STACK_NAME \
   --project $PROJECT \
   --image $IMAGE \
   --tags neo4j \
   --machine-type $MACHINE \
   --boot-disk-size $DISK_SIZE \
   --boot-disk-type $DISK_TYPE \
   --image-project launcher-public)

echo $OUTPUT

# Pull out the IP addresses, and toss out the private internal one (10.*)
IP=$(echo $OUTPUT | grep -oE '((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])' | grep --invert-match "^10\.")

echo "Discovered new machine IP at $IP"

tries=0

while true ; do
   OUTPUT=$(echo "CALL dbms.changePassword('$PASSWORD');" | cypher-shell -a $IP -u neo4j -p "neo4j" 2>&1)
   EC=$?
   echo $OUTPUT

   if [ $EC -eq 0 ]; then
     echo "Machine is up ... $tries tries"
     break
   fi

   if [ $tries -gt 30 ] ; then
     echo STACK_NAME=$STACK_NAME
     echo "Machine is not coming up, giving up"
     exit 1
   fi

   tries=$(($tries+1))
   echo "Machine is not up yet ... $tries tries"
   sleep 1;
done

echo NEO4J_URI=bolt://$IP:7687
echo NEO4J_PASSWORD=$PASSWORD
echo STACK_NAME=$STACK_NAME
exit 0
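
The polling loop in the script above follows a general retry pattern that can be factored into a reusable function. The sketch below is one way to do that, not the script's own code. The names retry and flaky are hypothetical, and flaky is a stand-in for the cypher-shell call so the example runs without a database.

```shell
#!/bin/bash
# retry is a hypothetical helper: run a command until it succeeds or the
# attempt limit is reached, sleeping between attempts.
# Usage: retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_tries="$1" delay="$2"
  shift 2
  local tries=1
  until "$@"; do
    if [ "$tries" -ge "$max_tries" ]; then
      echo "Giving up after $tries tries" >&2
      return 1
    fi
    tries=$((tries + 1))
    sleep "$delay"
  done
  echo "Succeeded after $tries tries"
}

# Stand-in for the cypher-shell call: fails twice, then succeeds.
ATTEMPTS=0
flaky() {
  ATTEMPTS=$((ATTEMPTS + 1))
  [ "$ATTEMPTS" -ge 3 ]
}

retry 5 0 flaky
```

In the standalone deploy, the command passed to retry would be a small function wrapping the password-change call, with a delay of a second or more between attempts.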

To delete an instance created like this, again we take note of the STACK_NAME and just use another utility script:

#!/bin/bash
export PROJECT=my-google-project-id

if [ -z "$1" ] ; then
   echo "Missing argument"
   exit 1
fi

echo "Deleting instance and firewall rules"
gcloud compute instances delete --quiet "$1" --project "$PROJECT" && gcloud compute firewall-rules --quiet delete "$1" --project "$PROJECT"
exit $?