Build a Serverless API to Push Data to Neo4j with Cloud Functions

Jason Koo
10 min read · Jan 12, 2024


Video showing how to set up a Google Cloud Function using Google’s in-line editor.

Automating data workflows between APIs and databases is easier than ever thanks to no-code automation tools like Zapier, Make.com, and n8n. Unfortunately, integrating these tools with Neo4j databases is a challenge because Neo4j doesn’t provide a default HTTP endpoint to push data to.

In this article, we’ll detail how to build a Google Cloud Function to intake data from no-code tools and write it to a Neo4j database. With a custom Cloud Function, you can leverage the power and flexibility of Neo4j in your automated workflows!

Prerequisites:

  • A hosted Neo4j database
  • Familiarity with Python and command line tools
  • Ability to create a Google Cloud Function

We’ll cover:

  1. Writing the code to parse request data and write to Neo4j
  2. Making the code accessible via a GitHub repository*
  3. Setting up a Google Source Repository that references the repo*
  4. Setting up a Google Cloud Function triggered by HTTP requests

*NOTE: Google Cloud Functions can reference code from a Google Source Repository but not GitHub/GitLab/Bitbucket directly. Steps 2 & 3 could be combined if you happen to use Source Repositories as your primary cloud-based repository.

Step 1 — The Code

For the following tutorial, I’ll be using Poetry for virtual environment and dependency management.

1a. First create a new folder named neo4j-uploader-gcf. Then cd into it from your terminal and run:

poetry init -n

This will add a pyproject.toml file to the folder using Poetry’s default options.
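Depending on your Poetry version, the generated file will look roughly like this (the name and author values below are illustrative):

[tool.poetry]
name = "neo4j-uploader-gcf"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.12"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"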

1b. Next, add Google’s Functions Framework and neo4j-uploader packages:

poetry add functions-framework neo4j-uploader

The Functions Framework is needed to run our code in Google Cloud and the neo4j-uploader is a package for uploading JSON payloads to a Neo4j Database.

In a previous article, I detailed how to use the Neo4j Python driver to parse and push .json data into a Neo4j instance. Following that process, the neo4j-uploader package is now publicly available, so we’ll make use of it.

I use Visual Studio Code, but use your IDE of choice to add an empty main.py file to the folder. The folder structure should now look like this:
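neo4j-uploader-gcf/
├── main.py
├── poetry.lock
└── pyproject.toml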

1c. Next, we can start coding up the main.py file. Add the following import statements:

from neo4j_uploader import batch_upload, InvalidPayloadError
import functions_framework
import os

The import os statement will allow us to load the Neo4j credentials from environment variables for both local testing and when deployed.

Now create a function with the decorator that Google’s Functions Framework requires to mark entry points:

@functions_framework.http
def json_to_neo4j(request):

Retrieve the Neo4j credentials using the os module:

    uri = os.environ.get('NEO4J_URI', None)
    user = os.environ.get('NEO4J_USERNAME', None)
    password = os.environ.get('NEO4J_PASSWORD', None)

Then call the neo4j-uploader’s batch_upload function within a try-except block:

    # Forward the request to the neo4j-uploader
    try:
        upload_result = batch_upload(
            config = {
                'neo4j_uri': uri,
                'neo4j_user': user,
                'neo4j_password': password
            },
            data = request.get_json(silent=True),
        )
        # Using pydantic to convert the result object to a json object
        return upload_result.model_dump(), 200, {"Content-Type": "application/json"}
    except InvalidPayloadError as e:
        # Missing or invalid JSON payload
        return f'JSON payload missing or malformed. {e}', 400
    except Exception as e:
        # Other neo4j exception or uploader error
        return f'Problem uploading: {e}', 500

If the JSON payload is missing or does not match the schema required by the batch_upload function, an InvalidPayloadError will be raised. If this occurs, return a message and an HTTP 400 response notifying the sender.

All other likely errors will be service related: invalid credentials, database connectivity problems, the database being down, etc. For these we’ll return a 500 error, as these issues aren’t the sender’s fault.

The complete code:

import functions_framework
from neo4j_uploader import batch_upload, InvalidPayloadError
import os

@functions_framework.http
def json_to_neo4j(request):

    # Validate config information is available
    uri = os.environ.get('NEO4J_URI', None)
    user = os.environ.get('NEO4J_USERNAME', None)
    password = os.environ.get('NEO4J_PASSWORD', None)

    # Forward the request to the neo4j-uploader
    try:
        upload_result = batch_upload(
            config = {
                'neo4j_uri': uri,
                'neo4j_user': user,
                'neo4j_password': password
            },
            data = request.get_json(silent=True),
        )
        return upload_result.model_dump(), 200, {"Content-Type": "application/json"}
    except InvalidPayloadError as e:
        # Missing or invalid JSON payload
        return f'JSON payload missing or malformed. {e}', 400
    except Exception as e:
        # Other neo4j exception or uploader error
        return f'Problem uploading: {e}', 500

1d. Test locally before deploying. We’ll need to emulate the environment variables that will be set up later within Google Cloud. To do this with Poetry, add them prior to calling poetry run, like so:

NEO4J_URI=<uri> \
NEO4J_USERNAME=<username> \
NEO4J_PASSWORD=<password> \
poetry run functions-framework --target=json_to_neo4j

Replace any of the bracket <> placeholders with your Neo4j credentials.
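For example, with a Neo4j Aura instance the command might look like this (the values below are placeholders, not real credentials):

NEO4J_URI=neo4j+s://abcd1234.databases.neo4j.io \
NEO4J_USERNAME=neo4j \
NEO4J_PASSWORD=my-generated-password \
poetry run functions-framework --target=json_to_neo4j

By default, the Functions Framework serves the function at http://localhost:8080.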

Once the function is running, create a file named sample.json in the root project folder with the following sample data:

{
    "nodes": [
        {
            "labels": ["Person"],
            "key": "uid",
            "records": [
                {
                    "uid": "abc",
                    "name": "John Wick"
                }
            ]
        },
        {
            "labels": ["Dog"],
            "key": "gid",
            "records": [
                {
                    "gid": "abc",
                    "name": "Daisy"
                }
            ]
        }
    ],
    "relationships": [
        {
            "type": "LOVES",
            "from_node": {
                "record_key": "_from_uid",
                "node_key": "uid",
                "node_label": "Person"
            },
            "to_node": {
                "record_key": "_to_gid",
                "node_key": "gid",
                "node_label": "Dog"
            },
            "records": [
                {
                    "_from_uid": "abc",
                    "_to_gid": "abc"
                }
            ]
        }
    ]
}

This data specifies 2 nodes and a single relationship between them, using the schema required by neo4j_uploader’s batch_upload function. See the package’s documentation for more details.

Next run the following cURL command from your terminal:

curl -X POST http://localhost:8080 -H "Content-Type: application/json" -d @sample.json

This will send a POST request to your locally running instance with the contents of the sample.json file as the payload. If successful, you should see a JSON summary of the upload results in the response.

In the Neo4j Browser, the sample data that should now be in your database will render as:

Example output from Neo4j Desktop

👏 The code works and is all set 👏

Run the following Cypher command to clear your database, so we can test the deployed code later.
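A standard command for this is the following (warning: it removes every node and relationship, so only run it against a test database):

MATCH (n) DETACH DELETE n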

Step 2 — A Repo

Google Cloud Functions needs a requirements.txt file in the root folder to define which dependencies to install. To generate it automatically with Poetry, run the following from the command line:

poetry export -f requirements.txt --output requirements.txt

Commit this file along with the rest of the project and push up to GitHub (or GitLab/Bitbucket) so it can be referenced by Google’s Source Repository service.
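For example, assuming the repo is already initialized and connected to a GitHub remote:

git add .
git commit -m "Add Cloud Function code and requirements"
git push origin main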

NOTE: A repo could be created directly in Google’s Source Repositories like any other cloud-based repository, but that requires some initial setup. Most devs already have an active GitHub account, so we’ll continue with this approach.

Step 3 — Google Source Repository

A new repository can be set up using either the gcloud CLI or the Cloud Console dashboard. I prefer the dashboard UI for one-off setups, though there are a lot of configuration pages and clicking to work through with this route.

3a. After logging into the dashboard for the first time, click on the Get started button.

3b. A dialog will appear; click on the Create repository button.

3c. Since our code is in a GitHub/GitLab repo, select the Connect external repository option.

3d. Select Create project (or select a previously created Cloud project).

3e. Give your new project an easily identifiable name, then click CREATE.

3f. Once the creation process is complete, return to the earlier tab from 3c (https://source.cloud.google.com/repo/connect), reload the page, and select the project just created.

If this is the first time setting up a Google Cloud Function, a billing account will need to be set up. Click on the Setup Billing Account link and fill out the sequence of forms and credit card info.

3g. Once complete, return to this page (again) and select GitHub in the Git provider* field. Connect your GitHub account and select the repo you created in Step 2.

Both private and public repos should be referenceable. Once you’ve selected the correct repo, click on the Connect selected repository button.

Once the setup process is complete, the following dialog will appear with the name of your new source repository:

Now we’re ready to set up the actual cloud function.

Step 4 — Google Cloud Function

4a. From the Cloud Console, click on the upper-left hamburger menu, scroll down, and select Cloud Functions.

4b. In the Cloud Functions dashboard, click on the CREATE FUNCTION button.

4c. Select 1st gen for the Environment type, use any name for the Function name, and for now select Allow unauthenticated invocations under the Authentication radio options. Then click SAVE.

This will allow you to test and run the Cloud Function from the dashboard once we’re done. There’s still an extra permissions step before an external service can call this endpoint.

4d. To add the required Neo4j credentials as environment variables, expand the Runtime, build, connections and security section, scroll to the bottom, and click ADD VARIABLE under the Runtime environment variables sub-section.

4e. Add your Neo4j credentials then click NEXT.

Non-working sample credentials in the Google Cloud Function configuration

4f. Configure the cloud function runtime by selecting Python 3.12 under Runtime and Cloud Source repository from the Source code dropdown.

NOTE: Steps 1d–3 could be skipped by selecting the Inline Editor option here and just pasting in the complete code from Step 1c. In practice I avoid this method because the editor is not a full IDE, you lose versioning, and testing development code this way is slower.

4g. Now fill in the source repository info. Use the name of the function you created in Step 1c in the Entry point field. Select the project created in Step 3e in the Project ID field.

The Branch name field defaults to the old Git default of master, but since October 1st, 2020, new repositories default to main. Change this to match the branch containing the code you pushed in Step 2.

The TEST FUNCTION button unfortunately only runs for inline code. Since we already tested locally, we can skip this part and click on the DEPLOY button at the bottom of this configuration page.

After a minute, if the deployment is successful, a green check mark will appear next to the function name.

4h. To test the deployment, go to the TESTING section and paste in the same JSON payload from the sample.json file used in Step 1d. Click on TEST THE FUNCTION to trigger the test command.

If successful, the following output and logs should display.

The same nodes and relationships should also appear in your Neo4j browser or workspace.

Example output from Neo4j Aura Workspace

To find the URL of your new Cloud Function endpoint, select the TRIGGER tab from your function details. The Trigger URL is the new endpoint.

4i. Despite having selected Allow unauthenticated invocations back in Step 4c, any POST call from an outside service will still receive a 403 Forbidden error in response.

To permit access, an authorization token needs to be generated and used by any service POSTing to the endpoint. The official Google Cloud docs detail how to set this up.
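For a quick authenticated test from your own terminal, you can attach an identity token generated by the gcloud CLI (this assumes your account has invoker permissions on the function, and <trigger-url> is a placeholder for the URL from the TRIGGER tab):

curl -X POST <trigger-url> \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -H "Content-Type: application/json" \
  -d @sample.json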

If you want to go without authentication just to test, select the function’s PERMISSIONS tab and click on GRANT ACCESS.

Add allUsers to the New principals field, and select Cloud Functions Invoker in the Role field.
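If you prefer the CLI, the equivalent IAM binding can be added like so (a sketch; swap in your own function name and region):

gcloud functions add-iam-policy-binding json-to-neo4j \
  --region=us-central1 \
  --member=allUsers \
  --role=roles/cloudfunctions.invoker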

!!! WARNING !!!

Your cloud function can now be triggered by any service with your endpoint URL. Once you’re done testing, I highly recommend removing the allUsers permission and implementing one of Google’s prescribed Authenticate for invocation methods.

Maintenance & Modifications

The function will need to be manually edited and redeployed for any new code commits to take effect. To do this, select the Cloud Function, click on the EDIT button, then NEXT, then DEPLOY, just like when first setting up the function (but without needing to actually modify any configuration options). Deploying again this way will prompt the latest commit to be used.

To automatically update the function when a new commit is pushed to the target branch, a cloudbuild.yaml file will need to be added and a Cloud Build trigger created. See Google’s official docs on implementing this.
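As a rough sketch, a minimal cloudbuild.yaml for a 1st gen function might look like the following (the function name, region, and runtime are assumptions to adjust for your setup):

steps:
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    args:
      - gcloud
      - functions
      - deploy
      - json-to-neo4j
      - --region=us-central1
      - --runtime=python312
      - --trigger-http
      - --entry-point=json_to_neo4j
      - --source=.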

What’s Next

With the Google Cloud Function created, you can now automate data workflows from no-code tools directly to your Neo4j database.

To complete the integration, you’ll need to configure the data mapping in your no-code tool like Zapier or Make.com. Source data structures need to be converted to match the schema expected by the neo4j_uploader package.
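As a rough illustration, a mapping step for a hypothetical flat list of contact records might look like this (the incoming field names are invented for the example):

def contacts_to_uploader_payload(contacts: list[dict]) -> dict:
    # Convert flat records into the nodes/relationships schema
    # expected by neo4j_uploader's batch_upload function
    return {
        "nodes": [
            {
                "labels": ["Person"],
                "key": "uid",
                "records": [
                    {"uid": c["id"], "name": c["full_name"]}
                    for c in contacts
                ],
            }
        ],
        "relationships": [],
    }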

In upcoming articles I’ll demonstrate the data mapping portion with some examples using these automation services.


Jason Koo

Developer Advocate at Neo4j, technophile and former iOS Developer.