How We Built a Serverless Backend Using GraalVM, AWS Lambda and Astra DB (Part 1)


When the pandemic started, we set ourselves a learning goal: develop a backend using only serverless technologies. Initially, we set out to make this happen using technologies we were already familiar with, AWS Lambda and Java. But to spice things up, we decided to add some new technologies to the mix: GraalVM to eliminate the cold start problem, and Astra DB as our serverless DBaaS.

So, we spent a couple of hours per week building a serverless order processing API using Astra DB and AWS. Due to the pandemic, you could call it a distributed “hackathon” of sorts, in which we had three main challenges:

  • Access Astra DB from within AWS Lambda
  • Write automatic tests for our Astra DB client
  • Set up the Lambda function to use the GraalVM native image runtime

In this first post, we will walk through the first two challenges and the technologies that helped us on the way, mainly Stargate and Testcontainers. In the second post, we will dive into how we put our serverless API in the cloud using AWS API Gateway, AWS Lambda and GraalVM. First, let’s take a look at the high-level architecture.

Architecture

To give you a better understanding of what we were going for, here’s a (rather simple) overview of the target architecture.

Figure 1: Illustration of the target architecture for this project.

Our end user accesses the API through an AWS API Gateway, which is wired to our AWS Lambda function. The Lambda function in turn accesses the Astra DB Document API, which is internally provided by Stargate.

AWS API Gateway is a fully managed API service to create, publish, maintain, monitor, and secure APIs at any scale. Those APIs can be connected to a large number of different backend services.

AWS Lambda offers managed functions as a service (FaaS) based on micro virtual machines. To create a Lambda function, you provide the code to execute, e.g. a Python script or a Jar file. The function can be invoked on demand based on a variety of triggers.

Astra DB is a multi-cloud database-as-a-service (DBaaS) based on Apache Cassandra™ that eliminates the overhead of installing, operating, and scaling your own database installation. Essentially, Astra DB helps developers reduce deployment time, costs, and nightmares. Astra DB also equips you with a few data APIs to build applications faster, which leads us to our next big player: Stargate.

Stargate is an open source data gateway and the official data API for Astra DB. In short, it allows developers to connect to all their data with the APIs and tools they are used to. You can create tables and schemas and query data without learning Cassandra Query Language (CQL).

Now let’s take a closer look at the two goals we set ourselves for part one of this series.

Goals

Access Astra DB from Java

First of all, we had to figure out how to access Astra DB from within AWS Lambda with minimal dependencies. Lambda functions should start as quickly as possible, and we wanted to avoid bloating our deployment package with unnecessary dependencies.

Additionally, Lambda functions should be stateless, given that the runtime can be paused or frozen without notice for a longer period of time, or even destroyed completely. A warm Java runtime can keep in-memory state between executions, but this behavior should not be counted on. To keep things simple, we accessed the Document API via an Apache HTTP client.

Another problem with AWS Lambda is that you cannot easily perform database migrations. You have limited control over when your function is executed and how many instances are created. Also, if you migrate the schema on start, whoever uses your API for the first time has to wait for the schema migration to finish. This is why using the Document API, which doesn’t require specifying a schema upfront, was our best bet for accessing Astra DB from AWS Lambda.

Test Astra DB client locally

Having the Java code to access Astra DB is great, but how do we test it without spinning up an entire Cassandra cluster along with Stargate? Luckily, Stargate offers a developer mode, in which the Stargate node behaves as a regular Cassandra node and joins the ring with tokens assigned, so you can get started quickly without additional nodes or an existing cluster.

We can start a local Stargate node for our automated tests using Testcontainers. For the unfamiliar, Testcontainers is a Java library that provides lightweight, throwaway instances of common databases or anything else that can run in a Docker container. This makes it easier to run tests for things like data access layer integration, app integration, UI/acceptance, and more.

Getting into the code

The main functionality of our fictional API is to manage orders for an online shop. We need to save and retrieve orders. The class AstraClient encapsulates this functionality in the methods saveOrder and getOrder, respectively. Those methods interact with the Document API.

To access our orders collection, we need to pass the Astra DB base URL, the access credentials, as well as the namespace (also known as “keyspace” in the Cassandra realm).
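As a rough sketch of how these three inputs could fit together, the class below stores them and derives the URL of a collection. The class name AstraSettings and the local example URL are hypothetical, and the path layout /v2/namespaces/{namespace}/collections/{collection} is assumed from the Stargate Document API v2; this is not the project's actual code.

```java
import java.net.URI;
import java.util.Objects;

/** Hypothetical holder for the inputs a Document API client needs. */
final class AstraSettings {
    private final String baseUrl;   // REST endpoint of the database, without trailing slash
    private final String token;     // credential sent with every request
    private final String namespace; // "keyspace" in Cassandra terms

    AstraSettings(String baseUrl, String token, String namespace) {
        String url = Objects.requireNonNull(baseUrl, "baseUrl");
        // Strip a trailing slash so path concatenation stays predictable.
        this.baseUrl = url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
        this.token = Objects.requireNonNull(token, "token");
        this.namespace = Objects.requireNonNull(namespace, "namespace");
    }

    /** URL of a collection inside the configured namespace (assumed v2 path layout). */
    URI collectionUri(String collection) {
        return URI.create(baseUrl + "/v2/namespaces/" + namespace
                + "/collections/" + collection);
    }

    String token() {
        return token;
    }
}
```

Keeping URL construction in one place means the same client code can point at Astra DB in production and at a local Stargate node in tests, simply by swapping the base URL.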

Next, we implement a simple test case that saves and then retrieves an order. For this, we create a new test class AstraClientTest annotated with @Testcontainers for the Testcontainers framework to manage the @Container lifecycle. We also implement a small test extension that manages namespace and token creation and provides our test class with an AstraClient instance.

Now, let’s dive into the Stargate container definition. We start it in developer mode to act as a DB node. We also use the simple snitch, since we do not need particularly sophisticated snitch functionality.

By default, Stargate starts a CQL service on port 9042, a REST auth service for generating tokens on 8081, and an HTTP interface on port 8082. Since we used the Document API, we do not need to expose the CQL port.
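A container definition along these lines might look like the following sketch. It assumes Testcontainers and its JUnit 5 integration are on the test classpath and that Docker is available. The image tag, the cluster name, and the class name are assumptions for illustration, and the environment variable names follow Stargate's Docker instructions as we understand them, so verify them against the Stargate release you use.

```java
import java.time.Duration;

import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.wait.strategy.Wait;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;

@Testcontainers
class StargateContainerSketch {

    // Assumed image tag; pin whichever Stargate release you test against.
    private static final DockerImageName IMAGE =
            DockerImageName.parse("stargateio/stargate-3_11:v1.0.0");

    // A static @Container field is started once and shared by all tests in the class.
    @Container
    static final GenericContainer<?> STARGATE = new GenericContainer<>(IMAGE)
            .withEnv("CLUSTER_NAME", "test")      // hypothetical cluster name
            .withEnv("CLUSTER_VERSION", "3.11")
            .withEnv("DEVELOPER_MODE", "true")    // node acts as its own Cassandra node
            .withEnv("SIMPLE_SNITCH", "true")
            .withExposedPorts(8081, 8082)         // auth and HTTP APIs; CQL port 9042 is not exposed
            .waitingFor(Wait.forListeningPort())
            .withStartupTimeout(Duration.ofMinutes(5));

    /** Base URL of the HTTP API, using the random host port Docker mapped to 8082. */
    static String restBaseUrl() {
        return "http://" + STARGATE.getHost() + ":" + STARGATE.getMappedPort(8082);
    }

    /** Base URL of the auth service, using the random host port Docker mapped to 8081. */
    static String authBaseUrl() {
        return "http://" + STARGATE.getHost() + ":" + STARGATE.getMappedPort(8081);
    }
}
```

Testcontainers maps each exposed container port to a random free port on the host, which is why the helper methods ask the container for the mapped port instead of hard-coding 8081 or 8082.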

Next, we implement a test method that persists and subsequently retrieves an order in shouldPersistAndRetrieveOrder. Our test extension generates a client that points to our Stargate container and has working credentials. We then use that to call saveOrder and getOrder in succession, validating that the retrieved order matches the originally stored one.

Before we dig into the details of the test extension, let’s cover the missing AstraClient functionality. To save and retrieve orders, we need a data class containing order data (let’s call it Order). To persist an order, we submit an HTTP POST request to the orders collection endpoint with the order JSON as payload. The response object contains the newly created document ID which we can use as our order ID.
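The save step could be sketched as follows. To keep the example free of external dependencies it uses the JDK's java.net.http types instead of the Apache HTTP client mentioned earlier, and a regular expression instead of a JSON library. The class and method names are hypothetical, and the X-Cassandra-Token header and documentId response field are assumed from the Stargate Document API.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SaveOrderSketch {

    // Matches "documentId":"<id>" in the response body (assumed field name).
    private static final Pattern DOCUMENT_ID =
            Pattern.compile("\"documentId\"\\s*:\\s*\"([^\"]+)\"");

    /** Builds the POST request that creates a new document in the collection. */
    static HttpRequest buildSaveRequest(URI collectionUri, String token, String orderJson) {
        return HttpRequest.newBuilder(collectionUri)
                .header("X-Cassandra-Token", token)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(orderJson))
                .build();
    }

    /** Pulls the generated document ID out of the response body, if present. */
    static Optional<String> extractDocumentId(String responseBody) {
        Matcher matcher = DOCUMENT_ID.matcher(responseBody);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the returned body passed to extractDocumentId to obtain the order ID.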

To retrieve an order, we submit an HTTP GET request to the document ID resource inside the orders collection. Our order will be wrapped inside a JSON object that contains the actual order in the data field. We model this wrapper in the OrderDocument class. The getOrder method returns an Optional<Order> which is empty in case the order doesn’t exist.
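The response handling for retrieval might look roughly like this. The sketch assumes the API answers 404 for a missing document and 200 with a wrapper object otherwise. It returns the raw JSON of the data field rather than a mapped Order, and the brace-matching helper is a deliberately simple stand-in for a real JSON library (it assumes the first "data" key in the body is the wrapper field and that its value is an object). All names are hypothetical.

```java
import java.util.Optional;

final class GetOrderSketch {

    /** Maps a status code and body to the raw order JSON, or empty if the order is missing. */
    static Optional<String> orderJson(int statusCode, String body) {
        if (statusCode == 404) {
            return Optional.empty(); // assumed: unknown document IDs yield 404
        }
        if (statusCode != 200) {
            throw new IllegalStateException("Unexpected status: " + statusCode);
        }
        return extractObjectField(body, "data");
    }

    /** Returns the JSON object stored under the given key by matching braces. */
    static Optional<String> extractObjectField(String json, String key) {
        int keyPos = json.indexOf("\"" + key + "\"");
        if (keyPos < 0) {
            return Optional.empty();
        }
        int start = json.indexOf('{', keyPos);
        if (start < 0) {
            return Optional.empty();
        }
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return Optional.of(json.substring(start, i + 1));
            }
        }
        return Optional.empty(); // unbalanced braces
    }
}
```

In a real client the extracted JSON would be deserialized into the Order class, so callers receive an Optional<Order> and never see the wrapper.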

At this point we can run our test and validate that the implemented functionality meets expectations. Now let’s take a look at the test extension. The following listing presents an outline of the class — it implements the BeforeEachCallback interface which tells JUnit to execute the beforeEach method before each test execution.

In beforeEach we first generate an auth token. To do this, we post the username and password to the Stargate auth endpoint via HTTP, which returns the auth token.
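A minimal sketch of the token request is shown below, again using the JDK's java.net.http types rather than the Apache HTTP client. The /v1/auth path and the authToken response field are assumed from the Stargate auth API, the class and method names are hypothetical, and the JSON body is built by plain concatenation, which is only safe for credentials without special characters.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AuthTokenSketch {

    // Matches "authToken":"<token>" in the response body (assumed field name).
    private static final Pattern AUTH_TOKEN =
            Pattern.compile("\"authToken\"\\s*:\\s*\"([^\"]+)\"");

    /** Builds the POST request that exchanges username and password for a token. */
    static HttpRequest buildAuthRequest(String host, int authPort,
                                        String username, String password) {
        // Naive JSON construction: no escaping of quotes or backslashes.
        String body = "{\"username\":\"" + username + "\",\"password\":\"" + password + "\"}";
        return HttpRequest.newBuilder(URI.create("http://" + host + ":" + authPort + "/v1/auth"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** Extracts the token from the auth response, if present. */
    static Optional<String> extractAuthToken(String responseBody) {
        Matcher matcher = AUTH_TOKEN.matcher(responseBody);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

In a test against a local container, the host and port would come from the container's mapped auth port (8081 inside the container), and the credentials would be whatever the Stargate node was started with.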

After passing the auth token to our AstraClient, we ensure the namespace (aka keyspace in Cassandra) exists. The production code assumes that the namespace exists, since we create it as part of our infrastructure provisioning code. In the test case we simply create the namespace via HTTP.
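Creating the namespace in the test setup could be sketched like this. The /v2/schemas/namespaces path, the name field, and the X-Cassandra-Token header are assumed from the Stargate schema API; the class and method names are hypothetical, and the JDK HTTP types stand in for the Apache HTTP client.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class NamespaceSketch {

    /** Builds the POST request that creates a namespace on a local Stargate node. */
    static HttpRequest buildCreateNamespaceRequest(String baseUrl, String token,
                                                   String namespace) {
        // Naive JSON construction; fine for simple identifiers such as "shop".
        String body = "{\"name\":\"" + namespace + "\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v2/schemas/namespaces"))
                .header("X-Cassandra-Token", token)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** Treats any 2xx status as success. */
    static boolean isSuccess(int statusCode) {
        return statusCode / 100 == 2;
    }
}
```

Because the extension runs before every test, the setup should tolerate the namespace already existing; how the API reports that case is worth checking against the Stargate version in use.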

With that we conclude the code for this part of the “hackathon.” So far we’ve successfully covered our first two goals: we implemented an AstraClient that uses the Astra DB Document API to store and retrieve orders. Then we tested our code using a custom JUnit 5 test extension along with the Testcontainers framework.

What’s next?

In the second part of this series we will show you how we implemented an AWS Lambda handler that accepts HTTP requests from AWS API Gateway, transforms them into Astra DB requests using our AstraClient class, and returns a response to the user. The handler is written to run in a GraalVM native runtime which minimizes those pesky cold start issues we always bumped into with the default Java runtime.

Stay tuned for the next post to continue our tour of the technologies, challenges, and workarounds involved in getting our serverless API into production!

In the meantime, you can poke around the source code for this project on GitHub.
