Optimising Cloudant for serverless

Getting the best performance out of Cloudant when using it with a Functions-as-a-service platform such as IBM Cloud Functions

IBM Cloudant is a great fit for serverless applications — build a JSON document in the schema of your choice and post it to a Cloudant database using the HTTP API. Both IBM Cloud Functions and IBM Cloudant can scale to deal with your application’s workload without you worrying about operating systems, machine reboots, queues, load balancing, networking etc.

Building a serverless application makes you think differently about how to optimise for performance:

  • each incoming event is handled separately, so you cannot minimise HTTP requests to the database by bundling multiple requests into a single bulk operation.
  • slow performance of your action results in increased bills because you pay per millisecond of execution time.
  • without using the Composer tooling to allow state to be retained between actions in sequences, each action starts with no state other than the incoming data and any parameters that were configured at deploy-time.

Let’s take a look at a very simple Cloud Function that writes some data to Cloudant.

How Cloud Functions talks to Cloudant

We’ll use a Node.js runtime with the latest version of the official Cloudant Node.js library.

In a blank directory we can create a new npm project:

Create a new JavaScript project by answering a few questions

and install the library

Install the official Cloudant library

Then we write our own code into a index.js file:

A first stab at a serverless function that writes to Cloudant

The above Node.js action simply writes a document to Cloudant when it is invoked. It is deployed with:

Command-line deployment

The credentials of your Cloudant service and database name are baked-in to the action as parameters — you’ll need to replace "https://USER:PASS@HOST.cloudant.com" with your own Cloudant service's admin credentials if you want to run this yourself.

The action is invoked from the command-line with:

The Cloudant library translates your call to db.insert into an HTTP POST passing a new JSON object to Cloudant. You should see the id and rev of the newly created document output in the terminal.

What’s happening under the hood?

When our action is invoked the Cloudant library is initialized and has some work to do:

  • do a DNS lookup to translate your Cloudant hostname into an IP address.
  • make a secure HTTPS connection to the Cloudant service which involves performing a TLS handshake
  • authenticate against the Cloudant service by exchanging your username & password for a session cookie.

The library handles all of this for us but nevertheless, it is an overhead. Our action is making two HTTP requests in series for each invocation of the cloud function.

You should be concerned about this because you are being charged for the execution time of each invocation!

Re-using the connection

To avoid making an authentication request each time your action is invoked, we can make use of some inside knowledge of how Cloud Functions works. When you deploy your action, the Cloud Functions platform turns your code into a container. The platform re-uses the same container again and again when invocations happen often, retaining the code’s global variable space. We can use this to our advantage to reuse the Cloudant connection.

If we store our Cloudant object in a global variable (as opposed to the local variable we are using now), our second and subsequent invocations will be able to re-use its data, which includes the authentication cookie. Our code now looks like this:

Optimised version of the code, using a global variable to store “db”, the Cloudant database object
  • there is a global variable called db which is to hold the Cloudant library object
  • when the main function is called for the first time, the db object is created with the Cloudant configuration. An if statement ensures that the db object is only created once.
  • on the first write to the database, the Cloudant library will first exchange its credentials for a cookie, storing it in the db object.
  • subsequent invocations that reuse the same container will re-use the cookie inside db, and if consequitive innvocations are close enough in time, they will also reuse the same HTTP connection, because the socket is kept alive.

The sample principle can be applied to code using the iam plugin, IBM Cloud’s Identity and Access Management system for authentication.

I thought global variables were bad?

They are in general, but in this case they allow state to be retained between invocations of our Cloud Function, making our application faster and saving us money!