Optimising Cloudant for serverless
Getting the best performance out of Cloudant when using it with a Functions-as-a-service platform such as IBM Cloud Functions
IBM Cloudant is a great fit for serverless applications — build a JSON document in the schema of your choice and post it to a Cloudant database using the HTTP API. Both IBM Cloud Functions and IBM Cloudant can scale to deal with your application’s workload without you worrying about operating systems, machine reboots, queues, load balancing, networking etc.
Building a serverless application makes you think differently about how to optimise for performance:
- each incoming event is handled separately, so you cannot minimise HTTP requests to the database by bundling multiple requests into a single bulk operation.
- slow performance of your action results in increased bills because you pay per millisecond of execution time.
- without using the Composer tooling to allow state to be retained between actions in sequences, each action starts with no state other than the incoming data and any parameters that were configured at deploy-time.
Let’s take a look at a very simple Cloud Function that writes some data to Cloudant.
How Cloud Functions talks to Cloudant
We’ll use a Node.js runtime with the latest version of the official Cloudant Node.js library.
In a blank directory we can create a new npm project with npm init and then install the library with npm install.
Then we write our own code into a JavaScript file:
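A minimal sketch of such an action follows. It assumes the @cloudant/cloudant package, a recent version in which insert returns a promise, and parameters named url and dbname; those names are assumptions, as is the document being written. The loadCloudant argument is a hypothetical hook that lets the sketch be exercised without a live Cloudant service; Cloud Functions itself calls main with the parameters only.

```javascript
// Sketch only: assumes the @cloudant/cloudant package is installed and that
// the action receives `url` and `dbname` parameters (assumed names).
// `loadCloudant` is a hypothetical hook so the library can be stubbed;
// Cloud Functions calls main(params) with a single argument.
function main(params, loadCloudant = () => require('@cloudant/cloudant')) {
  const Cloudant = loadCloudant();

  // Local variables: a brand new client is built on every invocation.
  const cloudant = Cloudant({ url: params.url });
  const db = cloudant.db.use(params.dbname);

  // With no callback supplied, insert() returns a promise. The platform
  // waits for it and uses the resolved value as the action's result.
  return db.insert({
    source: 'cloud-function',
    createdAt: new Date().toISOString()
  });
}

exports.main = main;
```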
The above Node.js action simply writes a document to Cloudant when it is invoked. It is deployed with:
The credentials of your Cloudant service and the database name are baked into the action as parameters. You’ll need to replace "https://USER:PASS@HOST.cloudant.com" with your own Cloudant service’s admin credentials if you want to run this yourself.
The action is invoked from the command-line with:
The Cloudant library translates your call to db.insert into an HTTP POST passing a new JSON object to Cloudant. You should see the rev of the newly created document output in the terminal.
What’s happening under the hood?
When our action is invoked the Cloudant library is initialized and has some work to do:
- do a DNS lookup to translate your Cloudant hostname into an IP address.
- make a secure HTTPS connection to the Cloudant service, which involves performing a TLS handshake.
- authenticate against the Cloudant service by exchanging your username and password for a session cookie.
The library handles all of this for us, but it is still an overhead: our action makes two HTTP requests in series (the authentication request, then the document write) for each invocation of the cloud function.
You should be concerned about this because you are being charged for the execution time of each invocation!
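To make the two requests concrete, the sketch below writes them out as plain data without sending anything. It assumes cookie authentication through the POST /_session endpoint with form-encoded name and password fields; the helper name describeColdStartRequests, the database name mydb and the sample document are illustrative only.

```javascript
// Sketch only: describes, but does not send, the two requests made when the
// library starts with no session cookie. Names here are illustrative.
function describeColdStartRequests(serviceUrl, dbname, doc) {
  const u = new URL(serviceUrl);
  const credentials = new URLSearchParams({
    name: decodeURIComponent(u.username),
    password: decodeURIComponent(u.password)
  });

  return [
    {
      // 1. Exchange the username and password for a session cookie.
      method: 'POST',
      host: u.host,
      path: '/_session',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: credentials.toString()
    },
    {
      // 2. Write the document, presenting the cookie obtained in step 1.
      method: 'POST',
      host: u.host,
      path: '/' + encodeURIComponent(dbname),
      headers: {
        'Content-Type': 'application/json',
        Cookie: 'AuthSession=<value returned by step 1>'
      },
      body: JSON.stringify(doc)
    }
  ];
}

console.log(describeColdStartRequests('https://USER:PASS@HOST.cloudant.com', 'mydb', { a: 1 }));
```

The second request cannot start until the first has returned its cookie, which is why the two run in series rather than in parallel.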
Re-using the connection
To avoid making an authentication request each time your action is invoked, we can make use of some inside knowledge of how Cloud Functions works. When you deploy your action, the Cloud Functions platform turns your code into a container. The platform re-uses the same container again and again when invocations happen often, retaining the code’s global variable space. We can use this to our advantage to reuse the Cloudant connection.
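The container re-use described above can be observed with a tiny action that counts its own invocations. This is an illustrative sketch, not part of the original example; the names invocationCount and warm are made up for the demonstration.

```javascript
// Module-scope state: initialised once per container, not once per invocation.
let invocationCount = 0;

function main(params) {
  invocationCount += 1;
  return {
    invocationCount,
    // true when this container has already served an earlier invocation
    warm: invocationCount > 1
  };
}
```

Invoked repeatedly in quick succession, such an action would be expected to report a rising count; after a period of inactivity, or when the platform starts an additional container, the count starts again from one.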
If we store our Cloudant object in a global variable (as opposed to the local variable we are using now), our second and subsequent invocations will be able to re-use its data, which includes the authentication cookie. Our code now looks like this:
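A minimal sketch of the global-variable version, under the same assumptions as before: the @cloudant/cloudant package, parameters named url and dbname, and an illustrative document. The loadCloudant argument is again a hypothetical hook for running the sketch without a live service.

```javascript
// Sketch only: assumes @cloudant/cloudant and `url`/`dbname` parameters.
// `loadCloudant` is a hypothetical hook so the library can be stubbed.

// Global: survives between invocations while the container stays warm.
let db = null;

function main(params, loadCloudant = () => require('@cloudant/cloudant')) {
  // Build the client only on the first invocation in this container.
  if (!db) {
    const Cloudant = loadCloudant();
    db = Cloudant({ url: params.url }).db.use(params.dbname);
  }

  // Warm invocations reuse `db`, and with it the session cookie it holds.
  return db.insert({
    source: 'cloud-function',
    createdAt: new Date().toISOString()
  });
}

exports.main = main;
```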
- there is a global variable called db which is to hold the Cloudant library object.
- when the main function is called for the first time, the db object is created with the Cloudant configuration. An if statement ensures that the db object is only created once.
- on the first write to the database, the Cloudant library will first exchange its credentials for a cookie, storing it in the db object.
- subsequent invocations that reuse the same container will re-use the cookie inside db, and if consecutive invocations are close enough in time, they will also reuse the same HTTP connection, because the socket is kept alive.
The same principle can be applied to code using the iam plugin, which authenticates through IBM Cloud’s Identity and Access Management (IAM) system.
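With IAM, only the configuration handed to the library changes; the global-variable pattern stays the same. The sketch below assumes the iamauth plugin of the @cloudant/cloudant package and hypothetical parameters named url and iamApiKey; the helper name iamConfig is also made up for illustration.

```javascript
// Sketch only: builds the options object that would be passed to the
// Cloudant constructor in place of a URL with embedded credentials.
// Assumes the `iamauth` plugin; `url` and `iamApiKey` are assumed parameter names.
function iamConfig(params) {
  return {
    url: params.url, // service URL without a username or password
    plugins: { iamauth: { iamApiKey: params.iamApiKey } }
  };
}

console.log(iamConfig({ url: 'https://HOST.cloudant.com', iamApiKey: 'API_KEY' }));
```

As with the cookie, any credential obtained by the plugin lives inside the library object, so keeping that object in a global variable lets warm invocations skip the exchange.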
I thought global variables were bad?
They are in general, but in this case they allow state to be retained between invocations of our Cloud Function, making our application faster and saving us money!