Building a REST API with Google Cloud Functions
As part of my work at Brandwatch, we have been evaluating Google’s serverless cloud product, Google Cloud Functions. The bulk of the testing has been with internal tooling, but more recently it has been used on a small scale in production for minor components of customer-facing applications.
I’m an advocate for serverless platforms like Lambda, Azure Functions, and Cloud Functions. There are a large number of compute use cases at a company like Brandwatch where there are clear advantages in having a fully managed platform that provides isolated function deployments with a pay-per-invocation pricing model.
More generally, I’m excited about serverless (functions as a service specifically) as an extension of certain cloud native principles: moving the unit of compute, deployment, and cost to smaller and smaller components, from bare metal, to VMs, to containers, and now to functions.
Google Cloud Functions (GCF) is raw compared to the industry leader AWS Lambda. However, Brandwatch are a customer of Google Cloud, and as such some of our infrastructure sits in their data centres, and we have an interest in evaluating the potential of GCF as a low friction FaaS option.
My team has been working on a tightly scoped internal API, which seemed like a good candidate for experimenting with GCF. We’d initially planned to write an ExpressJS application and run it on App Engine, so moving to GCF (which heavily leverages the Express APIs internally) was a low-friction decision from a code point of view. Here’s what we learnt.
GCF is badly missing an HTTP gateway layer. Google has a product called Cloud Endpoints which can sit in front of raw compute instances, App Engine, or Kubernetes Engine; but it doesn’t yet integrate with Cloud Functions. It’s an obvious fit for it to do so, and I assume Google are planning something in this area.
The key aspect of this will be integration. GCF currently makes it very easy to deploy a function exposed on an HTTP endpoint, and it would be a shame for that ease of use to get tangled up with Endpoints/Swagger configuration. This complexity in Lambda is one of the reasons frameworks like Serverless and .architect exist — and it would be nice to see Google solve that with a strong integration between Endpoints and Cloud Functions.
Due to the lack of a gateway layer, routing for HTTP triggers in GCF is quite blunt. You can’t route by HTTP method, or by URL path, meaning that you end up almost completely reimplementing routing inside your functions.
It also means that to design a RESTful style API you end up implementing multiple operations inside a single cloud function, which somewhat breaks the principle of small deployable units. For example, our design for the API endpoints initially looked like
The best routing match that GCF offers for this collection of endpoints is `/objects`, which means using a single cloud function deployment to match those four operations. We may as well have deployed an Express app.
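To make that cost concrete, here is a minimal sketch of what reimplementing method routing inside a single HTTP function can look like. It relies only on the Express-style `req` and `res` objects that GCF passes to HTTP functions. The operations, the in-memory `store`, and the assumption that `req.path` holds whatever follows the function name are illustrative, not our actual API.

```javascript
// Hypothetical in-memory store standing in for a real database.
const store = new Map();
let nextId = 1;

// Illustrative handlers keyed by HTTP method (not the real API's operations).
const routes = {
  GET(req, res) {
    // Assumes req.path is the part of the URL after the function name,
    // e.g. "/42" for a request to /objects/42.
    const id = req.path.split('/').filter(Boolean)[0];
    if (!id) return res.status(200).json(Array.from(store.values()));
    if (!store.has(id)) return res.status(404).json({ error: 'not found' });
    return res.status(200).json(store.get(id));
  },
  POST(req, res) {
    const id = String(nextId++);
    const object = Object.assign({}, req.body, { id });
    store.set(id, object);
    return res.status(201).json(object);
  },
};

// The single entry point that would be deployed as the `objects` function.
// In a real deployment it would be exported, e.g. exports.objects = objects;
function objects(req, res) {
  const known = Object.prototype.hasOwnProperty.call(routes, req.method);
  if (!known) return res.status(405).json({ error: 'method not allowed' });
  return routes[req.method](req, res);
}
```

Every extra method or sub-path adds another branch to this one deployment, which is exactly the routing work a gateway layer would otherwise take off our hands.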
In fact, for early iterations I did end up requiring ExpressJS into the function to handle method/path routing. Eventually we decided that having a smaller, discrete function per operation was a more important aspect of the design than the RESTful properties of the API. The endpoints now look like…
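With one function per operation, each function still has to reject the HTTP methods it doesn’t serve, because the platform won’t do that for you. A small wrapper can keep that check out of the business logic. This is only a sketch: `allowMethod`, `listObjects`, and `createObject` are hypothetical names, not the functions we deployed.

```javascript
// Wrap a handler so it only answers one HTTP method; anything else gets a 405.
function allowMethod(method, handler) {
  return function guarded(req, res) {
    if (req.method !== method) {
      // Tell the client which method this function does accept.
      res.set('Allow', method);
      return res.status(405).send('Method Not Allowed');
    }
    return handler(req, res);
  };
}

// Hypothetical per-operation functions, each of which would be deployed separately.
const listObjects = allowMethod('GET', (req, res) => res.status(200).json([]));
const createObject = allowMethod('POST', (req, res) => res.status(201).json(req.body));
```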
This is far from ideal, but I’m happy with the trade-off at this point, and hope that any future Cloud Endpoints integration solves this problem.
GCF provides no native mechanism for easily loading externally managed key/values into a function. This is something that is also surely on the roadmap, as this pattern is extremely well established, particularly in the PaaS ecosystem (twelve-factor apps, etc).
It’s possible to work around this, of course, by reading different files off disk for different environments. But implementing this adds unnecessary complexity to our own application code, when the feature should be built in at the platform level.
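For illustration, the file-based workaround might look something like the sketch below. The `<env>.json` file layout, the `loadConfig` name, and the way the environment name reaches the function are all assumptions made for the example, not something GCF provides.

```javascript
const fs = require('fs');
const path = require('path');

// Cache parsed config so warm invocations don't re-read the file.
const cache = new Map();

// Load <dir>/<env>.json, e.g. config/staging.json. How the function learns
// which environment it is running in is left to the caller (an assumption).
function loadConfig(env, dir) {
  const file = path.join(dir, `${env}.json`);
  if (!cache.has(file)) {
    cache.set(file, JSON.parse(fs.readFileSync(file, 'utf8')));
  }
  return cache.get(file);
}
```

Even this small helper means shipping every environment’s settings inside the deployed source, which is part of why it feels like the wrong layer for the problem.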
Another feature required here is the ability to load in secrets (e.g. secure tokens/keys) from a location outside of the deployed source code.
Internal HTTP triggers
I would like to be able to trigger HTTP functions without going through an external load balancer. I think this is desirable mostly from a performance point of view, if calls to synchronous functions end up getting chained together. I would generally try and avoid that as a pattern, but knowing you have direct access to an endpoint on the same internal network would make it more palatable.
More broadly, I wonder whether pitching the platform API at the level of a container, or more specifically a Dockerfile, would be a good fit for serverless. App Engine Custom Runtime supports providing a Dockerfile as part of your source upload. App Engine builds the image and starts the container to run your application. I can’t think of a reason why this wouldn’t work as well at the function level.
The functions we have tested so far do very little work, simply connecting to Google Datastore to write/read data and then returning a blob of JSON.
Start-up of cold functions seems particularly poor, often taking 5–7 seconds. This is clearly not an acceptable wait time, so either something is going wrong here, or it’s something Google badly need to improve. It’s not uncommon to implement an external service that hits cloud functions to keep them warm, but that breaks the abstraction of not having to care about the underlying infrastructure, and it isn’t something I want to think about.
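If you do go down the keep-warm route, the function side can at least make those pings cheap by returning before doing any real work. The sketch below assumes an external pinger that sends a made-up `x-warmup` header; neither the header nor `withWarmup` is a GCF feature.

```javascript
// Hypothetical header an external pinger would send; not part of GCF.
const WARMUP_HEADER = 'x-warmup';

// Wrap a handler so warm-up pings return immediately with 204 No Content.
function withWarmup(handler) {
  return function warmable(req, res) {
    // Express-style req.get() looks up a request header case-insensitively.
    if (req.get(WARMUP_HEADER)) {
      return res.status(204).end();
    }
    return handler(req, res);
  };
}
```

This only hides the problem, and you still pay for the ping invocations.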
Firebase includes Google Cloud Functions with some wrappers that support some of their other services and tooling. In fact some of the problems I’ve outlined here have solutions in Firebase.
In general, though, I’m confused about Firebase and how it fits alongside the wider cloud platform. It requires creating Firebase projects, which have a separate and distinct console and a separate CLI with different patterns and paradigms for creating services. Even the exposed API for cloud functions is different. This is fine if you’re building a “Firebase app”, but for Google’s wider serverless platform it really muddies the waters around documentation, community support, and integrations.
We’ve chosen to ignore Firebase for now, hoping that future improvements to Cloud Functions themselves provide similar integrations without introducing the management and process overhead of what’s essentially a third party.
Although Cloud Functions is a long way behind Lambda, I’m hopeful that as the service moves into general availability, Google are able to take the best of what we’ve learnt from Lambda over the last few years and provide a really compelling serverless platform.