The thing about distributed functions
After working with AWS Lambda for about three years, it's hard not to see it as the answer to every architecture problem. It's great: it gives you isolation, scalability, and simplicity, plays very well with event-driven architectures, and brings a bunch of other benefits besides.
One of the main challenges with functions as a service (and with microservices) is figuring out what's going on at a platform level: distributed systems can easily grow out of control, becoming hard to manage and understand.
X-Ray is AWS's de facto solution for debugging and analyzing distributed services. At the time it didn't provide a Node.js SDK, which limited our adoption.
Envoy is the most widely adopted solution for containers, but I couldn't figure out a good abstraction that fits the serverless paradigm.
IOPipe provides tracing, alerting, and profiling. It integrates with lambda functions and has a plugin for the Serverless Framework.
With IOPipe set up, I invoked the function and jumped to IOPipe's dashboard:
You get a centralized view with information like number of function invocations, number of errors, memory and CPU usage, duration, cold start hinting, and aggregated log information.
But to get meaningful information about what's going on at a platform level, you need to instrument the codebase. That means adding markers and labels before and after every method you want to introspect:
const courses = await Courses.list()
const jsonapi = dataSerializer.serialize(courses)
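Here is what that instrumentation looks like in a sketch. The `mark` and `label` helpers are stand-ins for the agent's tracing calls (illustrative names, not IOPipe's actual API), and `Courses` and `dataSerializer` are minimal stubs for the model and serializer in the snippet above.

```javascript
// Stand-in tracing helpers: `label` tags behavior, `mark.start`/`mark.end`
// record a time span around an operation.
const spans = {};
const labels = new Set();

const mark = {
  start: (name) => { spans[name] = { start: Date.now() }; },
  end: (name) => { spans[name].durationMs = Date.now() - spans[name].start; },
};
const label = (name) => labels.add(name);

// Minimal stubs for the model and serializer referenced in the article.
const Courses = { list: async () => [{ id: 1, title: 'Intro to Proxies' }] };
const dataSerializer = { serialize: (data) => JSON.stringify({ data }) };

async function listCourses() {
  label('courses');

  mark.start('courses-list');       // span around the data-layer call
  const courses = await Courses.list();
  mark.end('courses-list');

  mark.start('courses-serialize');  // span around the data mapping step
  const jsonapi = dataSerializer.serialize(courses);
  mark.end('courses-serialize');

  return jsonapi;
}
```

Two lines of business logic turn into eight once the markers are in, which is exactly the overhead the rest of this post tries to eliminate.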
Labels tag related functionality and behavior, helping us answer questions like which functions are using our GraphQL client? and how much time, on average, is session creation taking?
Markers add tracing and time spans to the behavior of the functions, helping us answer questions like what’s making this function slow? or which operations can be improved?
We instrument our models' async methods, external service calls, data-layer calls (ORM, DB clients, SDK calls, etc.), and data mapping methods. That provides enough information to peek into the platform:
In the example we can now see that, in a listing operation, a third of the time goes to authenticating the request token, most of it spent fetching the user's session. Almost half of the execution time goes to getting the data out of the database.
After seeing the data we came up with two straightforward strategies to improve the performance of the function:
- Adding a caching layer for the user's session that only expires when they interact with an API that changes their state: things like favoriting a course or following a topic or lecturer.
- Adding a time to live (TTL) to keep course listings in memory for a reasonable amount of time, avoiding trips to the database.
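The second strategy can be sketched as a small in-memory TTL cache wrapped around the loader. The helper names here (`createTtlCache`, `cached`) are illustrative, not from our codebase.

```javascript
// Minimal in-memory TTL cache: entries older than ttlMs count as misses.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (now() - entry.at > ttlMs) {  // expired: drop the entry and miss
        entries.delete(key);
        return undefined;
      }
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, at: now() });
    },
  };
}

// Wrap a loader so repeated calls within the TTL skip the database.
function cached(loader, ttlMs) {
  const cache = createTtlCache(ttlMs);
  return async (key) => {
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const value = await loader(key);
    cache.set(key, value);
    return value;
  };
}
```

Note that in Lambda an in-memory cache only survives for the lifetime of a warm container, which is usually fine for a short TTL on listing data.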
Instrumenting functions opens up a window to look into the platform, unveiling behavior that would be hard to find and debug otherwise.
But adding labels and markers all through the code gets tiring. It makes the code look contrived, and gives you the feeling that half your code is just instrumentation. 🤔
Last week, Felipe Guizar Diaz and I were preparing our weekly Birds of a Feather architecture session. He was putting together a talk on frontend reactive architectures. We started by talking about how Vue uses getters and setters to implement reactivity. He prototyped a store with event listeners and used setters to notify the store of value changes. Then we moved on to implementing the same reactivity with ES6 Proxy.
That got me thinking about Observability again…
Lyft uses Envoy Proxy to automagically instrument all their microservices. Envoy is an out-of-process proxy used on service-oriented architectures. It handles distributed tracing, metrics, logging, load balancing, service discovery, circuit breaking and retries, TLS terminations, etc.
Everything is abstracted from the applications’ logic by proxying the network. All service traffic flows via Envoy in a consistent, platform-agnostic way.
The Proxy object intercepts all operations of an object: property lookups, assignments, method calls, etc. Anything that happens on the wrapped object can be “seen”.
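A minimal example of that interception: a `get` trap that records every property access (lookups and method calls alike) before delegating to the real object via `Reflect.get`.

```javascript
// Everything that touches `observed` goes through the Proxy's traps,
// so every access can be recorded without changing the target's behavior.
const seen = [];

const target = {
  greet(name) { return `hello ${name}`; },
  answer: 42,
};

const observed = new Proxy(target, {
  get(obj, prop, receiver) {
    seen.push(prop);                          // record the access
    return Reflect.get(obj, prop, receiver);  // then behave as usual
  },
});

observed.greet('proxies'); // a method call fires the get trap for "greet"
observed.answer;           // a plain property lookup fires it too
```

Note that a method call is just a property lookup followed by an invocation, which is why a single `get` trap is enough to observe both.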
I sprinkled some IOPipe magic into the proxy implementation, creating a wrapper object that autogenerates labels and markers from the object methods:
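A sketch of that `Observe` wrapper, with stand-in `label` and `mark` helpers in place of the IOPipe calls (illustrative names, not the agent's real API). It traps method access and returns an async wrapper that labels and times each call under an `object-method` name; for simplicity it awaits every method, so synchronous methods become async in this sketch.

```javascript
// Stand-in tracing helpers; the real ones would come from the agent.
const marks = [];
const label = (name) => marks.push(`label:${name}`);
const mark = {
  start: (name) => marks.push(`start:${name}`),
  end: (name) => marks.push(`end:${name}`),
};

function Observe(target, name = target.constructor.name) {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      const value = Reflect.get(obj, prop, receiver);
      if (typeof value !== 'function') return value;
      return async (...args) => {
        const id = `${name}-${String(prop)}`; // e.g. "CoursesModel-list"
        label(id);
        mark.start(id);
        try {
          return await value.apply(obj, args);
        } finally {
          mark.end(id); // close the span even if the call throws
        }
      };
    },
  });
}

// Hypothetical model for demonstration:
const CoursesModel = {
  async list() { return ['course-a', 'course-b']; },
};

const Courses = Observe(CoursesModel, 'CoursesModel');
```

A real implementation could be smarter about which methods to wrap, but the core idea fits in one trap.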
And wrapped all of our exports with that object:
module.exports = Observe(CoursesModel)
The proxied object behaves exactly the same, but it traps all async calls, adding a label with the pattern object-method, starting a marker before executing each call and ending it afterwards. All this without changing the developer experience.
After removing tracing and labeling calls, the code looks clean. It behaves the same way as it did before while keeping all its observability benefits.
const courses = await Courses.list()
const data = dataSerializer.serialize(courses)
There are some areas where it’s still valuable to manually measure and mark our code, but for the most part this just works and it’s simple.
This approach gives us a consistent practice to reuse across the platform. I still like Envoy's approach a lot more: it's language-agnostic, nobody has to think (or forget) about it, and it brings additional features for free. But using ES6 Proxy to add tracing seamlessly feels like a step in the right direction for our serverless practice.
More resources 📚
- The book Exploring ES6 by Axel Rauschmayer has a very good chapter on metaprogramming with proxies.
- After writing this I found an article in his blog about Tracing method calls via Proxies.
- Maurizio Bonani has an article about using a chainable Proxy for their analytics tracker.
- Gidi Morris has a good article on Using ES6’s Proxy for safe Object property access.
- Matt Klein’s talk From Monolith to Service Mesh: a perfect intro to Envoy from the creator himself; it goes into why to use Envoy and everything it does.
- Datawire’s Envoy Proxy docs have plenty of resources on Envoy: what it is, the service mesh, and how to run it.
- Christian Posta has a good series about Microservices Patterns With Envoy Sidecar Proxy.