State of serverless observability in 2018
Serverless has been the topic of a lot of discussions in the past 12 months and, I have to admit, I was pretty much starstruck the second I looked into the technology. So naturally, when given the chance to test it out, I dove right in. Let me tell you what happened next.
If you happen to follow me, you know that I love frontend development and I love my job as a frontend developer. So when serverless promised a way to sidestep all the messy backend nonsense like server management, security updates, and scaling concerns, saying that my spidey senses were tickled would be an understatement.
So I got to testing and found out that setup is extremely simple. After installing the Serverless Framework and logging in to AWS, I had my first Express.js site up and running on Lambda in a matter of minutes. I followed this article on deploying a serverless website and I could probably have deployed the entire thing in 15 minutes, as advertised, but I took some time to figure out the AWS login thing. Still, having a website running for what is basically free, with the AWS Free Tier, in so little time is amazing.
But that was a few months ago. I have since gotten more into serverless, and while there are many reasons for switching from a traditional server to serverless, there are some drawbacks that you need to take into account. People who have used serverless complain about limitations or cold starts, but I believe that those don’t qualify as drawbacks. Limitations are in place to keep you from writing bad code that runs for minutes on end and costs you a fortune, and cold starts aren’t the evil they are often painted as. Cold starts are the reason behind the great scaling that you get with serverless: for new containers to spin up as demand grows, idle ones need to die off. It’s that easy.
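To make the cold-start idea a bit more concrete, here is a minimal Python sketch of how a function can tell whether it is running in a fresh container. It relies on the fact that module-level code runs once per container while the handler runs once per invocation. The handler name and the returned fields are my own invention for illustration, not part of any monitoring product.

```python
import time

# Module-level code runs once per container, i.e. only on a cold start.
_CONTAINER_STARTED_AT = time.time()
_is_cold_start = True


def cold_start_handler(event, context):
    """Hypothetical Lambda handler that reports whether this call was a cold start."""
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False  # every later call in this container is warm
    return {
        "cold_start": cold,
        "container_age_seconds": round(time.time() - _CONTAINER_STARTED_AT, 3),
    }
```

The first invocation in a container reports a cold start and every later one reports a warm start, until the container is retired and the next request lands in a new one.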
So what drawbacks was I talking about earlier? I’m glad you asked! My main complaint about serverless technology is the lack of observability in the app. Debugging can be hellish, and while we do get access to AWS CloudWatch, which can help in that regard, it won’t really help you when you have hundreds of functions to sort through. The solution? Third-party serverless monitoring platforms.
The best serverless observability platforms:
While there are a lot of monitoring, analytics and observability platforms for serverless, I’ll only go over the ones I’ve tested and feel comfortable talking about. So without further ado, I’ll begin with the one I like most and then list the others in no particular order.
Dashbird — “an observability tool for the masses”
Disclaimer: I’m not getting paid to write this but if anyone gets any funny ideas my Paypal is: firstname.lastname@example.org
While new to the whole serverless game (although you could argue that the entire game is new), Dashbird has carved out a nice chunk of the serverless monitoring space with an easy-to-use service and an intuitive UI, no bells or whistles. It’s exactly what I need to get to the bottom of those pesky bugs within my app without breaking a sweat. They made it simple; read more here.
So I started out with the free trial. They offer a free tier right off the bat, but I really wanted to try out everything the app has to offer, and they did not disappoint. With practically no limit on the size of my AWS logs, I started the import at around 13:25 and about 5 minutes later I had my dashboard up and running. I really don’t know what I was expecting in terms of setup time, but I figured it would take more than 5 minutes to import my entire CloudFormation stack.
I must admit that before I started looking into a third-party solution for monitoring my application, I was worried about the overhead those services might add to my functions. I’m talking both time and money here, since AWS bills me for both the invocations and the execution time. I was pleasantly surprised to find out that Dashbird adds ZERO overhead. There’s no wrapper I need to add and no code to change. All the data comes from the logs they get from my AWS account.
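To give a rough idea of how much can be learned from logs alone, here is a small Python sketch that parses the REPORT line Lambda writes to CloudWatch at the end of every invocation. I don’t know how Dashbird processes logs internally, so treat this purely as an illustration. The function name, the field names and the sample request ID are made up.

```python
import re

# Matches the REPORT line that Lambda appends to the log after each invocation.
REPORT_PATTERN = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+)\s+"
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>\d+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_memory_mb>\d+) MB"
)


def parse_report_line(line):
    """Return the invocation metrics found in a REPORT line, or None for other lines."""
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    return {
        "request_id": match.group("request_id"),
        "duration_ms": float(match.group("duration_ms")),
        "billed_ms": int(match.group("billed_ms")),
        "memory_mb": int(match.group("memory_mb")),
        "max_memory_mb": int(match.group("max_memory_mb")),
    }


# Made-up sample line in the usual tab-separated layout.
SAMPLE_LINE = (
    "REPORT RequestId: 3f2a1c9e-0000-4000-8000-000000000000\t"
    "Duration: 102.25 ms\tBilled Duration: 200 ms\t"
    "Memory Size: 128 MB\tMax Memory Used: 35 MB"
)

metrics = parse_report_line(SAMPLE_LINE)
```

Duration, billed duration and memory usage are all there without touching the function’s code, which is why a log-based approach adds nothing to execution time.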
I know what you are thinking: “Yo Ben, how come you agreed to give your AWS info to some dudes you don’t know?” I’d be remiss not to mention that the privacy and security concerns bugged me at first, so without hesitation I jumped on their support channel and asked about them. I was expecting some pushback, but I got to talking with their support staff (awesome people, I have to say!) and they explained exactly what data of mine they store and how they manage it, clearly enough that I felt comfortable with them. Here’s a link that describes the entire process.
They’ve recently added API Gateway support, which adds a new layer of visibility into your app. I find it really useful; it is perhaps my favorite feature besides their alerting.
- Great UI
- Awesome support
- API Gateway integration is a big plus
- No overhead to execution time or cost
- Setup is easy and painless
- Free tier is great
By this point, you can probably tell I’m a big fan of Dashbird but to be honest there are other services out there that have great products too. While I can’t test every single one, I’ll share my thoughts on the ones I did test.
IOPipe is great. They do pretty much everything, from alerting and real-time metrics to profiling and tracing. They have very useful webhooks for alerting that can send messages through Slack, email, and PagerDuty, which helps you sleep easy at night, knowing that you’ll get notified the second your system hits an error.
The way they work is that you wrap your function in a small bit of code that communicates with the IOPipe app and monitors the health of your serverless application. They have great documentation on how to get started and I found it easy to deploy the newly wrapped functions, but I also found that (at least in my case) the extra bit of code added to the function’s execution time, which, at least in theory, made my AWS Lambda bill more expensive. I know this wouldn’t necessarily be a deal breaker for most people, and most services use this setup to monitor your serverless applications, so it is by no means something new.
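To show what wrapping a function means in general, here is a minimal Python sketch of the pattern. This is NOT IOPipe’s actual API. The `monitored` decorator, the `report` callback and the metric fields are hypothetical names I made up to illustrate the idea.

```python
import functools
import time


def monitored(report):
    """Hypothetical wrapper factory: `report` is any callable that accepts a metrics dict."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event, context):
            started = time.perf_counter()
            error = None
            try:
                return handler(event, context)
            except Exception as exc:
                error = repr(exc)
                raise  # the caller still sees the original exception
            finally:
                # Runs inside the invocation, so it counts toward billed time.
                report({
                    "function": handler.__name__,
                    "duration_ms": (time.perf_counter() - started) * 1000.0,
                    "error": error,
                })
        return wrapper
    return decorator


collected = []  # stand-in for a real reporting backend


@monitored(collected.append)
def hello_handler(event, context):
    return {"statusCode": 200, "body": "hello"}
```

Whatever `report` does, for example a network call to a monitoring service, happens before the invocation finishes. That is where the extra execution time comes from, and it is also the code you would have to strip out of every function if you ever changed tools.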
When considering IOPipe, or any service that adds extra code to your functions, you should take into account that if you later decide to switch monitoring solutions, you will have to remove that bit of code from every function and redeploy the app. Again, not a huge deal, especially since IOPipe is very helpful in that regard.
SignalFX is another awesome tool that provides a lot of useful insight into your serverless application, including alerting, optimizing, cold start detection, instant discovery, custom metrics, and filtering, amongst others.
Just like IOPipe, SignalFX adds a wrapper around your function that connects asynchronously to their service, collecting data while monitoring your application. It is prone to the same downsides as IOPipe in terms of overhead and that pesky “monitoring solution vendor lock-in” that has you editing every function and redeploying it to remove the wrapper from your stack. On the other hand, SignalFX works with most of the BIG FaaS players in the market right now, from AWS Lambda and Google Cloud Functions to Microsoft Azure, so it will basically allow you to monitor a serverless app that is deployed on multiple platforms. This might not be of consequence to most people, as most will stick with AWS, but I figured it was still worth mentioning.
While there are lots of service providers in the serverless monitoring space, I’ve only had a chance to have a good look at these 3, but I look forward to looking at the other ones in the near future.
Choosing one that works for you is more of a personal preference. These are just my thoughts and I’m sure not everyone agrees, but I would highly recommend you look at as many options as possible before choosing one, as I believe switching between them later on can be tricky, especially between the ones that require redeploying your code.