Building a simple serverless application status page

Denis Mysenko
Published in Tixel Dev
Nov 15, 2019

Prologue

Practically every startup today has a system status page: typically a simple page with a few charts and an incident history, giving users a quick overview of the current situation, e.g. are there any problems right now? Should I wait and try later?

It’s useful for staff, it’s useful for users. It helps avoid some of the support enquiries (users won’t ask what’s wrong because they already know you are working on resolution).

Some startups make their own pages from scratch, some startups use cloud services. In this blog post I’ll show you how we created a simple, completely serverless (not tied to our infrastructure) status page for Tixel in one hour.

Ingredients

So let’s start. To cook this, I’m going to use:

  • Pingdom checks to take measurements regularly. Note that you could use AWS CloudWatch or something else; I’m just used to Pingdom;
  • AWS S3 bucket to host the compiled (rendered) status page and AWS CloudFront in front of it to make it faster, give it a custom URL and SSL certificate;
  • AWS Lambda function that fetches data from Pingdom using its API and compiles a fresh version of the status page. This function is launched on a schedule using CloudWatch;
  • Chart.js for nice looking charts.

The checks

In our example we will monitor the latency and uptime of the most critical pages and a couple of custom metrics: the number of support tickets still open and the number of concert tickets pending re-issue (meaning there is a customer waiting for a ticket, so if the chart goes up, that’s bad; if it goes down, that’s good).

It’s trivial to add uptime checks on Pingdom and they are literally called uptime checks. All you have to do is to provide a target URL to test and a region to test from. To make custom checks, one has to add special endpoints for Pingdom that return XML (yes, XML in 2019!!):

<pingdom_http_custom_check>
<status>OK</status>
<response_time>0</response_time>
</pingdom_http_custom_check>

The response_time value can be anything; ignore the name. It could be the number of outstanding support tickets, the number of coffees consumed by the office in a day, or anything else. If the “status” is not “OK”, Pingdom will start panicking and texting you.
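A custom check endpoint only has to return that small XML document, so the body can be produced by a tiny helper. The sketch below is one possible way to do it; the function name buildCustomCheckXml and the "FAILED" string are hypothetical (any status other than "OK" is treated as a failure), and the number 17 is a made-up sample.

```javascript
// Builds the XML body for a Pingdom custom HTTP check endpoint.
// `value` is whatever number you want charted; it is reported in the
// response_time field. `healthy` controls the status field.
function buildCustomCheckXml(value, healthy = true) {
  if (!Number.isFinite(value) || value < 0) {
    throw new RangeError('value must be a non-negative number');
  }
  // "FAILED" is an arbitrary choice; anything other than "OK" signals a problem.
  const status = healthy ? 'OK' : 'FAILED';
  return [
    '<pingdom_http_custom_check>',
    `<status>${status}</status>`,
    `<response_time>${value}</response_time>`,
    '</pingdom_http_custom_check>',
  ].join('\n');
}

// Made-up example: 17 support tickets are currently open.
const xml = buildCustomCheckXml(17);
console.log(xml);
```

Whatever web framework serves the endpoint would send this string back with an XML content type.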

Once the checks are in place, Pingdom will start monitoring and the data will be available through the Pingdom API.

Once again, you could do this using CloudWatch, which has an API too, or any other monitoring solution.

Render

Now that we have data available via the API, let’s display it in a nice way.

Let’s create a Node.js Lambda function with one extra file inside, “template.html” (the source template), and one trigger, a CloudWatch event.

My CloudWatch event fires every 5 minutes, based on the AWS schedule expression rate(5 minutes).

The template.html file is the status page we want to display, with placeholders in place of actual data, e.g.:

<div class="m-widget4__info pl-0">
<span class="m-widget4__title">
Average latency
</span>
<span class="m-widget4__ext">
<span class="m-widget4__number tixel-color-pink">
%event_average%ms
</span>
</span>
</div>

The template also contains JavaScript placeholders, which will soon be replaced with arrays:

var cityPageLatency = %city_chart_data%;
var eventPageLatency = %event_chart_data%;
var pendingTicketCounters = %locked_chart_data%;
var supportTicketCounters = %support_chart_data%;

It is the job of our Lambda function to read this file, get real data via the API and replace all %% placeholders with actual numbers or arrays of numbers.
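The substitution step can be sketched in a few lines. This is a minimal illustration rather than the original code: the function name renderTemplate is hypothetical, the sample numbers are made up, and it assumes placeholder names consist only of lowercase letters and underscores, as in the snippets above.

```javascript
// Replaces every %name% placeholder in the template with values[name].
// Arrays and objects are serialised as JSON so they can be dropped straight
// into a <script> block; everything else is converted with String().
function renderTemplate(template, values) {
  return template.replace(/%([a-z_]+)%/g, (match, name) => {
    // Fail loudly instead of publishing a page with a raw placeholder in it.
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error(`No value supplied for placeholder ${match}`);
    }
    const value = values[name];
    return typeof value === 'object' ? JSON.stringify(value) : String(value);
  });
}

// Made-up sample data for two of the placeholders shown above.
const template = [
  '<span>%event_average%ms</span>',
  '<script>var eventPageLatency = %event_chart_data%;</script>',
].join('\n');

const html = renderTemplate(template, {
  event_average: 182,
  event_chart_data: [170, 195, 181],
});
console.log(html);
```

Throwing on an unknown placeholder is a design choice: a failed run leaves the previously rendered page in place, which is usually preferable to publishing a half-filled one.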

My function turned out to be less than 100 lines. There was no need to write “amazing” code because 1) it’s something you almost never have to maintain, and 2) users only access the rendered static HTML file, while this function runs asynchronously in the background.
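As a rough idea of the overall flow, here is a minimal sketch of how such a function might be organised. It is not the original code. All names (summarise, createHandler, checks, readTemplate, fetchSamples, render, upload) are hypothetical, and it assumes each sample returned by the monitoring API carries an average response time in a field called avgresponse. All I/O is passed in as functions so the flow can be read and tried without AWS or Pingdom credentials.

```javascript
// Turns a list of { avgresponse } samples into the values the page needs:
// a rounded overall average and the raw series for the chart.
function summarise(samples) {
  const points = samples.map((s) => s.avgresponse);
  const total = points.reduce((sum, p) => sum + p, 0);
  const average = points.length ? Math.round(total / points.length) : 0;
  return { average, points };
}

// Builds the scheduled handler from injected dependencies:
//   checks               maps a placeholder prefix (e.g. "event") to a check id
//   readTemplate()       -> Promise<string>  contents of template.html
//   fetchSamples(id)     -> Promise<Array>   recent samples for one check
//   render(tpl, values)  -> string           placeholder substitution
//   upload(html)         -> Promise          e.g. an S3 putObject call
function createHandler({ checks, readTemplate, fetchSamples, render, upload }) {
  return async function handler() {
    const template = await readTemplate();
    const values = {};
    // Fetch all checks in parallel and collect the placeholder values.
    await Promise.all(
      Object.entries(checks).map(async ([prefix, checkId]) => {
        const { average, points } = summarise(await fetchSamples(checkId));
        values[`${prefix}_average`] = average;
        values[`${prefix}_chart_data`] = points;
      })
    );
    const html = render(template, values);
    await upload(html);
    return { bytes: Buffer.byteLength(html) };
  };
}
```

In a real Lambda deployment the function returned by createHandler would be assigned to the module's exported handler, with the injected functions wrapping the file system, the monitoring API and S3.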

Now we have to give this Lambda function permission to write to the S3 bucket (whose name is passed in through an environment variable), enable public access to the S3 bucket, and finally create a CloudFront distribution in front of it (this enables extra perks like IPv6 and SSL on top of caching of static assets).
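On the function side, the upload boils down to a single S3 putObject call. The sketch below only builds the parameter object for it; the function name buildUploadParams, the variable name STATUS_BUCKET and the index.html key are assumptions made for illustration.

```javascript
// Builds the parameters for the S3 putObject call that publishes the page.
// STATUS_BUCKET is a hypothetical environment variable name.
function buildUploadParams(html, env = process.env) {
  const bucket = env.STATUS_BUCKET;
  if (!bucket) {
    throw new Error('STATUS_BUCKET environment variable is not set');
  }
  return {
    Bucket: bucket,
    Key: 'index.html', // assumed object key for the rendered page
    Body: html,
    // Without an explicit type, browsers may download the page
    // instead of rendering it.
    ContentType: 'text/html; charset=utf-8',
  };
}

// Example with an explicit environment object instead of process.env.
const params = buildUploadParams('<html></html>', {
  STATUS_BUCKET: 'status.example.com',
});
console.log(params.Bucket, params.Key);
```

With the AWS SDK for JavaScript (v2) these parameters would be handed to the S3 client's putObject method, and the function's execution role needs the s3:PutObject permission on that bucket for the call to succeed.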

Some hints & suggestions

  • Your S3 bucket name must match your desired URL. In our case it’s status.tixel.com;
  • Set edge expiration for static CSS/JS/image assets to 365 days;
  • Use tree-shaking tools (e.g. PurgeCSS) on your CSS/JS assets to make a teeny tiny status page.
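One way to express the 365-day expiration is a Cache-Control header set on each object at upload time, which CloudFront takes into account within the distribution's TTL limits. The helper below is a sketch under that assumption; the function name cacheControlFor, the list of extensions and the 5-minute lifetime for everything else are illustrative choices.

```javascript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31,536,000

// Picks a Cache-Control value by file extension: long-lived for static
// assets, short-lived for anything else (such as the re-rendered page).
function cacheControlFor(key) {
  const isStaticAsset = /\.(css|js|png|jpe?g|gif|svg|ico|woff2?)$/i.test(key);
  return isStaticAsset
    ? `public, max-age=${ONE_YEAR_SECONDS}`
    : 'public, max-age=300';
}

console.log(cacheControlFor('css/app.css'));
console.log(cacheControlFor('index.html'));
```

With year-long caching, a changed asset needs a new file name (for example a version or hash in the name), otherwise visitors may keep seeing the old copy.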

Cost

  • We run the Lambda every 5 minutes and on average the execution time is 4 seconds. That’s 8,640 invocations a month, or 34,560 seconds a month, which is far lower than the free tier of 3,200,000 seconds a month (the free-tier figure for a 128 MB function), so this is free for us;
  • Pingdom charges $45 a month, which is, I guess, a bit expensive considering how trivial the monitoring is. You could save most of this cost by using CloudWatch. Note that even paid status page solutions normally won’t monitor anything for you, so you would still have to pay for Pingdom or something similar;
  • S3 and CloudFront costs are negligible with this kind of bandwidth and long-term caching.
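The Lambda numbers above can be double-checked with a few lines of arithmetic. The sketch assumes a 30-day month and a function configured with 128 MB of memory, which is the setting at which the 400,000 GB-second free tier corresponds to 3,200,000 seconds.

```javascript
const MINUTES_PER_MONTH = 30 * 24 * 60;     // assuming a 30-day month
const invocations = MINUTES_PER_MONTH / 5;  // one run every 5 minutes
const billedSeconds = invocations * 4;      // about 4 seconds per run

// Free tier: 400,000 GB-seconds per month. 128 MB is an assumed setting.
const memoryGb = 128 / 1024;
const freeSeconds = 400000 / memoryGb;

const shareOfFreeTier = billedSeconds / freeSeconds;

console.log(invocations);     // 8640
console.log(billedSeconds);   // 34560
console.log(freeSeconds);     // 3200000
console.log((shareOfFreeTier * 100).toFixed(2) + '% of the free tier');
```

At roughly one percent of the free compute allowance, the schedule could run considerably more often before Lambda itself started to cost anything.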

Result

Voilà! We built a super fast, fully responsive and good-looking (in my opinion anyway) status page.

Even if everything goes bananas and the whole of our infrastructure collapses, this page will stay alive since it’s not connected to anything of ours (it’s only connected to external, third-party stuff). Which is exactly what we wanted from a status page.

Denis Mysenko
Tixel Dev

CTO and Co-Founder at Tixel, a passionate software artisan, aikidoka and scuba diver