Track all the Statuses!

Shane Fast
Published in BACIC
7 min read · Jan 31, 2022

I’ll level with you. Some days everything breaks. You’ll wake up, and it feels like half of the internet is in disarray. Users won’t be able to log in or use a huge chunk of your app, and the issue will be out of your direct control. Panic sets in; your heart rate rises.

You’ll realize that we stand on the shoulders of giants (AWS, Zendesk, Salesforce, etc.), but sometimes one of those giants will stub their toes on a big rock. The sickening crack will be audible across the horizon as it begins to echo in atrocious harmony with the following scream. That giant will jump up and down in horrible pain as the others vary in their reaction, from trying to help to laughing and mocking heartily. All the while, you and your friends will tumble about, holding on for dear life, and your friend Roy begins to fall as his tenuous grip fails.

You’ll attempt to catch him with your free arm, but in vain, your reach is just short, and your eyes lock momentarily with his at this ill-fated juncture. His eyes… his eyes are all you think about for weeks after as they live on every night in your dreams.

“Why couldn’t you catch me?” he’ll say, shrinking slowly into that deep abyss.

“Why couldn’t you catch me…?”

Roy falling, artist’s rendition. I hope his last moments were peaceful.

Things will return to normal, and everyone reassures you that it couldn’t be helped, but you relive the moment over and over in your mind. You turn to drugs, alcohol, and anime to silence the noise. Several months later, you find yourself wasting away wondering what went wrong.

.

.

.

Well, we have the solution for you! Though, full disclosure, Roy is long dead and is never coming back; guts splattered finely across the rocky formations on the ground, grisly stuff, very informative. However, you can prepare for such a catastrophe in the future. You can gather all the service statuses you hold dear into one spot and get a live glimpse that everyone on your team can see.

Is one of those giants about to sneeze? Thanks to this, everyone can dig in and hold on tight for the storm ahead. Let’s see how it works:

Take inventory

First, we need to get a solid list of services key to your team’s operation. Show your initiative and create a list with the obvious stuff, then get after some other folks around the organization to round that list out. This list doesn’t have to be perfect or complete. I promise you’ll hear all the complaints about what’s missing after you complete this whole process. Much more efficient.

Split services into tiers

Do this part however you want, but we split ours into two groups. One means “Panic,” and the other “Panic More!”. One lists the essential services that our app uses live in production for critical functions (AWS, Zendesk), while the other is for all the auxiliary services that wouldn’t be an immediate threat to our users’ activities (Bitbucket, Hubspot). All important, but some are more importanter than others!

Those who know, know.
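If you like your inventory written down somewhere sturdier than a napkin, here is a minimal sketch of the two tiers as data. The service names come from the examples above; the `Tier` enum, the `INVENTORY` dict, and `services_in` are made up for illustration, and mapping the essential services to “Panic More!” is my assumption about which label goes with which group.

```python
from enum import Enum


class Tier(Enum):
    # Assumption: essential production services are the "Panic More!" tier.
    CRITICAL = "Panic More!"
    AUXILIARY = "Panic"


# Hypothetical inventory, seeded with the examples named in the post.
INVENTORY = {
    "AWS": Tier.CRITICAL,
    "Zendesk": Tier.CRITICAL,
    "Bitbucket": Tier.AUXILIARY,
    "Hubspot": Tier.AUXILIARY,
}


def services_in(tier):
    """Return the service names in a tier, sorted alphabetically."""
    return sorted(name for name, t in INVENTORY.items() if t is tier)


if __name__ == "__main__":
    for tier in Tier:
        print(f"{tier.value}: {', '.join(services_in(tier))}")
```

Keeping the list in one place like this makes it easy to add whatever your colleagues complain is missing later.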

Create a dedicated Slack channel

Add the RSS feed and status page integration (follow this handy guide). It helps if you’re the Slack admin or can work closely with this person — otherwise, you’re just bashing rocks together.

Begin the wild hunt

The first thing you’ll need to contend with here is finding the status page for each item on your list (if one exists). You can try googling “<name of service> status”, or try replacing the subdomain of the service with “status”, as in “https://status.<name of service>.com”.
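If your list is long, you can generate those two guesses in bulk rather than typing them out. This is only a sketch: `status_page_guesses` is a hypothetical helper, it assumes the service lives on a `.com` domain named after itself, and the URL it produces is a guess to try in a browser, not a confirmed status page.

```python
import re


def status_page_guesses(service_name):
    """Build the search query and the status-subdomain guess for a service."""
    # Lowercase the name and drop anything that can't appear in a hostname label.
    slug = re.sub(r"[^a-z0-9-]", "", service_name.lower())
    return {
        "search_query": f"{service_name} status",
        # Assumption: the service uses <name>.com; many do not.
        "url_guess": f"https://status.{slug}.com",
    }


if __name__ == "__main__":
    for name in ["Zendesk", "Bitbucket", "Hubspot"]:
        guesses = status_page_guesses(name)
        print(guesses["search_query"], "->", guesses["url_guess"])
```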

Assuming you find the status page, there are a few typical flavours you will come across. Also, save the URL for later (there is a handy trick you can do once you’ve got all the status sites lined up):

Type 1 - easiest, direct Slack integration

A status page that allows for direct Slack subscription. Click the button, select the types of notifications you want for that service, authenticate, and you’re golden.

I trust you have this one well in hand. I’ll be disappointed otherwise.

Type 2 - easy but can be tedious, RSS

A status page that doesn’t make a direct Slack connection but offers an RSS feed. Navigate to the Slack RSS page you set up earlier and paste the link in. As avid AWS customers, we ended up with dozens of critical RSS feeds, individually curated for many services and regions. I feel this is great for thinning out a lot of the riff-raff notifications you won’t need, but be prepared for a bit of monotonous clicking to catch them all.

If you find these images condescending, they are.
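Before pasting a feed into Slack, it can help to peek at what it actually publishes so you know how noisy it will be. Below is a rough sketch using only Python’s standard library. It assumes a plain RSS 2.0 feed (Atom feeds use `<entry>` instead of `<item>` and are not handled), and `SAMPLE_FEED` is invented data, not a real provider’s feed.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 snippet for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Service Status</title>
    <item>
      <title>Service degraded</title>
      <link>https://status.example.com/incidents/1</link>
      <pubDate>Mon, 31 Jan 2022 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return (title, link, pubDate) tuples for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("title", default="").strip(),
            item.findtext("link", default="").strip(),
            item.findtext("pubDate", default="").strip(),
        )
        for item in root.iter("item")
    ]


def fetch_rss_items(feed_url, timeout=10):
    """Download a feed and parse it; feed_url is whatever link the status page gives you."""
    with urllib.request.urlopen(feed_url, timeout=timeout) as resp:
        return parse_rss_items(resp.read())


if __name__ == "__main__":
    for title, link, published in parse_rss_items(SAMPLE_FEED):
        print(published, "|", title, "|", link)
```

If a feed turns out to be mostly routine maintenance notices, that is a hint it belongs in the quieter tier, or nowhere.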

Type 3 - the rare email-only subscription status pages

Sometimes you’ll find no other options exist on the service provider except email. No problem! Within Slack, you can add a dedicated email address for a channel. Since we have created a dedicated channel for status notifications (you did that, right?), you can have Slack generate an email address whose messages appear only in that channel. Use that address to subscribe to the status page, and you’ll see notifications appear in Slack.

Subscribe to the email alerts using the email in the channel settings.
Here’s what it looks like in practice. I censored it out, but it was Zendesk.

Type 4 - status page with no method of subscribing, or service with no status page at all

While it is still good to keep the link to these status pages, they are lumped in together with the non-existent status pages because they are as helpful as ejection seats on a helicopter. If things start to crash, there is a lever to pull, but no one will hear the pilot’s report afterward. These can be tricky, and at best, you’ll only be able to get a cursory glance at their status at a distance.

In all honesty, unless the service provider is essential to your product’s operation (a definite critical service), I wouldn’t bother with the next steps.

First, ask yourself if there is either an endpoint or a login page for the service that is easy to ping and gives you a good sense that things are at least reachable. Ok good. Now you may want to check that there are no legal or security issues with hitting that endpoint or page, because we will be doing so once every minute or so with a pinging service (ask permission first; be better than me).

We happened to use Hyperping for the job, but there are many others out there, or you can build your own (just as long as there is a way for failures to appear back in Slack). Set up a monitor or ping to that endpoint or login page, and you’ll have yourself a desperate man’s status page.

This is how we do.
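If you go the build-your-own route, a bare-bones version might look like the sketch below. This is not how Hyperping works, just one way to do a reachability check with the standard library. It assumes you have created a Slack incoming webhook for your status channel; the webhook URL, the service name, and the function names here are all placeholders.

```python
import http.client
import json
import urllib.request


def is_reachable(url, timeout=10):
    """True if the URL answers with a 2xx/3xx status; False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError, HTTPError (4xx/5xx), and timeouts;
        # ValueError covers malformed URLs.
        return False


def build_slack_payload(service, url):
    """Encode the JSON body that Slack incoming webhooks accept."""
    return json.dumps({"text": f"{service} looks unreachable: {url}"}).encode("utf-8")


def notify_slack(webhook_url, payload):
    """POST the payload to a Slack incoming webhook (placeholder URL)."""
    req = urllib.request.Request(
        webhook_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status == 200


def check_once(service, url, webhook_url, probe=is_reachable, notify=notify_slack):
    """Probe the URL once; on failure, send an alert to Slack. Returns the probe result."""
    if probe(url):
        return True
    notify(webhook_url, build_slack_payload(service, url))
    return False
```

Run `check_once` from cron or a loop with a one-minute sleep and you get roughly the cadence described above. A real monitor would also want retries and some de-duplication so a single outage doesn’t flood the channel every minute.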

Organize your carefully gathered links

There are a few ways you could go about it, but we opted to create an aggregate link list using an extension called OneTab for each service tier. Once we did that, we added each link to the Slack channel’s bookmarks.

Easy access links to each page. I censored the specifics, except to show that I almost got the alphabetical order correct. Sigh... someday.

Announce, boldly, your grand achievement

Seize that rare opportunity to use the @channel power to proclaim your excellence and invite the team to partake in your constant anxiety.

My only regret is not doing this at 3 am.

Repeat the above steps until the complaints stop

Capture all the things you missed!… or don’t, I’m not your boss.

All jokes aside, this helps democratize status checks and adds a self-help element to them. Your team can feel empowered to pop into the channel as needed to see if their favourite/critical tools are having issues, without chasing down the status page after the fact or hunting for an obscure update from the service provider.

If you found this valuable or entertaining, please follow the blog, where I’ll continue to post more tech goodness. Thanks for reading!
