How to monitor a serverless chatbot (for Hangouts Chat)

Mike
Dec 26, 2018

Visit http://www.mikenikles.com for my latest blog posts.

Introduction

This post is part of a series where you can learn how to develop, monitor (and later debug and profile) a serverless chatbot (for Hangouts Chat). Please refer to the other posts linked below for context.

Monitoring

Congratulations again, your first bot is deployed and ready to be used. Time to rest… well, not quite. It’s time to learn a few more Google Cloud tips & tricks to make sure the chatbot runs smoothly.

Let Google Cloud notify you when errors occur.

I recommend you install the Cloud Console mobile app (Android & iOS). It sends push notifications for new errors so you and your team can respond immediately.

Error Reporting

To view and manage errors, open the Google Cloud Console and follow these steps:

  1. In the navigation, click Error Reporting.
  2. On the right-hand side, click the blue “Turn on notifications” button.

[Screenshot: Error Reporting & Management]

In addition, you can click on individual errors to see more details:

[Screenshot: Error details]

As you analyse and resolve issues, change the resolution status at the top right from “Open” to “Acknowledged” and so on. If you use third-party project management software such as ClickUp, you can link an error to an issue for visibility across engineering and business teams.

Logging

Centralised logging is available in your Cloud Console navigation by clicking Logging. The default view lets you narrow the logs down to a very granular level, such as an individual Cloud Function’s INFO messages for a given time period, or stream them in real time by clicking the play button at the top.

For more advanced searches, switch to the advanced filter by clicking the dropdown arrow to the right of the filter input field. What’s nice is it converts your basic filter to an advanced filter, which you can use as a starting point for further customisation.
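As a rough illustration of what an advanced filter looks like, the Python sketch below assembles one for a single Cloud Function’s INFO messages since a given time. The helper name build_log_filter is hypothetical, and the sketch assumes the function’s logs use the cloud_function resource type with a function_name label; compare it with the filter the console generates for you before relying on it.

```python
from datetime import datetime, timezone


def build_log_filter(function_name: str, severity: str, since: datetime) -> str:
    """Assemble an advanced logs filter for one Cloud Function.

    Hypothetical helper for illustration; `since` must be timezone-aware.
    """
    since_utc = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    clauses = [
        # Assumption: Cloud Functions logs carry this resource type and label.
        'resource.type="cloud_function"',
        f'resource.labels.function_name="{function_name}"',
        f"severity={severity}",
        f'timestamp>="{since_utc}"',
    ]
    # Every clause has to match, so they are joined with AND.
    return " AND ".join(clauses)


print(build_log_filter(
    "reminder-bot-checker",
    "INFO",
    datetime(2018, 12, 26, tzinfo=timezone.utc),
))
```

You could paste the printed string into the advanced filter field and refine it from there.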

Logs-based metrics: Now it gets powerful! Logs-based metrics are among my favourite little helpers in Google Cloud Platform. Let’s say you have a Cloud Function that you expect to run once a minute (such as the reminder-bot-checker function we built for the chatbot). We can create a logs-based metric to make sure that function runs every minute. First, let’s create a custom metric that looks like this:

[Screenshot: A custom metric to filter successful Cloud Function executions for the reminder-bot-checker.]
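For orientation, a metric like this boils down to a name, a description and a filter. The Python sketch below builds such a definition. The metric name and the text matched in textPayload are assumptions about how a successful run shows up in the logs, not values taken from this series, so adjust them to the log lines you actually see for your function.

```python
import json


def build_metric_definition(
    function_name: str,
    # Assumption: a successful run logs a line containing this text.
    success_text: str = "finished with status code: 200",
) -> dict:
    """Describe a logs-based metric that counts successful runs of one function."""
    log_filter = " AND ".join([
        'resource.type="cloud_function"',
        f'resource.labels.function_name="{function_name}"',
        # The ":" operator matches log entries whose text contains the value.
        f'textPayload:"{success_text}"',
    ])
    return {
        "name": f"{function_name}-executions",  # hypothetical metric name
        "description": f"Successful executions of {function_name}",
        "filter": log_filter,
    }


print(json.dumps(build_metric_definition("reminder-bot-checker"), indent=2))
```

The filter value is what goes into the metric’s filter field; the name is what you search for later when you set up the alert.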

So far so good, but that doesn’t really help us much yet. We want to monitor that one of these log statements is printed exactly once per minute. Luckily, Google Cloud Platform provides a (more or less) one-click solution for that. Read on.

Alerting

Click on Monitoring in the Google Cloud Console, which opens Stackdriver. You have to configure a few things if it’s your first time opening it, but it’s straightforward.

Once Stackdriver loads, click on Alerting > Create a Policy. There are four sections to fill out, so let’s go through them one by one.

Conditions: Click Add Condition. In the Target section, type “reminder-bot” and select the one entry available. Next, set the Aligner to count. Lastly, at the bottom in the Configuration section, set the condition to “is absent” | “3 minutes”. Save the condition.
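To make the “is absent” for “3 minutes” setting concrete, here is a simplified Python model of what the condition checks: it fires when no matching log entry arrived during the last three minutes. This is only a sketch of the idea, not how Stackdriver evaluates conditions internally, and the function name is_absent is made up for this example.

```python
from datetime import datetime, timedelta


def is_absent(executions, now, window=timedelta(minutes=3)):
    """Return True when no execution was logged in the `window` before `now`."""
    window_start = now - window
    return not any(window_start < ts <= now for ts in executions)


now = datetime(2018, 12, 26, 12, 10)

# The function ran every minute, most recently one minute ago.
healthy = [now - timedelta(minutes=m) for m in range(1, 6)]

# The last run was four minutes ago, e.g. because the scheduler job is gone.
stalled = [now - timedelta(minutes=m) for m in range(4, 9)]

print(is_absent(healthy, now))  # False: no alert
print(is_absent(stalled, now))  # True: the alert would fire
```

A three-minute window tolerates one or two delayed runs, so a short hiccup does not page anyone.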

Notifications: Select Email and provide your dev team group email alias, then click Add Notification Channel. For details on all notification options, please refer to the documentation.

Documentation: Another neat feature that provides a lot of value. Instead of leaving recipients to guess what’s going on when an alert triggers a notification, use this field to tell them what happened and how to fix what’s broken. You can use both Markdown and variables to be as specific as possible.

Name this policy: I recommend agreeing on a naming convention. Boring, predictable names make life so much easier as your team grows, e.g. “Reminder Bot Checker - Test Regular Execution”. Remember to save the policy.

Validation: This is great. We can now be confident the system will notify us when the reminder-bot-checker function has not run successfully for three minutes. Google Cloud Platform is incredibly powerful and reliable, but let’s still make sure the alert and notification mechanism works as intended.

The simplest way I can think of to test that is to delete the Cloud Scheduler reminder-checker job. Don’t worry, we have a deploy command available in packages/reminder-bot-scheduler to quickly create the job again. To delete the job, run:

  gcloud beta scheduler jobs delete reminder-checker

Give it at least 3 minutes and check your email. Nice, eh? You get details on what failed, including your notes on how to fix the specific issue. By the way, as you would expect, the Stackdriver dashboard also alerts you to the issue.

Next, let’s start the scheduler job again to get everything back to normal:

  npm run deploy:reminder-bot-scheduler

Give it another few minutes and you’ll get an email letting you know it’s all good.

Summary

With a few clicks, we created an alerting policy that makes sure the bot checks for reminders every minute. We also enabled error notifications and installed the Cloud Console mobile app. Together, these help you stay on top of your chatbot once it runs in production.
