Hackathon projects for fun and profit: Building a Slack/PagerDuty chat-bot

Gorjan Zajkovski
Published in Inside BUX
6 min read · Jan 17, 2020
BUX Office, Amsterdam

Background

As an engineer, one of the things I love most is automation. Whether it’s a command-line script that runs some monotonous task or another piece of code that saves me time by streamlining work that needs to get done, the opportunities present themselves every day.

It’s even better if we can combine solving such a problem with learning something new.

Here at BUX, we believe in full ownership of the product we are building. As a team, we sit down and listen to the requirements of the business, solve the problem, code the solution, and make that solution available to our users. Last but not least, we support and build upon that solution after it’s already running in production.

Motivation

Part of running a piece of software in the wild, for real-life users, is making sure that it runs as expected. One or more people need to make sure that the software works as planned, and be there to help if any hiccups arise or if a user has trouble using the software.

Let’s call the person responsible for keeping the software in order the on-call engineer. This is the person who answers all questions related to the usability and stability of the software we are running, the person people call when there is an issue, and the person who should provide insight if something is not working as expected. It’s not always the same person, of course, and different people are responsible for different systems.

In a multi-team environment, wouldn’t it be great if everyone in the organization automatically knew, without much hassle, whom to ask a specific question?

Solution

We use Slack as our communication tool. It’s awesome. It’s fast, intuitive, and easy to use. Most of all, it’s where people are used to finding answers to their urgent matters. So what better way to offer each of our colleagues a quick and easy route to the best person to answer their question about some software we are working on?

Enter our Slack On-Call-Bot!

To build the On-Call-Bot, we will combine modern technologies to achieve ease of use, ease of development, fast and reliable deployment, and quick iteration when making changes.

To communicate with Slack, we use Slack’s API, in the form of slash commands. A slash command is a word that starts with a slash (/) and, in the background, calls a previously configured endpoint. Alongside the command, the user can type more text, which is sent to the endpoint and can be used to make a more specific query or to drive a decision.
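To make this concrete, here is a small sketch of what the endpoint receives. Slack delivers a slash command as an HTTP POST with a form-encoded body; the field names below follow Slack's slash-command documentation, while the command name /oncall, the values, and the helper parseSlashCommand are made up for illustration.

```javascript
// Example body of the POST request Slack sends for a slash command.
// The values (and the command name /oncall) are illustrative only.
const exampleBody =
  'token=example-token&command=%2Foncall&text=tomorrow&user_name=alice';

// Parse the raw form-encoded body into the command and the free text after it.
function parseSlashCommand(rawBody) {
  const params = new URLSearchParams(rawBody);
  return {
    token: params.get('token'),
    command: params.get('command'),
    text: (params.get('text') || '').trim(),
    user: params.get('user_name'),
  };
}

console.log(parseSlashCommand(exampleBody));
```

The text field is what lets a user narrow the query, for example by asking about a different day.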

To find out more about slash commands, or how to configure one yourself, check out the Slack API documentation, available here.

Credits: Slack API documentation

Next, we need to run a piece of code that receives the call from Slack and returns the desired result: in our case, the name of the on-call engineer.

For this purpose we will need 1) some logic, a piece of code that will return the answer to our question, and 2) a place to run our code.

1) Resolve — Who is the on-call engineer?

For this purpose, we use PagerDuty, a tool that monitors your services, notifies you of failures, and keeps track of who is responsible for what in your infrastructure. In this case, it is the source of truth for whom people should go to in a time of need.

So, the logic of our On-Call-Bot is pretty simple. People use the bot to ask who is on call, and the bot asks PagerDuty and relays the information back. This particular implementation runs on Node.js, with JavaScript as the language of choice. All the code for the On-Call-Bot is contained in one file, and the logic is broken down in the next couple of sections.

At the start, we import the additional libraries we need:

We also need to define some custom configuration, kept in the config.json file, containing the Slack security token, the PagerDuty API token, and our on-call schedule ID. The file will look something like this:
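The original file is not reproduced here. As a rough sketch, assuming the hypothetical key names SLACK_TOKEN, PAGERDUTY_TOKEN and SCHEDULE_ID (the real file may well use different ones), the configuration could be parsed and checked like this:

```javascript
// Hypothetical contents of config.json; key names and values are placeholders.
const exampleConfigJson = `{
  "SLACK_TOKEN": "your-slack-verification-token",
  "PAGERDUTY_TOKEN": "your-pagerduty-api-token",
  "SCHEDULE_ID": "your-schedule-id"
}`;

const REQUIRED_KEYS = ['SLACK_TOKEN', 'PAGERDUTY_TOKEN', 'SCHEDULE_ID'];

// Parse the configuration and fail fast if a required key is missing or empty.
function loadConfig(jsonText) {
  const config = JSON.parse(jsonText);
  const missing = REQUIRED_KEYS.filter((key) => !config[key]);
  if (missing.length > 0) {
    throw new Error(`Missing config keys: ${missing.join(', ')}`);
  }
  return config;
}

const config = loadConfig(exampleConfigJson);
console.log(Object.keys(config));
```

Failing at start-up on a missing key is a design choice of this sketch; it tends to be easier to debug than a failed request later on.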

Next, we check the integrity of the request coming in to our chat-bot, validating whether it really came from Slack.
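A minimal sketch of such a check, assuming the shared verification token that Slack includes in the token field of each slash-command request. The function name isValidSlackToken is hypothetical, and the original code may do this differently. Note that Slack has since recommended signed secrets over verification tokens, so treat this as an illustration of the idea rather than current best practice.

```javascript
const crypto = require('crypto');

// Compare the token Slack sent with the one stored in our configuration.
function isValidSlackToken(receivedToken, expectedToken) {
  if (typeof receivedToken !== 'string' || typeof expectedToken !== 'string') {
    return false;
  }
  const received = Buffer.from(receivedToken);
  const expected = Buffer.from(expectedToken);
  // timingSafeEqual throws if the lengths differ, so compare lengths first.
  return received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
}

console.log(isValidSlackToken('abc123', 'abc123'));
console.log(isValidSlackToken('wrong', 'abc123'));
```

A request that fails this check would typically be rejected before any call to PagerDuty is made.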

Next, we define the logic for requesting the needed data from PagerDuty: we decide which dates we are interested in, process the response, and return the relevant data in the Slack response.

There is a lot going on in this last snippet, so let’s break it down, line by line:
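As an illustration of the PagerDuty side, the sketch below builds a request against the on-calls listing of the PagerDuty REST API v2 and pulls the user names out of a response. This is an assumption about how the lookup could be done; the original code may query the schedule in another way. The function names, the schedule ID PABC123 and the sample response are all hypothetical.

```javascript
// Build options for a "list on-calls" request, filtered by schedule and dates.
// The returned object has the shape expected by Node's https.request().
function buildOnCallRequest(apiToken, scheduleId, since, until) {
  const query = new URLSearchParams();
  query.append('schedule_ids[]', scheduleId);
  query.append('since', since.toISOString());
  query.append('until', until.toISOString());
  return {
    hostname: 'api.pagerduty.com',
    path: `/oncalls?${query.toString()}`,
    method: 'GET',
    headers: {
      Accept: 'application/vnd.pagerduty+json;version=2',
      Authorization: `Token token=${apiToken}`,
    },
  };
}

// Collect the distinct user names from a parsed on-calls response body.
function extractOnCallNames(responseBody) {
  const oncalls = (responseBody && responseBody.oncalls) || [];
  const names = oncalls
    .map((entry) => entry.user && entry.user.summary)
    .filter(Boolean);
  return [...new Set(names)];
}

// Made-up response, trimmed to the fields this sketch reads.
const sampleResponse = { oncalls: [{ user: { summary: 'Alice Example' } }] };
console.log(extractOnCallNames(sampleResponse));
```

Keeping the request building and the response parsing as pure functions makes both easy to test without touching the network.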

Line 1 -> PagerDuty keeps the on-call data in entities called schedules. We define the ID of the schedule we are interested in.

Lines 3–22 -> We define some helper functions that compute the dates we are interested in and format them properly.

Lines 36–66 -> We accept the request from Slack, verify the Slack security token, parse the query and set up the data to perform the call to PagerDuty. We also check which time frame the user is interested in (defaulting to ‘today’).

Lines 68–83 -> We perform the request to PagerDuty and parse the response.

Lines 85–93 -> We return the answer to Slack, simply formatted as a one-line message. (See this document if a more complicated response is desired.)

Lines 94–99 -> We handle any errors that might have popped up.
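Two of the steps above, resolving the time frame and formatting the reply, can be sketched as follows. This is not the original code: the function names are hypothetical, the only alternative to the default ‘today’ handled here is an assumed ‘tomorrow’ keyword, and days are computed in UTC for simplicity.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Turn the optional text after the command into a one-day range in UTC.
// Anything other than 'tomorrow' falls back to 'today'.
function resolveTimeFrame(text, now = new Date()) {
  const keyword = (text || '').trim().toLowerCase();
  const offsetDays = keyword === 'tomorrow' ? 1 : 0;
  const since = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + offsetDays));
  const until = new Date(since.getTime() + DAY_MS);
  return { label: offsetDays === 1 ? 'tomorrow' : 'today', since, until };
}

// Build the one-line Slack reply; 'in_channel' makes it visible to the channel.
function formatSlackResponse(names, label) {
  const text = names.length > 0
    ? `On call ${label}: ${names.join(', ')}`
    : `Nobody is on call ${label}.`;
  return { response_type: 'in_channel', text };
}

const frame = resolveTimeFrame('tomorrow', new Date('2020-01-31T15:00:00Z'));
console.log(frame.since.toISOString(), frame.until.toISOString());
console.log(formatSlackResponse(['Alice Example'], frame.label));
```

Using Date.UTC with a day offset lets the date arithmetic roll over month boundaries without special cases.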

2) How/where do we run the code?

For this purpose, we need to think about the architecture of our chat-bot system. On the one hand, Slack needs an endpoint to call every time one of our colleagues executes the command. On the other hand, we only have a short, simple piece of code that asks the PagerDuty API for some data, parses that data, and returns it. We would like to develop, deploy, run, and maintain this piece of code in a modern way: writing the logic in whichever language we choose, deploying without a long wait, and running it with a simple command. A good choice satisfying all of the above presents itself in the form of Google Cloud Functions.

Google Cloud Functions offers us a way to run a short piece of code: namely, a handler for the API call that comes in from Slack when our chat-bot’s help is required. We can deploy a function quickly and easily, it’s highly available and fault-tolerant, and it is event-driven, which is exactly what our current use case requires. Assuming we have a GCP account and some basic project setup, to deploy our function we first need to enable the Cloud Functions API in Google Cloud, following the instructions outlined here. Then we can deploy our bot with the first command in the following file:

and delete it with the second.
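For orientation, the shape of the deployed handler can be sketched as below. An HTTP-triggered Cloud Function in Node.js receives Express-style request and response objects, and form-encoded bodies arrive already parsed in req.body. The export name onCallBot and the placeholder reply are assumptions; a real handler would verify the token and query PagerDuty before answering.

```javascript
// Skeleton of an HTTP-triggered Cloud Function (hypothetical name: onCallBot).
const onCallBot = (req, res) => {
  // Slack sends slash commands as POST requests; reject anything else.
  if (req.method !== 'POST') {
    res.status(405).send('Only POST requests are accepted');
    return;
  }
  // The text after the slash command, defaulting to 'today'.
  const text = (req.body && req.body.text) || 'today';
  // Placeholder reply; the real bot would answer with the on-call engineer.
  res.json({ response_type: 'in_channel', text: `You asked about: ${text}` });
};

// Cloud Functions looks up the handler by its exported name.
exports.onCallBot = onCallBot;
```

The exported name is what the deploy command refers to, and the resulting HTTPS URL is what Slack is configured to call.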

With the architecture in place, all that remains is to configure our command in Slack to call the correct endpoint (the Request URL input field, as shown in the second picture in this blog post) when the chat-bot (app) is added to the Slack workspace.

The result is that everyone can now reach the person that can help them with their questions.

Invoking the On-Call-Bot
Getting an answer back.

For a more detailed overview of the architecture, see: https://cloud.google.com/functions/docs/tutorials/slack

For a more detailed overview of how to set up a Slash command, see: https://api.slack.com/tutorials/your-first-slash-command

For a more detailed overview of how to use the PagerDuty API, see: https://v2.developer.pagerduty.com/docs/getting-started

And if you would like to learn more about the things we are building here at BUX, head over to our open positions, and contact me if you have questions about anything at all.

Gorjan Zajkovski
Software Engineer. Passionate about technology, personal growth, music and football.