Building A Serverless Slack Bot

Pocket Gems Tech Blog
Jun 30, 2022

By Shimul Bhowmik and Karthik Golla

Pocket Gems Data Infrastructure Team

During the pandemic, when our data infrastructure (DI) team was working in different time zones, it was difficult to effectively utilize the engineering resources during on-call. Pre-pandemic, people used to report issues via email, Slack message, or in person. This made it difficult to track new requests, deliver proper updates, or provide visibility along the way.

We wanted a streamlined system that lets Gemmers report any incident and enables our engineers to act on incidents on time. So we came up with the idea of a Slack bot, the DI Incident Bot, which uses Slack’s interactive commands and Slack APIs to let Gemmers report incidents efficiently. The DI Incident Bot also integrates with Jira and PagerDuty to track and update the status of each incident.

In this post, I will explain the architecture we designed to help you figure out how to build a similar serverless system.

Why a Slack Bot?

We can use a Slack bot to automate workflows and communicate with external servers, extending its capabilities beyond simple user interactions. Building a Slack bot is easy: the APIs are intuitive, and the interactive components offer a rich user experience. We only need a few lines of code to handle the interactions and a place to deploy the code. In our case, we chose Google Cloud Functions (GCF) as the serverless backend system (AWS Lambda is also a good option).

High-Level Design

Component Diagram

Implementation

Let’s go step-by-step on how we designed our system. This should help you get an idea of the potential of serverless designs.

Step 1: Creating the user entry point

Slack is the main communication tool for Pocket Gems and was the natural choice for our entry point.

Slash Commands allow users to invoke your app using a command (for example, /leave to leave the current channel). You can configure a URL to which Slack will send the command payload. We used our GCF endpoint, where we extract the user ID and the message sent with the command (no message in our case, since we used the interactive system discussed later).
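To make the payload concrete, here is a minimal sketch of decoding it. Slack sends slash command data as a form-encoded body with fields such as command, user_id, text, and response_url. The /di-incident command name, the user ID, and the example URL are placeholders we made up for illustration.

```python
from urllib.parse import parse_qs


def parse_slash_command(body: str) -> dict:
    """Decode the form-encoded body Slack sends for a slash command."""
    fields = parse_qs(body, keep_blank_values=True)
    # parse_qs maps each key to a list; slash command fields appear once.
    flat = {key: values[0] for key, values in fields.items()}
    return {
        "command": flat.get("command", ""),
        "user_id": flat.get("user_id", ""),
        "text": flat.get("text", ""),  # empty when the user types only the command
        "response_url": flat.get("response_url", ""),
    }


# Hypothetical payload, as if a user had typed "/di-incident" with no message.
example_body = (
    "command=%2Fdi-incident&user_id=U123ABC&text="
    "&response_url=https%3A%2F%2Fexample.invalid%2Fhook"
)
parsed = parse_slash_command(example_body)
```

In a deployed function you would normally read these fields from the request object rather than parse the raw body yourself; the sketch only shows what the payload looks like.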

You should also configure the permissions for your app: add the scopes it needs (for example, usergroups:read and users:read) and generate an OAuth token. You can see all the scopes here.
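As a rough illustration of what a scope unlocks, the sketch below builds a call to Slack's users.info Web API method, which requires the users:read scope. The token and user ID are placeholders, and the helper names are ours, not part of any Slack SDK.

```python
import json
import urllib.parse
import urllib.request

SLACK_API_BASE = "https://slack.com/api"


def build_users_info_request(bot_token: str, user_id: str) -> urllib.request.Request:
    """Build (but do not send) a users.info request authorized with the OAuth token."""
    query = urllib.parse.urlencode({"user": user_id})
    return urllib.request.Request(
        f"{SLACK_API_BASE}/users.info?{query}",
        headers={"Authorization": f"Bearer {bot_token}"},
    )


def fetch_user_profile(bot_token: str, user_id: str) -> dict:
    """Send the request and return Slack's decoded JSON response."""
    request = build_users_info_request(bot_token, user_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values; a real token comes from your app's OAuth settings.
example_request = build_users_info_request("xoxb-placeholder-token", "U123ABC")
```

If the token lacks the users:read scope, Slack answers with "ok": false and an error field instead of the user profile, so it is worth checking the "ok" flag before using the result.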

Step 2: Setting up the serverless endpoints

Ok, now that we have a Slack Command set up with a placeholder endpoint, we need to create an endpoint to make it functional. In each Google Cloud Function (GCF), you can specify the entry point method called for an incoming request. You get a ‘request’ argument in your GCF that contains data you collected using your Slack Command. You can go through the Google documentation on writing and deploying a Cloud Function.

We used one function as the entry point for all our Slack interactions in the sample below.

Sample code we deployed to GCF
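The original sample is not reproduced in this text. As a stand-in, here is a minimal sketch of how a single entry point could route Slack traffic, assuming the Python runtime, where the function receives a Flask request object. Slash commands arrive with a command form field, while interactive components arrive with a JSON string in a payload form field. The function names, the /di-incident command, and the reply text are our own illustrations, not the code we deployed.

```python
import json
import logging

JSON_HEADERS = {"Content-Type": "application/json"}


def handle_slash_command(form) -> dict:
    """Build an ephemeral reply that only the invoking user sees."""
    return {
        "response_type": "ephemeral",
        "text": f"Hi <@{form['user_id']}>, let's file an incident.",
    }


def handle_interaction(interaction: dict) -> None:
    """Placeholder for the follow-up logic; here we only log the event."""
    logging.info("%s from %s", interaction["type"], interaction["user"]["id"])


def slack_entry_point(request):
    """Single HTTP entry point for slash commands and interactive components."""
    form = request.form
    if "payload" in form:
        handle_interaction(json.loads(form["payload"]))
        return ("", 200, {})  # empty 200 acknowledges the interaction
    if "command" in form:
        return (json.dumps(handle_slash_command(form)), 200, JSON_HEADERS)
    return (json.dumps({"error": "unsupported request"}), 400, JSON_HEADERS)


class FakeRequest:
    """Local stand-in for the Flask request, for trying the handler without deploying."""

    def __init__(self, form):
        self.form = form


body, status, _headers = slack_entry_point(
    FakeRequest({"command": "/di-incident", "user_id": "U123ABC"})
)
```

Keeping one function as the router means there is only one URL to configure in Slack; the trade-off is that every new interaction type adds a branch here.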

Step 3: Interactive interface for users

A bot isn’t helpful if it only takes one command, right? We wanted to gather more information and do some heuristic analysis to assess the severity of the issue. For this to happen, we needed two things:

  • User-facing interactive components
  • A way for our GCF to communicate with the user

Slack Incoming Webhooks and interactive components come to the rescue!

With the Webhooks, you can send data back to Slack from GCF. Interactive components then help you create a UI to show in Slack (think HTML + CSS combo). Using a mixture of these, we created a workflow where we ask Gemmers to answer some questions, and GCF analyzes the answers, sends follow-up questions, and figures out the issue priority. Finally, the bot posts a summary of the issue, its priority, and the SLA to our on-call channel.
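The workflow above can be sketched roughly as follows. The questions, the priority rules, and the SLA hours are invented for illustration and are not our real heuristics; the message shape (a text fallback plus a section block with mrkdwn text) follows Slack's block format, and the webhook URL would come from your own app configuration.

```python
import json
import urllib.request

# Hypothetical SLA targets, in hours, per priority.
SLA_HOURS = {"P1": 1, "P2": 8, "P3": 24}


def assess_priority(answers: dict) -> str:
    """Toy heuristic: escalate when production data is affected."""
    if answers.get("production_data_affected") and answers.get("no_workaround"):
        return "P1"
    if answers.get("production_data_affected"):
        return "P2"
    return "P3"


def build_summary_message(reporter_id: str, summary: str, priority: str) -> dict:
    """Build the message posted to the on-call channel."""
    text = (
        f"*{priority}* incident reported by <@{reporter_id}>: "
        f"{summary} (SLA: {SLA_HOURS[priority]}h)"
    )
    return {
        "text": text,  # fallback shown in notifications
        "blocks": [{"type": "section", "text": {"type": "mrkdwn", "text": text}}],
    }


def post_to_webhook(webhook_url: str, message: dict) -> int:
    """POST the message to a Slack Incoming Webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


answers = {"production_data_affected": True, "no_workaround": True}
priority = assess_priority(answers)
message = build_summary_message("U123ABC", "Daily export is failing", priority)
```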

Sample output after our system decides on the priority

Step 4: Alerts !!!

Now that we have assigned a priority to the issue, we need to alert our engineers. We use PagerDuty for incident reporting at Pocket Gems. PagerDuty supports webhooks to open incidents. To call the service programmatically, you need to create an API key and use it for simple API calls (we don’t recommend storing API keys in code, but that discussion is out of scope for this post). You can check out the official PagerDuty API documentation here.

Sample code to open a PagerDuty incident
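The original sample is not reproduced in this text. As a stand-in, here is a minimal sketch that triggers an alert through PagerDuty's Events API v2, which is one way to open an incident and may differ from the call we actually used. It assumes an integration (routing) key read from an environment variable; the variable name, the source label, and the priority-to-severity mapping are our own placeholders.

```python
import json
import os
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Hypothetical mapping from the bot's priority to an Events API severity.
SEVERITY_BY_PRIORITY = {"P1": "critical", "P2": "error", "P3": "warning"}


def build_trigger_event(routing_key, summary, priority, source="di-incident-bot"):
    """Build the JSON body for a 'trigger' event."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": SEVERITY_BY_PRIORITY[priority],
        },
    }


def send_event(event: dict) -> dict:
    """POST the event to PagerDuty and return the decoded response."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def open_incident(summary: str, priority: str) -> dict:
    """Read the key from the environment (not from code) and trigger the alert."""
    routing_key = os.environ["PAGERDUTY_ROUTING_KEY"]  # assumed variable name
    return send_event(build_trigger_event(routing_key, summary, priority))


example_event = build_trigger_event("placeholder-key", "Daily export is failing", "P1")
```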

Step 5: Tickets please

We also wanted to integrate our ticketing system with the bot. We use Jira alongside our CI/CD system and in commit messages (for future reference). When building a bot, who would want to create and manage tickets manually?

Fortunately, Jira provides a developer API (read more about the Jira APIs here). Jira webhooks also let you sync in both directions. We connected our GCF to Jira. Now, as soon as we have enough information for the ticket, our bot can open a Jira ticket auto-magically.
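A minimal sketch of opening a ticket from the function, assuming Jira Cloud's REST API v2 issue endpoint with basic authentication (account email plus API token). The project key, issue type, and credentials are placeholders; your Jira instance may require different fields.

```python
import base64
import json
import urllib.request


def build_issue(project_key, summary, description, issue_type="Task"):
    """Build the JSON body for creating an issue."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": summary,
            "description": description,
            "issuetype": {"name": issue_type},
        }
    }


def basic_auth_header(email: str, api_token: str) -> str:
    """Encode 'email:token' for an HTTP Basic Authorization header."""
    raw = f"{email}:{api_token}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def create_issue(base_url, email, api_token, issue) -> str:
    """POST the issue to Jira and return the new ticket key from the response."""
    request = urllib.request.Request(
        f"{base_url}/rest/api/2/issue",
        data=json.dumps(issue).encode("utf-8"),
        headers={
            "Authorization": basic_auth_header(email, api_token),
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["key"]


# "DI" is a hypothetical project key.
example_issue = build_issue("DI", "Daily export is failing", "Reported via Slack, priority P1")
```

The returned ticket key can then be included in the Slack summary and the PagerDuty alert so that all three tools point at the same incident.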

This way, we connected Jira, Slack, and PagerDuty via GCF. Our engineers can now use any of those tools, and the bot will update the others. We didn’t have to worry about updating our customers or losing track of work. Happy Gemmers everywhere :)

Conclusion

The Slack Bot helped us improve our incident response system. We utilized several tools and took advantage of them without worrying about redundant work or communication issues. Oh, did I mention that we achieved all of these without worrying about maintaining a server? Pretty amazing, huh?!

Hopefully, this makes you all excited to go and try building your own serverless systems in the wild.

The Pocket Gems central infrastructure team is continuously innovating and exploring technologies to solve our problems. If you are interested in technical challenges, including distributed systems, performance, and developing microservices, you would be the right fit for us. We are actively hiring candidates like you.
