Building Heroku ChatOps for Slack

The development and design decisions behind Heroku’s new Slack app

Stella Cotton
Jul 25, 2017 · 7 min read

Teams have been using chatbots to manage repetitive tasks for decades, from Eggdrop bots on IRC, to Hubot on Campfire, Slack, and HipChat. At Heroku, we wanted a way to release and manage our applications from a conversational interface, so we began developing an internal ChatOps tool.

Heroku ChatOps brings the operational processes that happen behind the scenes, on a single engineer’s laptop, to the forefront in a collaborative interface. Teams use shared tooling to produce a transparent workflow. When Heroku migrated to Slack last year, an opportunity arose to share this product with Heroku customers.

With ChatOps, we’ve streamlined the tasks of releasing your pipeline-based applications, viewing their latest releases, and receiving CI notifications, all within Slack.

Pipeline notifications and deployment from Slack

As the engineers who worked on this app, we feel the Slack platform let us build things we never could have built without a modern messaging system. In this article, we’d like to give you a peek into how we extended the Heroku platform by creating a Slack application. We’ll cover why we chose to use Slack’s slash commands, give an architectural overview of the application, and walk through some of the security considerations and error handling we encountered when building our app.

Slash commands

We configured an endpoint in the Slack UI that points to our Heroku ChatOps Rails application. When someone types a Heroku ChatOps command, like /h promote my-pipeline, Slack will POST the command they submitted, along with additional metadata, to a callback URL in our application. Authorization to this callback URL is handled by a shared secret between Slack and our ChatOps application, which lets us validate that the user, command text, and team are coming from Slack and not a malicious third party.
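As a rough illustration of that check, here is a minimal plain-Ruby sketch. It assumes the shared secret arrives as the token field of Slack's slash command POST. The method names (secure_equal?, verified_slash_command) are hypothetical and are not taken from the Heroku ChatOps codebase.

```ruby
require "digest"

# Compare two strings without short-circuiting on the first differing byte.
# Hashing both sides first gives equal-length inputs for the XOR loop.
def secure_equal?(a, b)
  da = Digest::SHA256.digest(a.to_s)
  db = Digest::SHA256.digest(b.to_s)
  diff = 0
  da.bytes.zip(db.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Returns the fields we care about from the POST body,
# or nil when the shared secret does not match.
def verified_slash_command(params, expected_token)
  return nil unless secure_equal?(params["token"], expected_token)

  {
    team_id: params["team_id"],
    user_id: params["user_id"],
    command: params["command"],
    text: params["text"],
    response_url: params["response_url"]
  }
end
```

A request with a missing or incorrect token yields nil, so the caller can reject it before any command parsing happens.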

Multi-OAuth

Instead of building our own access control inside the ChatOps Slack app to manage user capabilities, we prompt people in Slack to authenticate directly with both GitHub and Heroku. We can store tokens and later act on that user’s behalf so that any action is linked back to them. These tokens are encrypted at rest with RbNaCl and refreshed every six hours.

A user’s access to Heroku Pipelines and GitHub repos is managed through the Heroku Dashboard and GitHub, respectively. This way, if a user leaves their company and their access to Heroku is revoked, that change is immediately reflected in Heroku ChatOps – and that person will no longer be able to deploy their company’s application from Slack.

Command processing flow

Slack gives us only 3 seconds to acknowledge their initial POST, so we enqueue a background job to handle the command and immediately send back an HTTP 200 OK response to let Slack know that we received their message. This acknowledgement isn’t to be confused with the response we send to Slack that is displayed to the end user; that will come later.

The job we enqueued earlier is then tasked with parsing, and ultimately executing, the user’s command.

Command processing architecture
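The acknowledge-first flow can be sketched in a few lines of plain Ruby. This is only an illustration: an in-memory Queue stands in for the real job backend, and handle_slash_command and process_next_job are hypothetical names.

```ruby
# In-memory stand-in for the background job queue (an assumption for this sketch).
JOBS = Queue.new

# Request side: enqueue the work, then acknowledge right away with a
# Rack-style [status, headers, body] triple so Slack's deadline is met.
def handle_slash_command(params)
  JOBS << { text: params["text"], response_url: params["response_url"] }
  [200, { "Content-Type" => "text/plain" }, [""]]
end

# Worker side: runs outside the request cycle and does the slow part,
# starting with splitting the command text into a task and its arguments.
def process_next_job(queue = JOBS)
  job = queue.pop
  task, *args = job[:text].to_s.split
  { task: task, args: args, response_url: job[:response_url] }
end
```

The point of the split is that nothing slow (parsing, API calls, posting the reply) happens before the 200 is returned.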

Pipeline notifications

An example of a pipeline notification in Slack

We were able to build on existing functionality at Heroku, which already receives relevant GitHub webhooks for Heroku users’ pipelines. We receive messages over a Kafka-powered stream that we can use to match up your events and route them to the correct channel based on the pipeline.

At installation, we provision a chat:write:bot token which allows us to write to your Slack channels without any initiation from a team member.

Pipeline notification architecture
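To make the routing step concrete, here is a small sketch of matching an event to channels by pipeline. It leaves out the Kafka consumer and the Slack API call entirely. The NotificationRouter class and the event fields (pipeline_id, app, status) are assumptions made for illustration.

```ruby
# Maps pipelines to the Slack channels that asked for their notifications.
class NotificationRouter
  def initialize
    @channels = Hash.new { |hash, key| hash[key] = [] }
  end

  # Remember that a channel wants notifications for a pipeline (idempotent).
  def subscribe(pipeline_id, channel_id)
    subscribers = @channels[pipeline_id]
    subscribers << channel_id unless subscribers.include?(channel_id)
  end

  # Turn one incoming event into one outgoing message per subscribed channel.
  def route(event)
    @channels.fetch(event[:pipeline_id], []).map do |channel_id|
      { channel: channel_id, text: "#{event[:app]}: #{event[:status]}" }
    end
  end
end
```

Events for pipelines that nobody subscribed to produce an empty list and are simply dropped.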

Must love regex

Heroku ChatOps uses some complicated regex to distinguish between different kinds of user input. Our most intricate regular expressions are over 100 characters long. We keep these under control in a few different ways.

One way is through the single responsibility principle: we treat each command individually. First, we separate out the command itself (e.g. promote) and route it to a unique parser class. The parser’s sole job is to separate the rest of the command into individual pieces.

Breaking it down this way lets us easily write unit tests to ensure different kinds of inputs do what the user expects, like /h promote my-pipeline, /h promote my-pipeline from staging to production, and even /h promote this is an invalid command.
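The routing step described above might look roughly like the following. The class names and the releases command are hypothetical, and the parsers are deliberately trivial; the real ones use the regular expressions discussed next.

```ruby
# Each parser owns exactly one command and only splits it into pieces.
class PromoteParser
  def self.parse(text)
    _task, pipeline = text.split
    { task: "promote", pipeline: pipeline }
  end
end

class ReleasesParser
  def self.parse(text)
    _task, pipeline = text.split
    { task: "releases", pipeline: pipeline }
  end
end

# Lookup table from the leading command word to its parser class.
PARSERS = {
  "promote" => PromoteParser,
  "releases" => ReleasesParser
}.freeze

# Pick the parser from the leading word, then hand it the whole command.
def route_command(text)
  command = text.to_s.strip
  parser = PARSERS[command[/\A[a-z]+/]]
  return { error: "unknown command" } unless parser

  parser.parse(command)
end
```

Because each parser is a small class with a single entry point, a unit test only needs to pass it a string and compare the resulting hash.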

Instead of one giant regex string, we build our regex as a list of strings that are joined together at the end. We extract portions of the regex into helper methods with descriptive names, and describe each one with a comment. Ruby’s Regexp library allows us to capture pieces of the input.

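# valid_pipeline and valid_slug are helper methods that return regex fragments.
# Backslashes are doubled because "\s" in a double-quoted Ruby string is a literal space.
pattern = [
  "(promote)",                    # task
  "(!)?\\s+",                     # forced?
  valid_pipeline,                 # pipeline name
  "\\s*",                         # optional space
  "(?:from\\s*#{valid_slug})?",   # optional stage
  "(?:",
  "\\s*(?:to|in|on)\\s+",         # to
  "#{valid_slug}?",               # optional downstream
  ")?"
]
matcher = Regexp.new(pattern.join)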
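To show how the captures come out, here is a self-contained variant of the pattern above. It makes three assumptions that are not from the original code: the helper definitions (a simple lowercase slug), the added \A and \z anchors so that trailing junk is rejected, and the parse_promote wrapper.

```ruby
# Assumed helper: lowercase letters, digits, and dashes, captured as a group.
def valid_slug
  "([a-z0-9][a-z0-9-]*)"
end

# Assumed helper: pipelines follow the same naming rule as slugs here.
def valid_pipeline
  valid_slug
end

PROMOTE_PATTERN = Regexp.new([
  "\\A",
  "(promote)",                    # task
  "(!)?\\s+",                     # forced?
  valid_pipeline,                 # pipeline name
  "\\s*",                         # optional space
  "(?:from\\s*#{valid_slug})?",   # optional stage
  "(?:",
  "\\s*(?:to|in|on)\\s+",         # to
  "#{valid_slug}?",               # optional downstream
  ")?",
  "\\s*\\z"
].join)

# Returns a hash of the captured pieces, or nil for input that does not match.
def parse_promote(text)
  match = PROMOTE_PATTERN.match(text)
  return nil unless match

  task, forced, pipeline, from, to = match.captures
  { task: task, forced: !forced.nil?, pipeline: pipeline, from: from, to: to }
end
```

With these assumptions, promote my-pipeline from staging to production yields the pipeline plus both stages, while promote this is an invalid command returns nil.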

But the best user input is one that doesn’t involve a complicated regex. Slack’s interactive messages let you generate buttons and drop-downs in Slack, so that your users can kick off interaction using a slash command, and are presented with a fixed set of options in return. While Heroku ChatOps is still very CLI-like, we’re exploring more ways to incorporate interactive messages into the app flow to improve the user experience.

Now that we’ve parsed the command and made any relevant API calls, we build up a formatted response to send back to the user in Slack. We then post back to a response URL that Slack provided with their initial POST to our application.
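A minimal sketch of that last step, using only Ruby's standard library, is shown below. The response_type and text fields are part of Slack's slash command response format; the helper names and the visible_to_channel flag are hypothetical.

```ruby
require "json"
require "net/http"
require "uri"

# Build the message body. "ephemeral" is shown only to the invoking user,
# "in_channel" is visible to everyone in the channel.
def build_response(text, visible_to_channel: false)
  {
    response_type: visible_to_channel ? "in_channel" : "ephemeral",
    text: text
  }
end

# POST the payload as JSON to the response URL Slack sent with the command.
def post_to_response_url(response_url, payload)
  uri = URI.parse(response_url)
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(payload)

  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
end
```

Keeping the payload builder separate from the HTTP call makes the formatting logic testable without touching the network.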

In order to cut down on some of the noise these notifications create, we’ve implemented Slack threads in our messages. This means we do our best to group similar messages together, such as deploying, releasing, and restarting messages.

Using threads to contain related notifications
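One way to sketch the grouping logic: remember the timestamp of the first message for a given group, and attach it as thread_ts to later messages. thread_ts is the parameter Slack's chat.postMessage uses for threaded replies. The ThreadTracker class and the group key are assumptions for illustration.

```ruby
# Tracks the parent message for each group of related notifications.
class ThreadTracker
  def initialize
    @parents = {}
  end

  # Build chat.postMessage-style arguments. Later messages in a group are
  # threaded under the first one by setting thread_ts.
  def message_for(group_key, channel, text)
    payload = { channel: channel, text: text }
    parent_ts = @parents[group_key]
    payload[:thread_ts] = parent_ts if parent_ts
    payload
  end

  # Store the timestamp Slack returned for the first message in a group.
  def record_parent(group_key, ts)
    @parents[group_key] ||= ts
  end
end
```

The first message for a group is posted normally. Once its timestamp is recorded, everything else with the same key lands in its thread.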

“We’re sorry, something went wrong!”

When Heroku ChatOps encounters a runtime exception, we rescue it and send it to our on-premise installation of Sentry along with the team, user, and command id to help us debug at a later time. This approach helps us quickly identify why we’re receiving errors, and whether they’re widespread or localized to a specific team’s setup. We return a custom-formatted error to our user so they have some feedback about why things did not go as planned.
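The rescue-and-report pattern can be sketched as follows. The reporter argument is a hypothetical callable standing in for the real error tracker client, and run_command is an illustrative name.

```ruby
# Run a command block. On failure, report the exception together with the
# context (team, user, command id) and return a friendly error instead.
def run_command(context, reporter:)
  { ok: true, text: yield }
rescue StandardError => e
  reporter.call(e, context)
  { ok: false, text: "We're sorry, something went wrong!" }
end
```

Passing the context alongside the exception is what makes it possible to tell later whether a failure affects one team or many.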

We also make use of Sidekiq’s built-in retries. Sometimes we’ll see 503 and 504 responses from the various APIs we hit. These API requests are retried with exponential backoff and logged to Sentry, so we can keep an eye on how widespread an API problem might be.
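Sidekiq handles retries for us, so the following is not its implementation or its delay schedule. It is only a plain-Ruby sketch of the general idea of retrying a transient failure with exponentially growing delays. TransientApiError, with_retries, and the injectable sleeper are all hypothetical.

```ruby
# Raised by API wrappers for responses worth retrying, such as 503 or 504.
class TransientApiError < StandardError; end

# Retry the block on transient errors, doubling the delay after each failure.
# The sleeper is injectable so tests do not have to wait.
def with_retries(max_attempts: 5, base_delay: 1.0, sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue TransientApiError
    raise if attempt >= max_attempts

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end
```

Once max_attempts is reached the error is re-raised, which is the point where it would be reported rather than silently dropped.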

The future is bright

We were able to build our application with a familiar Rails/PostgreSQL/Sidekiq stack. And we’re in a much better place to test, verify, and maintain Heroku ChatOps than we would have been if we had tried to build for a range of platforms.

There’s also an opportunity for us to create more sophisticated workflows later, because everything is built around Heroku Pipelines.

If you’d like to check out Heroku ChatOps, visit the Heroku DevCenter to learn more and install the app.


This blog post was co-written by the Heroku Tools Team: Corey Donohoe, Reid McFarland, Stella Cotton, Thomas Balthazar, & Yannick Schutz

Slack Platform Blog
