As you progress in your career as a developer, there is one thing that I think all senior developers can relate to: the more experience you gain and the higher up the ladder you climb, the less time you have to just write code. Being a Staff Engineer means you work on an entirely different schedule and on entirely different tasks. Just check out the awesome site staffeng.com, especially the article on Staff archetypes.
I spend way less time coding now than when I started working, but come Hackdays you can’t even pry me away from my keyboard, hammering away code. At Schibsted we take Hackdays seriously. We have company-wide Hackdays twice a year and even combine them into big Product & Tech Festivals with well known external speakers, hours of hacking and a huge demo to round it off.
My absolute favorite things to build during Hackdays are Slack bots! They tend to be rather small, so you can finish within 24 hours; they solve a specific problem with a simple solution; and they are great to demo live in Slack. A couple of Hackdays ago I set out to find a quick way to see who is on-call for a certain team and get their phone number. We use PagerDuty for on-call rotation, scheduling and alerting, so I went about building a Slack bot with Serverless Framework that fetches data from the PagerDuty API and writes back who’s on-call in the Slack channel.
Making your bot talk
First off, let’s make sure we have communication up and running by having the bot react to a message and respond with a simple “Hej Världen” (“Hello World” in Swedish).
Start by creating your project and installing the dependencies.
$ npm init
$ npm i -D serverless
$ npm i @slack/web-api node-pagerduty
I like to have serverless as a dev dependency in the project rather than a global package and then execute it using npx. It makes it a lot easier to maintain and make sure everyone is using the same version while allowing you to easily use other versions in other projects.
Serverless Framework uses a file called serverless.yml to configure the cloud provider, all Lambda functions and other resources that will be created in the deployment. Let’s add one with some basic settings and a handler function that accepts POST.
Afterwards we can create a simple handler file that logs the event and just responds with HTTP 200.
The challenge code in the handler is needed to confirm the setup with Slack when creating the event subscription for the bot. The first request from Slack contains a challenge value that you need to send back in the response to confirm the subscription.
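A handler along these lines could be sketched as follows. It assumes the default LAMBDA_PROXY integration, where the body arrives as a string, and the file name and the `handler` export are only illustrative.

```javascript
// handler.js: minimal sketch of a Slack event endpoint (proxy integration).
// With LAMBDA_PROXY, event.body arrives as a JSON string.
const handler = async (event) => {
  console.log(JSON.stringify(event));

  const body = JSON.parse(event.body || '{}');

  // Slack's first request has type "url_verification" and carries a
  // challenge value that must be echoed back to confirm the subscription.
  if (body.type === 'url_verification') {
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'text/plain' },
      body: body.challenge,
    };
  }

  // Everything else is simply acknowledged for now.
  return { statusCode: 200, body: '' };
};

module.exports = { handler };
```

Any request that is not a challenge just gets an empty HTTP 200, which is enough for Slack to consider the event delivered.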
Let’s deploy this first version with npx and serverless to create the needed resources and get the API Gateway URL for our Lambda function.
$ npx sls deploy
Create the Slack app at https://api.slack.com/apps/new, head over to Event Subscriptions and enable that feature. You need to enter the URL of your bot, which you can find in the serverless deployment output, and you should also subscribe to the bot events “app_mention” and “message.im”.
Once properly configured it will look like this
Head over to “OAuth & Permissions” and add “chat:write” under Bot Token Scopes. Then go to “App Home” in the left-hand menu and enable “Always show my bot as online”, “Message tab” and “Allow users to send Slash commands and messages from the messages tab” to support direct messages with the app and not just channel messages. Now you’re ready to add the app to Slack by going to “Install app” and installing it to your workspace. You’ll get a Bot User OAuth token back. Store it somewhere safe, as we’ll need it later.
Currently the app doesn’t do much, so let’s make it react and respond. One thing I learned when setting this up is that you need to respond to the Slack request within 3 seconds, otherwise the user gets an error message. Given Lambda cold starts and the time it takes to call the PagerDuty and Slack APIs, that error shows up quite often. This is easily fixed by setting “async: true” on the function’s http event in the serverless.yml file, which makes the AWS API Gateway respond with HTTP 200 immediately upon request.
Async invocation does not play well with the Slack challenge, because the challenge has to be answered synchronously, so you should only enable async invocation after the challenge has been completed. Another implication of switching to async invocation is that the AWS API Gateway integration changes from LAMBDA_PROXY to LAMBDA, and the Serverless Framework adds request templates that automatically parse the body as JSON and change the structure of the event. This has consequences later when verifying signatures from Slack.
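To make the change in event structure concrete, here is a small sketch of a helper that returns the Slack payload as an object under either integration type. The helper name `getSlackPayload` is made up for illustration.

```javascript
// Return the Slack payload as an object regardless of integration type.
// LAMBDA_PROXY: event.body is a JSON string that still needs parsing.
// LAMBDA (async: true): the request template has already parsed the body.
const getSlackPayload = (event) =>
  (typeof event.body === 'string' ? JSON.parse(event.body) : event.body);

module.exports = { getSlackPayload };
```

Once async invocation is enabled for good, the string branch is no longer needed and the handler can read event.body directly.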
After removing the JSON parsing and adding a request back to Slack, we end up with this
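A handler of that shape might look like the sketch below. It assumes the Slack client is passed in rather than created inside the function, and that messages posted by bots can be recognized by a `bot_id` field on the event. The `createHandler` name is only illustrative.

```javascript
// Build the handler around an injected Slack client so it can be faked in
// tests. `slack` is expected to look like a WebClient from @slack/web-api.
const createHandler = (slack) => async (event) => {
  // With async invocation the body is already an object, so no JSON.parse.
  const slackEvent = event.body && event.body.event;
  if (!slackEvent) return;

  // Messages posted by bots (including this one) carry a bot_id. Skipping
  // them keeps the bot from replying to its own reply forever.
  if (slackEvent.bot_id) return;

  await slack.chat.postMessage({
    channel: slackEvent.channel,
    text: 'Hej Världen',
  });
};

module.exports = { createHandler };
```

In the deployed function you would pass in a real client, for example `createHandler(new WebClient(process.env.SLACK_BOT_TOKEN))`, where the environment variable name is an assumption and holds the Bot User OAuth token from earlier.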
One thing you might have noticed is that when writing to the bot in IM, it goes into a never-ending loop, reacting to its own message. The snippet contains an if-statement to ignore messages from itself. Try it out! It looks like this
Securing your integration
The last thing you want is to leak data or open yourself up to being spammed by well-crafted malicious requests. Slack provides two ways of securing your integration: the first is the Verification Token, which is deprecated, and the second is verifying the signature using the signing secret. Let’s go with the supported, non-deprecated method.
So how do you go about verifying the signature? Slack has a great guide on how to verify the signature, but it is a little light on actual code in libraries. There is an implementation deep within the @slack/events-api package, but it requires a server setup to run. We can just make our own.
To verify the signature you need the headers “X-Slack-Signature” and “X-Slack-Request-Timestamp” and the body as a string. However, because async invocation means the body has already been JSON parsed by API Gateway, you need to JSON.stringify() the body before providing it to this method. That works initially, until someone™ decides to send a message including a Unicode character. Suddenly the signatures do not match. The only way to fix this is to use the actual raw body from the request, which means you need to customize the request template.
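A verification function could be sketched like this, following Slack’s documented scheme: sign the string “v0:timestamp:body” with HMAC-SHA256 using the signing secret and compare the result to the header. The function name, its parameter names and the five-minute replay window are choices made for this sketch.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Verify a Slack request signature against the signing secret.
const verifySlackSignature = ({
  signingSecret, signature, timestamp, rawBody, now = Date.now(),
}) => {
  if (!signature || !timestamp) return false;

  // Reject requests more than five minutes off to limit replay attacks.
  const ageSeconds = Math.abs(Math.floor(now / 1000) - Number(timestamp));
  if (!Number.isFinite(ageSeconds) || ageSeconds > 60 * 5) return false;

  // Slack signs the string "v0:<timestamp>:<raw body>" with HMAC-SHA256.
  const expected = `v0=${createHmac('sha256', signingSecret)
    .update(`v0:${timestamp}:${rawBody}`, 'utf8')
    .digest('hex')}`;

  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(signature, 'utf8');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
};

module.exports = { verifySlackSignature };
```

The Unicode problem is easy to reproduce: if the raw body contains an escape such as `\u00e4`, then `JSON.stringify(JSON.parse(rawBody))` produces a literal `ä` instead, so the re-serialized string no longer matches the bytes that Slack signed.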
To customize the request templates, you can copy the default templates from your API Gateway configuration and add the following line
Add the two request templates (application/json and application/x-www-form-urlencoded) to your code base and add the following lines to the http event configuration in your serverless.yml file, below “async: true”.
In the handler you now have “event.rawBody” available, containing the raw body as string to be used in signature verification. The snippet above also explicitly sets integration to lambda, which was previously just implied via the async directive.
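With the raw body in place, the handler still has to pull the two headers out of the event. A small sketch, assuming the request template exposes them under event.headers (worth checking in your own copy of the template) and that the casing of header names may vary. Both helper names are made up for illustration.

```javascript
// Look up a header regardless of how its name was cased on the way in.
const getHeader = (headers, name) => {
  const source = headers || {};
  const key = Object.keys(source)
    .find((candidate) => candidate.toLowerCase() === name.toLowerCase());
  return key ? source[key] : undefined;
};

// Collect the three inputs that signature verification needs from the event.
const signatureInputs = (event) => ({
  signature: getHeader(event.headers, 'X-Slack-Signature'),
  timestamp: getHeader(event.headers, 'X-Slack-Request-Timestamp'),
  rawBody: event.rawBody,
});

module.exports = { getHeader, signatureInputs };
```

The result can be handed to whatever signature check you use, together with the signing secret.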
Hooking up with PagerDuty
Now that we’ve got the basics squared away, let’s hook things up.
When you get an idea for how to attack a problem, you start looking for available resources and APIs. A great source is the API documentation, and if there is an API explorer it’s even better. PagerDuty has awesome documentation and a great API explorer. Looking through the API documentation, there were two methods that seemed to achieve what I wanted: the On-calls method and List users on-call for schedule. Neither of these methods returned actual user contact information, just a reference to the user or a reference to the contact information. A subsequent API call was needed to get the contact information, and the best suited for that was List a user’s contact methods. I decided to go with List users on-call for schedule, but in retrospect the On-calls method would have been better suited, as it accepts multiple schedules as search criteria, which initially wasn’t in scope. You live, you learn.
Next step is digging through the PagerDuty SDK and finding the equivalent methods.
To get users on-call we need a schedule id and those are not that easily remembered, which means we need to map team names to schedules. We may also call teams by different names and a team might have more than one rotation, meaning more than one schedule.
To support the criteria above we can put the teams in a JSON file with the following structure.
"schedules": ["P9XYZ9K", "PYXYZGN", "P3XYZ6F"],
"tags": ["blocket-all", "all-blocketeers"]
We need to be able to take the message from Slack, parse out team names and alternative names, and filter the team list based on those keywords.
It might look more complex than it is. First, it builds a flat array of all valid names and alternative names, or tags. Then it splits the input string into words and keeps the ones matching any team. The last step filters the teams using the list of valid keywords.
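Those three steps can be sketched as follows. The `name` field, the second team and the `findTeams` function name are assumptions made for this example, and names and tags are assumed to be stored in lowercase.

```javascript
// Hypothetical team list. The "name" field and the second team are
// assumptions for this sketch; names and tags are stored in lowercase.
const teams = [
  { name: 'blocket', schedules: ['P9XYZ9K', 'PYXYZGN', 'P3XYZ6F'], tags: ['blocket-all', 'all-blocketeers'] },
  { name: 'payments', schedules: ['PEXAMPL'], tags: ['pay'] },
];

const findTeams = (text, allTeams) => {
  // 1. Flat array of every valid name and tag.
  const validKeywords = allTeams.flatMap((team) => [team.name, ...(team.tags || [])]);

  // 2. Split the message into words and keep the ones that match a team.
  const keywords = text
    .toLowerCase()
    .split(/[\s,.?!]+/)
    .filter((word) => validKeywords.includes(word));

  // 3. Keep the teams whose name or tags appear among the matched keywords.
  return allTeams.filter((team) => [team.name, ...(team.tags || [])]
    .some((keyword) => keywords.includes(keyword)));
};

module.exports = { teams, findTeams };
```

Splitting on whitespace and basic punctuation means a message like “who is on-call for blocket-all?” still matches, even with the trailing question mark.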
Then we get the users on-call for all the schedules within the filtered teams, grab the users’ contact methods, format a response and send it back to Slack.
The task was divided into two methods, one to cycle the teams and the potential multiple schedules for that team, and another to fetch the user on-call and the contact details. You may also notice that the PagerDuty SDK is injected instead of instantiated. Both external SDKs, PagerDuty and Slack, are instantiated in the handler code and then injected into functions to be able to mock those away and unit test all parts of the Lambda function.
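The split into two functions with an injected client could be sketched like this. Note that `listUsersOnCall` and `listContactMethods` are hypothetical wrapper names on the injected `pd` object, not confirmed node-pagerduty signatures, and the contact method fields (`type`, `country_code`, `address`) are assumed to follow the PagerDuty REST API.

```javascript
// `pd` is an injected client. listUsersOnCall and listContactMethods are
// hypothetical wrapper names, NOT confirmed node-pagerduty signatures.

// Fetch the on-call user for one schedule, plus a phone number if present.
const getOnCallUser = async (pd, scheduleId) => {
  const [user] = await pd.listUsersOnCall(scheduleId);
  if (!user) return null;

  const methods = await pd.listContactMethods(user.id);
  const phone = methods.find((method) => method.type === 'phone_contact_method');
  return {
    name: user.name,
    phone: phone ? `+${phone.country_code} ${phone.address}` : 'no phone number',
  };
};

// Cycle through the teams and each team's schedules, one line per result.
const getOnCallForTeams = async (pd, teams) => {
  const lines = [];
  for (const team of teams) {
    for (const scheduleId of team.schedules) {
      const onCall = await getOnCallUser(pd, scheduleId);
      if (onCall) lines.push(`${team.name}: ${onCall.name} (${onCall.phone})`);
    }
  }
  return lines.join('\n');
};

module.exports = { getOnCallUser, getOnCallForTeams };
```

Because the client is passed in, a unit test can hand over a plain object with two fake async functions and never touch the network.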
All code is available at https://github.com/schibsted/sls-oncall.
Let’s test it out! Here’s how it looks!
What is your next Hackday project? What APIs do you think would be cool to stitch together, or maybe even make a Slack bot out of? Now you’ve got the tools to get started!
PS: We’re hiring and have exciting positions in all our locations across the Nordics and Poland. Check out our open positions at https://schibsted.com/career/.