AWS SES as a trigger for a lambda to automate the boring stuff…

Art Rozovsky
Dec 16, 2022


At Fender, we try to automate everything; not because we are lazy but because we don’t like doing menial, repetitive tasks. Below, I describe how we automated one such boring manual task.

If you ever find yourself in a situation where a notification is available only over email (not through an API call) and you need to act on it programmatically, this post may be interesting to you. Below is a sample of how we at Fender use AWS SES as a trigger for a Lambda to automate the boring stuff.

We have a Slack group “@oncall” that contains just one user: the DevOps engineer who is currently on call. Any developer or data engineer can get immediate help and escalation just by typing “@oncall”, without searching through the PagerDuty schedule. We also want conversations with the oncall person to not be anonymous, so that folks know the name of the oncall engineer at the time of escalation. Therefore, the task was to update the “@oncall” Slack group automatically every time the PagerDuty on-call schedule changes, and to post the name of the oncall engineer as the channel topic of the “DevOps” channel when the rotation takes place.

First things first: we started by looking at the PagerDuty API documentation. We found an API call that returns the oncall person’s name, but we did not want to spam PagerDuty every 5 minutes via a cron task to get this info. On the other hand, we did not find any API that would push a notification to us when the schedule changes. However, PagerDuty has a great feature under “PagerDuty User Notification Rules” (https://yourorg.pagerduty.com/users/XYZ09), where each PagerDuty user can configure an email notification rule like so: “Immediately before I go on-call short email me at abcd@efgh.com”.

We decided to explore this avenue to get an email notification each time the schedule changes. Each DevOps engineer would add this rule with an AWS SES identity email address, which would then trigger a Lambda that processes the notification and updates Slack accordingly.

Based on this PagerDuty notification option, we made the following plan:

1. Create a Route 53 MX record for a sub-domain of a domain that we control, something like email.yourorg.com.
2. Use that sub-domain to create an email identity in AWS SES. We use the email address oncall@email.yourorg.com for the purposes of this article.
3. Verify the email identity and make sure it has the status Verified (https://docs.aws.amazon.com/ses/latest/dg/creating-identities.html).
4. Use the identity email address in each DevOps engineer’s notification rules settings at https://yourorg.pagerduty.com/.
5. So that AWS SES knows what to do with emails arriving at the identity email address, define a receipt rule set (ses_receipt_rule_set) and make it the active rule set. This rule set triggers the Lambda that analyzes the incoming emails.
6. Set up a Lambda that filters the emails, parses out the necessary info, and updates Slack via Slack’s API.
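To make steps 5 and 6 more concrete, here is a minimal sketch of the event that SES passes to the Lambda and how the subject and the scan verdicts could be read from it. The sample event is heavily trimmed, and the subject line, sender address, and helper names are hypothetical, chosen only for illustration; the `commonHeaders` and `receipt` fields follow the event shape SES documents for its Lambda action.

```python
# Trimmed, hypothetical SES -> Lambda event; real events carry many more fields.
SAMPLE_EVENT = {
    "Records": [{
        "eventSource": "aws:ses",
        "ses": {
            "mail": {
                "commonHeaders": {
                    "from": ["sender@example.com"],          # placeholder sender
                    "to": ["oncall@email.yourorg.com"],
                    "subject": "[PagerDuty] James Bond is now on call",  # made-up subject
                },
            },
            "receipt": {
                # Present because the receipt rule has scanning enabled.
                "spamVerdict": {"status": "PASS"},
                "virusVerdict": {"status": "PASS"},
            },
        },
    }]
}


def extract_subject(event):
    """Return the Subject of the first SES record, or None if it is missing."""
    records = event.get("Records") or []
    if not records:
        return None
    mail = records[0].get("ses", {}).get("mail", {})
    return mail.get("commonHeaders", {}).get("subject")


def passed_scans(event):
    """Return True only if both the spam and virus verdicts are PASS."""
    receipt = event["Records"][0]["ses"]["receipt"]
    return all(
        receipt.get(key, {}).get("status") == "PASS"
        for key in ("spamVerdict", "virusVerdict")
    )


print(extract_subject(SAMPLE_EVENT))
print(passed_scans(SAMPLE_EVENT))
```

Checking the verdicts before acting on a message is optional, but it is a cheap guard when the address is reachable from the public internet.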

I intentionally omitted some sub-tasks from the list above, such as creating the AWS IAM role policy or the AWS Lambda permission, so as not to clutter the high-level logic with too many details. You can see all the steps we implemented in our Terraform module below.

Also, please note that in order to be able to send the API calls to Slack, you need to create a Slack app, which is a one-time task and out of the scope of this post. We call the Slack app OnCallBot, and all the Slack user group manipulations and channel topic updates are done by the app via API calls from the lambda function. In our implementation, we store the Slack API token in the AWS Parameter Store, so the Lambda can access it securely. We may write a blog post in the future on how we use the AWS Parameter Store in our serverless architecture, and get into more detail then.
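As a rough illustration of what such an API call looks like, the sketch below builds (but does not send) a request to Slack's `conversations.setTopic` method using only the Python standard library. It passes the bot token in an `Authorization: Bearer` header, which the Slack Web API accepts as an alternative to a `token` form field. The helper name, the placeholder token, and the channel ID are assumptions for illustration, not our production values.

```python
import json
import urllib.request

SLACK_API = "https://slack.com/api/"


def build_slack_request(method, token, payload):
    """Build (but do not send) a JSON POST for a Slack Web API method."""
    return urllib.request.Request(
        SLACK_API + method,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


req = build_slack_request(
    "conversations.setTopic",
    "xoxb-placeholder-token",  # hypothetical; the real token comes from Parameter Store
    {"channel": "DEJXEEX5QJ", "topic": "Oncall: James Bond"},
)
print(req.full_url)
# urllib.request.urlopen(req) would actually send it; omitted so the sketch has no side effects.
```

Keeping the token out of the request body makes it a little less likely to end up in logs when the payload is printed for debugging.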

I would like to elaborate on how we verified the new email identity. At this point the address has no inbox behind it, yet we need to receive the verification email from AWS in order to click on the verification link. (Please note that this was an intermediate step, done just once via the AWS console.)

1. We created an email receiving rule, verify_ses.yourorg.com_email, which publishes to an Amazon SNS topic each time an email arrives at the identity email address.
2. We created a validate_ses_email SNS topic and subscribed a valid email address to the topic.
3. After subscribing, you should get the following email:
You have chosen to subscribe to the topic: arn:aws:sns:us-east-1:XXXXX:validate_ses_email. To confirm this subscription, click or visit the link below (If this was in error no action is necessary):
Confirm subscription

4. Now you can click on the “resend” button for the email identity in the AWS SES console, then check your inbox for the email with the subject “Amazon Web Services - Email Address Verification Request in region US East (N. Virginia)” and click on the verification link.
5. Hurray! Now your email identity is verified!
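If you would rather confirm the result from code than from the console, the following is a small sketch of how the status could be checked with the SES `get_identity_verification_attributes` call. The function name is hypothetical, and a stand-in client is used so the example runs without AWS credentials; in real use you would pass `boto3.client("ses")` instead.

```python
def is_identity_verified(ses_client, identity):
    """Return True when SES reports the identity's VerificationStatus as 'Success'."""
    response = ses_client.get_identity_verification_attributes(Identities=[identity])
    attributes = response.get("VerificationAttributes", {})
    return attributes.get(identity, {}).get("VerificationStatus") == "Success"


class FakeSesClient:
    """Stand-in for boto3.client("ses") so the sketch runs without AWS access."""

    def __init__(self, status):
        self._status = status

    def get_identity_verification_attributes(self, Identities):
        # Mirrors the shape of the real response for the requested identities.
        return {
            "VerificationAttributes": {
                identity: {"VerificationStatus": self._status}
                for identity in Identities
            }
        }


print(is_identity_verified(FakeSesClient("Success"), "oncall@email.yourorg.com"))
print(is_identity_verified(FakeSesClient("Pending"), "oncall@email.yourorg.com"))
```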

Now let’s take a look at our terraform module. The file structure of the module is as follows:
$ tree oncallbot/
oncallbot/
├── pd_scheduler_notification.py
├── pd_scheduler_notification.tf
├── pd_scheduler_notification.zip
└── variables.tf

The Terraform code to build the pd_scheduler_notification app is detailed below:

terraform {
  required_providers {
    aws = {
      configuration_aliases = [aws]
    }
  }
}

data "aws_caller_identity" "current" {}

####### Route53 ##############
resource "aws_route53_record" "mx_record" {
  zone_id = aws_route53_zone.yourorg_com.zone_id
  name    = "email.yourorg.com"
  type    = "MX"
  ttl     = "300"
  records = [
    "10 inbound-smtp.us-east-1.amazonaws.com"
  ]
}
####### iam ##################
resource "aws_iam_role" "pd_scheduler_notification" {
  name = "${var.region_code}-${var.environment}-pd_scheduler_notification"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "ses.amazonaws.com"
        ]
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF

  tags = var.common_tags
}

resource "aws_iam_role_policy" "pd_scheduler_notification" {
  name = "${var.region_code}-${var.environment}-pd_scheduler_notification"
  role = aws_iam_role.pd_scheduler_notification.id

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter",
        "ssm:GetParameters",
        "ssm:GetParametersByPath",
        "ssm:GetParameterHistory"
      ],
      "Resource": [
        "arn:aws:ssm:us-east-1:${data.aws_caller_identity.current.account_id}:parameter/global/slack_oncallbot_token"
      ]
    }
  ]
}
EOF
}
####### Lambda ################
data "archive_file" "pd_scheduler_notification_lambda_package" {
  type        = "zip"
  output_path = "${path.module}/pd_scheduler_notification.zip"
  source_file = "${path.module}/pd_scheduler_notification.py"
}

resource "aws_lambda_function" "pd_scheduler_notification" {
  function_name    = "${var.region_code}-${var.environment}-pd_scheduler_notification"
  description      = "To parse PD notification email"
  role             = aws_iam_role.pd_scheduler_notification.arn
  runtime          = "python3.7"
  handler          = "pd_scheduler_notification.lambda_handler"
  filename         = data.archive_file.pd_scheduler_notification_lambda_package.output_path
  source_code_hash = data.archive_file.pd_scheduler_notification_lambda_package.output_base64sha256
  timeout          = 900
  memory_size      = 1024
  tags             = var.common_tags

  environment {
    variables = {
      SlackTokenParameter = var.slack_token_parameter
      RegionCode          = var.region_code
      Environment         = var.environment
      NotifyChannel       = var.slack_notify_channel
      UpdateUsergroup     = var.slack_update_usergroup
    }
  }
}

resource "aws_lambda_permission" "allow_ses" {
  statement_id   = "AllowExecutionFromSES"
  action         = "lambda:InvokeFunction"
  function_name  = aws_lambda_function.pd_scheduler_notification.function_name
  principal      = "ses.amazonaws.com"
  source_arn     = "arn:aws:ses:${var.region}:${data.aws_caller_identity.current.account_id}:receipt-rule-set/lambda_parse_email:receipt-rule/lambda_parse_email"
  # Use the same account ID as source_arn rather than a hard-coded value.
  source_account = data.aws_caller_identity.current.account_id
}

####### SES ################
resource "aws_ses_email_identity" "main" {
  email = var.receiver_address
}

resource "aws_ses_receipt_rule_set" "main" {
  rule_set_name = "lambda_parse_email"
}

resource "aws_ses_receipt_rule" "main" {
  name          = "lambda_parse_email"
  rule_set_name = aws_ses_receipt_rule_set.main.rule_set_name
  recipients    = [var.receiver_address]
  enabled       = true
  scan_enabled  = true

  lambda_action {
    function_arn    = aws_lambda_function.pd_scheduler_notification.arn
    invocation_type = "Event"
    position        = 1
  }

  depends_on = [
    aws_lambda_permission.allow_ses
  ]
}

# Activate rule set
resource "aws_ses_active_receipt_rule_set" "main" {
  rule_set_name = aws_ses_receipt_rule_set.main.rule_set_name
}

We use the following Terraform variables:

variable "region_code" {
  default = "use1"
}

variable "region" {
  default = "us-east-1"
}

variable "environment" {}

variable "slack_token_parameter" {}

variable "slack_notify_channel" {
  default = "devops"
}

variable "slack_update_usergroup" {
  default = "oncall"
}

variable "common_tags" {
  default = {}
}

variable "receiver_address" {
  default = "oncall@email.yourorg.com"
}

And finally the Lambda code, which you need to configure with your organization’s values:

import json
import os
import requests
import boto3

ses = boto3.client("ses")
ssm = boto3.client("ssm")
RegionCode = os.environ.get('RegionCode')
Environment = os.environ.get('Environment')
SlackToken = os.environ.get('SlackTokenParameter')
NotifyChannel = os.environ.get('NotifyChannel')
UpdateUsergroup = os.environ.get('UpdateUsergroup')

slack_bot_token = ssm.get_parameter(Name=SlackToken, WithDecryption=True)
SLACK_TOKEN = slack_bot_token['Parameter']['Value']

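# Slack IDs below are placeholders; replace them with your organization's values.
slack_usergroup = {
    "oncall": "6PNKP0KEPMTD"
}

slack_channel = {
    "devops": "DEJXEEX5QJ"
}

# Maps the name as it appears in the PagerDuty email subject to a Slack user ID.
slack_user = {
    "James Bond": "5FG6XLNQCY",
    "Captain America": "UAA8EU88LU",
    "Leia Organa": "74RWH7QYGC",
    "Vito Corleone": "QA71BSUSTH"
}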

def get_on_call_user(ses_event):
    """Return (name, Slack user ID, Slack channel ID), or (-1, -1, -1) if the email is not one we handle."""
    headers = ses_event['Records'][0]['ses']['mail']['headers']
    subject = []
    for header in headers:
        if header["name"] == "Subject":
            subject.append(header["value"])
            break
    print(subject)
    # Guard against a missing or empty Subject header before indexing into it.
    words = subject[0].split() if subject else []
    if words and words[0] == '[PagerDuty]':
        if ' '.join(words[-6:]) == '(Level 2 - Ops Tier 2)':
            print("This lambda could not process email with the escalation policy (Level 2 - Ops Tier 2)")
            print(subject)
            return (-1, -1, -1)
        elif "Data" in ' '.join(subject):
            print("This lambda could not process email with the escalation policy for the Data team")
            print(subject)
            return (-1, -1, -1)
        else:
            # The two words after "[PagerDuty]" are the first and last name.
            pd_user_first_name = words[1]
            pd_user_last_name = words[2]
            pd_user = pd_user_first_name + " " + pd_user_last_name
    else:
        print("This lambda could not process email with the following subject:")
        print(subject)
        return (-1, -1, -1)
    print("pd_user: " + pd_user)
    # Use .get() so an unknown name reaches the explicit error below instead of a bare KeyError.
    slack_user_id = slack_user.get(pd_user)
    if slack_user_id:
        print(slack_user_id)
        return pd_user, slack_user_id, slack_channel[NotifyChannel]
    else:
        raise Exception('ERROR: NO SUCH SLACK USER')

def authorization_debug(slack_token=SLACK_TOKEN):
    # Handy for checking that the token is valid.
    return requests.post('https://slack.com/api/auth.test', {
        'token': slack_token
    }).json()

def slack_update_topic(topic_txt, slack_channel_id, slack_token=SLACK_TOKEN):
    # Set the channel topic to the oncall engineer's name.
    return requests.post('https://slack.com/api/conversations.setTopic', {
        'token': slack_token,
        'channel': slack_channel_id,
        'topic': topic_txt
    }).json()

def slack_update_usergroup(slack_user_id, slack_channel_id, slack_usergroup_id=slack_usergroup[UpdateUsergroup], slack_token=SLACK_TOKEN):
    # Replace the members of the user group with the single oncall user.
    return requests.post('https://slack.com/api/usergroups.users.update', {
        'token': slack_token,
        'channel': slack_channel_id,
        'usergroup': slack_usergroup_id,
        'users': slack_user_id
    }).json()

def lambda_handler(event, context):
    name, slack_user_id, slack_channel_id = get_on_call_user(event)
    if name == -1:
        return {
            'statusCode': 400,
            'body': json.dumps('Invalid email subject line.')
        }
    else:
        topic_txt = "Oncall: " + name
        #print(authorization_debug())
        print(slack_update_topic(topic_txt, slack_channel_id))
        print(slack_update_usergroup(slack_user_id, slack_channel_id))
        print("Slack was updated!")
        return {
            'statusCode': 200,
            'body': json.dumps('Slack was updated!')
        }
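One thing worth keeping in mind: the Slack Web API reports most failures in the JSON body, with `"ok": false` and an `"error"` string, rather than through the HTTP status code. The handler above prints each response but does not inspect it. Below is a minimal sketch of how the responses could be checked; the helper name and the sample responses are hypothetical, used only for illustration.

```python
def summarize_slack_failures(results):
    """Map each Slack method name to its error string for every failed call."""
    return {
        method: response.get("error", "unknown_error")
        for method, response in results.items()
        if not response.get("ok")
    }


# Hypothetical responses, shaped like Slack's JSON replies.
failures = summarize_slack_failures({
    "conversations.setTopic": {"ok": True},
    "usergroups.users.update": {"ok": False, "error": "permission_denied"},
})
print(failures)
```

With a check like this, the handler could log the failures or return a non-200 status when the dictionary is not empty.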

The Terraform module above is invoked using the following Terraform code:

module "oncallbot_trigger" {
  providers = {
    aws = aws
  }
  count                 = var.environment == "prod" ? 1 : 0
  slack_token_parameter = "/fender/slack_oncallbot_token" #tfsec:ignore:GEN003
  source                = "./oncallbot"
  environment           = var.environment
  common_tags = {
    env   = var.environment
    owner = "Fender"
  }
}

variable "environment" {
  default = "prod"
}
