Don’t be that guy

Did you get that thing I sent you?

Ian Reis
Social Tables Tech

--

For our latest product, Propose, users can create proposals and email them to event planners without ever leaving Social Tables. But just sending the email isn’t enough; we wanted to be able to inform a user about the status of their email. Has it hit their inbox yet? Did they open it? Did they click through to the actual proposal? Which pages did the planner spend the most time looking at? These are all things that help properties make better decisions about how to win more business.

Currently we send all our emails using Sendgrid, a fellow BVP portfolio company. They offer two extremely important features that enabled us to deliver the precise solution we wanted: custom args and event notifications. Custom args allow you to attach arbitrary data to an email when you send it. Event notifications are a configurable webhook that Sendgrid will post to when certain events happen to your email, like sent and delivered, but most importantly when a user opens it and when they click a link in it. And guess what: those custom args you attached to the email are posted with every event.
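Each webhook POST from Sendgrid carries a JSON array of events, with your custom args flattened into each event object alongside the standard fields. A trimmed, illustrative example of an open event (all values hypothetical):

```json
[
  {
    "email": "planner@example.com",
    "timestamp": 1467916800,
    "event": "open",
    "appId": "propose:production",
    "proposalId": "8675309",
    "recipientId": "42"
  }
]
```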

We settled on three custom args: appId, proposalId, and recipientId. Since we send emails from multiple applications, appId, which looks something like propose:production, lets us know which application sent the email. This will come in handy later. The other two custom args allow us to make the appropriate updates when processing the notification.
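At send time, attaching these looks roughly like the following. This is a sketch assuming the @sendgrid/mail style of message object; the helper name and example values are ours, and only the three custom arg names come from above:

```javascript
// Build a SendGrid-style message carrying our three custom args.
// Helper name and shape are illustrative; customArgs is where the
// data rides along and comes back on every event notification.
function buildProposalEmail({ to, from, subject, html, proposalId, recipientId, env }) {
  return {
    to,
    from,
    subject,
    html,
    customArgs: {
      appId: `propose:${env}`,           // e.g. "propose:production"
      proposalId: String(proposalId),    // custom args are strings
      recipientId: String(recipientId),
    },
  };
}

// e.g. with @sendgrid/mail: sgMail.send(buildProposalEmail({ ... }))
```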

Since we use one internal service to send all emails and SMSs, we configured Sendgrid to post all notifications to it, but we didn’t want this service to be responsible for processing every notification. Instead, we filter notifications on whether they have an appId and drop the matching ones on a queue for later processing. With this design, we decouple Sendgrid from the processing of the notifications.
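The filter itself is simple. A minimal sketch (function names are ours): given the array of events Sendgrid posts, keep only those tagged with an appId.

```javascript
// Keep only the events one of our applications tagged with an appId;
// everything else (emails sent outside our apps) is dropped.
function hasAppId(event) {
  return typeof event.appId === 'string' && event.appId.length > 0;
}

function filterNotifications(events) {
  return events.filter(hasAppId);
}
```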

Here at Social Tables we use AWS for everything, so naturally we put our messages on an SQS queue. But before we do that, we want to add some extra information to each message. SQS message attributes let you attach information to a message without modifying its payload. First and most important is which secondary queue to route each message to. We also add the requestId that restify generated, which lets us trace the message as it moves through multiple systems. The last attribute we add is the current version number of the service that generated the message. This is more of a future-proofing attribute in case the payload changes from one version to another.
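Put together, the attributes for one message might be built like this. A sketch: the three attribute names match the ones above, while the helper and the rule for deriving the queue name from appId are illustrative.

```javascript
// Build the SQS MessageAttributes for one notification.
// SQS attributes are { DataType, StringValue } objects.
function buildMessageAttributes(event, requestId, serviceVersion) {
  const [app] = event.appId.split(':'); // "propose:production" -> "propose"
  return {
    queue: { DataType: 'String', StringValue: app },          // routing target
    requestId: { DataType: 'String', StringValue: requestId }, // cross-system tracing
    version: { DataType: 'String', StringValue: serviceVersion }, // payload versioning
  };
}
```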

Now that we have added all the attributes we care about, the messages are ready to be saved to the queue. If that request fails for some reason, we respond with a 500. Otherwise, we tell Sendgrid that their post was successful. If we’re happy, they’re happy.

With all the notifications that we care about on a queue, we can process them at our leisure. To retrieve them, we use a node module created by engineers at the BBC called sqs-consumer. New problem: one of our other products, Check-In, has different needs when it comes to processing messages. Rather than have all that logic in one consumer, we pluck the messages off the global queue and route them to the appropriate product queue based on the queue message attribute we attached.
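Inside the consumer’s handleMessage callback, that routing decision reduces to a lookup on the attribute. A sketch with hypothetical queue URLs; the real consumer would then sqs.sendMessage the body and attributes on to the returned URL.

```javascript
// Given a message from the global queue, pick the product queue URL
// from the "queue" message attribute; null means we don't recognize it.
function routeNotification(message, productQueues) {
  const attrs = message.MessageAttributes || {};
  const target = attrs.queue && attrs.queue.StringValue;
  return (target && productQueues[target]) || null;
}

// Hypothetical product queue URLs
const productQueues = {
  propose: 'https://sqs.us-east-1.amazonaws.com/111122223333/propose-notifications',
  checkin: 'https://sqs.us-east-1.amazonaws.com/111122223333/checkin-notifications',
};
```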

Now I know what you’re thinking: why didn’t we just route them to the product queues in the first place? Good question! We didn’t want to depend on multiple calls to multiple queues succeeding before we could return success to Sendgrid. This way, as long as the one call to SQS succeeds, we can respond 200.

Now that we have routed our messages to the correct product queues, we can process them individually.

This design also has one major benefit. Because the notifications are routed to a queue rather than processed when they are received, we can set an appId to propose:local and the notification will be routed to a queue that a consumer running locally can pull from and process. This enables us to test the entire flow locally without constantly pushing code to our test environment or editing code directly on the server.

Does this scale? The answer is yes… for us. Since we run all our services on Amazon’s EC2 Container Service with auto scaling, we can scale the different consumers up and down based on a CloudWatch alarm on the depth of the queue. Another benefit of using SQS is that messages are locked once they are “in flight”, so we can be certain that multiple consumers are not processing the same message.

As always, there is more than one way to solve a problem. This is our solution. Hopefully it has inspired you, especially if you found this post by googling “how do i get email read receipts?” or something to that effect. If you want to discuss the pros and cons of this system, leave a comment. We’re all about feedback at Social Tables.

In an upcoming post we will go over how and why we moved to a similar architecture to send emails as well.

P.S. A big thank you to Sendgrid’s support staff. They were extremely responsive and actually helped expose some issues we had on our end.
