Build a personalized newsletter with AWS cloud services and ElasticSearch

Published in Kaliop · Oct 15, 2019

A French version of this article is also available.

I am part of a team that develops a content platform that readers reach mainly through our newsletter. The platform relies on many cloud services, and several of them are used to generate the newsletter.

We’ll go through our newsletter specifications and then dive into the implementation.

Requirements

Our users are interested in different topics, and our content is very specific. We need to send each user content that is relevant to them, based on the information we have. A generic newsletter would be of little interest to a user looking for specific content in their field.
Our implementation needs to scale well, as our platform is regularly deployed in new countries, and our user pool grows accordingly.
The newsletter needs to have a manual trigger in a web interface, with a debugging tool.
We need to be able to explain why one piece of content was picked over another.

Architecture

We use multiple services. The following diagram shows how they interact with each other.

We trigger an AWS Lambda function from our back-office.
The Lambda fetches the users who are eligible for the newsletter (depending on their consents and other criteria).
The Lambda pushes one message to an AWS SQS queue for each newsletter that we want to send. The message contains everything needed to send the newsletter (email, topics of interest, etc.).
A second Lambda listens to the SQS queue: every time a message is pushed, an instance is triggered to handle it. This solves our scaling problem, as we can run as many Lambda instances as we need to process our newsletters.
When this Lambda is triggered, it sends a request to ElasticSearch to determine which content to send to the user (more on this later in the article).
Now that we have the content IDs, we send a request to Contentful (the headless CMS where our content is hosted) to get the actual content, not just the IDs.
With the Contentful data, we generate the newsletter template using Nunjucks (a templating library that we use with Node.js).
With the message information (email, etc.) and the email template, we send a request to Mandrill (a transactional email service from Mailchimp), which handles the rest for us.
[OPTIONAL] We push records about the newsletter stats (and potential errors) to DynamoDB in order to give feedback to the user who triggered the newsletter from the back-office (number of newsletters sent, errors, etc.).

That’s a lot to do, but thankfully it’s just a chain of small tasks. What is great about this architecture is that everything is broken down into simple steps, so you can easily change one of them without breaking the others.
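
The chain handled by the SQS-listening Lambda can be sketched as a single function whose external calls are injected. This is only an illustration under assumptions: the `NewsletterMessage` shape, the `Deps` interface and every function name below are hypothetical, and the real ElasticSearch, Contentful, Nunjucks and Mandrill calls are replaced by in-memory stand-ins.

```typescript
// Hypothetical shape of one SQS message body: everything needed for one email.
interface NewsletterMessage {
  email: string;
  topics: string[];
}

interface Article {
  id: string;
  title: string;
}

// Each external service sits behind a function, so a step can be
// swapped or stubbed without touching the others.
interface Deps {
  pickArticleIds(topics: string[]): Promise<string[]>; // ElasticSearch
  fetchArticles(ids: string[]): Promise<Article[]>;    // Contentful
  renderTemplate(articles: Article[]): string;         // Nunjucks
  sendEmail(to: string, html: string): Promise<void>;  // Mandrill
}

interface SendResult {
  email: string;
  articleCount: number;
  error?: string;
}

// Processes a single message and returns a record that could be stored
// for the back-office (the optional DynamoDB step).
async function handleMessage(msg: NewsletterMessage, deps: Deps): Promise<SendResult> {
  try {
    const ids = await deps.pickArticleIds(msg.topics);
    const articles = await deps.fetchArticles(ids);
    const html = deps.renderTemplate(articles);
    await deps.sendEmail(msg.email, html);
    return { email: msg.email, articleCount: articles.length };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { email: msg.email, articleCount: 0, error: reason };
  }
}

// In-memory stand-ins, only here so the sketch runs on its own.
const sent: string[] = [];
const fakeDeps: Deps = {
  pickArticleIds: async (topics) => topics.map((t) => `${t}-1`),
  fetchArticles: async (ids) => ids.map((id) => ({ id, title: `Article ${id}` })),
  renderTemplate: (articles) => articles.map((a) => `<h2>${a.title}</h2>`).join(""),
  sendEmail: async (to) => {
    sent.push(to);
  },
};
```

Because a failure is caught and returned as part of the result, one broken newsletter is reported without affecting the messages handled by other invocations.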

Initial trigger

The architecture uses a back-office that triggers a Lambda through API Gateway to push the SQS messages. We chose this because it was simple for us, but you can use anything you want to push the messages (a script, webhooks, etc.).
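
Whatever pushes the messages, SQS accepts at most 10 entries per `SendMessageBatch` call, so the list of eligible users has to be split into groups. The sketch below only prepares those groups; the `EligibleUser` shape and the `toBatches` helper are hypothetical names chosen for this example, and the actual call to the AWS SDK is left out.

```typescript
// Hypothetical shape of an eligible user as fetched by the trigger Lambda.
interface EligibleUser {
  email: string;
  topics: string[];
}

// Mirrors the two required fields of an SQS batch entry.
interface BatchEntry {
  Id: string;
  MessageBody: string;
}

const SQS_MAX_BATCH_SIZE = 10; // SendMessageBatch accepts at most 10 entries

// Turns the user list into groups of entries, one group per batch request.
function toBatches(users: EligibleUser[]): BatchEntry[][] {
  const batches: BatchEntry[][] = [];
  for (let i = 0; i < users.length; i += SQS_MAX_BATCH_SIZE) {
    const slice = users.slice(i, i + SQS_MAX_BATCH_SIZE);
    batches.push(
      slice.map((user, j) => ({
        Id: String(i + j), // only has to be unique within one batch
        MessageBody: JSON.stringify(user), // one message per newsletter
      })),
    );
  }
  return batches;
}
```

With 23 eligible users, `toBatches` returns three groups of 10, 10 and 3 entries, each of which would be sent in one request.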

Handling personalization

This is a brief explanation of how we use ElasticSearch as a recommendation system. If you want the full details, check out my article on How we built a reversible recommendation system using ElasticSearch.

The idea is that we define a list of criteria and we order them. For example:

the user has not read the article
the article matches the user topics
any other boolean criterion you want to test

We then use an ElasticSearch function score query to give a weight to each criterion.
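
A query body along these lines could express the ordered criteria. It is a sketch under assumptions: the `topics` field name, the helper names and the choice of powers of two as weights are illustrative, while `function_score`, `functions`, `filter`, `weight`, `score_mode` and `boost_mode` are standard ElasticSearch query DSL.

```typescript
interface Criterion {
  name: string;
  filter: Record<string, unknown>; // any ElasticSearch filter clause
}

// Ordered from most to least important. The `topics` field is an assumption.
function buildCriteria(readIds: string[], topics: string[]): Criterion[] {
  return [
    { name: "not read", filter: { bool: { must_not: { ids: { values: readIds } } } } },
    { name: "matches topics", filter: { terms: { topics } } },
  ];
}

function buildQuery(criteria: Criterion[]) {
  const n = criteria.length;
  return {
    query: {
      function_score: {
        query: { match_all: {} },
        // Powers of two (assumed scheme): an earlier criterion always
        // outweighs all the later ones combined.
        functions: criteria.map((c, i) => ({
          filter: c.filter,
          weight: 2 ** (n - 1 - i),
        })),
        score_mode: "sum", // add up the weights of the matching criteria
        boost_mode: "replace", // ignore the score of the base query
      },
    },
  };
}
```

For the two criteria above, `buildQuery` assigns the weights 2 and 1, so an unread article always ranks above an already-read one, whatever its topics.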

When we send a query to ElasticSearch, the response contains the articles ordered by score, along with each article’s score. This score can be reversed to find out which criteria the article matched.
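
One way to make a score reversible, assumed here purely for illustration, is to give the criteria weights that are distinct powers of two and to assume that the score is exactly the sum of the weights of the matched criteria. Each criterion then occupies one bit of the score. The criterion names and helper functions below are hypothetical.

```typescript
// Ordered criteria, most important first (hypothetical names).
const CRITERIA_NAMES = ["not read", "matches topics", "published this week"];

// Weight of the criterion at position `index`: 4, 2, 1 for three criteria.
function weightOf(index: number, total: number): number {
  return 2 ** (total - 1 - index);
}

// Score an article would get for a given set of matched criteria.
function encodeScore(matched: boolean[]): number {
  return matched.reduce(
    (sum, isMatch, i) => (isMatch ? sum + weightOf(i, matched.length) : sum),
    0,
  );
}

// Reverses a score into the names of the criteria that produced it.
function decodeScore(score: number, names: string[]): string[] {
  const bits = Math.round(score); // scores come back as floating-point numbers
  return names.filter((_, i) => (bits & weightOf(i, names.length)) !== 0);
}
```

Under this scheme, `decodeScore(5, CRITERIA_NAMES)` returns `["not read", "published this week"]`, since 5 = 4 + 1. This is the kind of explanation a debugging tool can show for each article that was picked.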

Final thoughts

Using Lambda and an SQS queue to handle our newsletter process worked out well for us. Writing to DynamoDB lets us know what happened for each newsletter batch.

We used the Serverless Framework to deploy both the “trigger Lambda” and the “SQS-listening Lambda”, which simplified the deployment process. We also used serverless-offline and serverless-offline-sqs to develop the newsletter, allowing us to simulate everything that would happen on AWS without deploying.

Overall, we are satisfied with the solution and will continue to use these AWS services.