Manage Serverless Machine Learning Workflows with AWS Step Functions: The Example of Email Campaigns 💌

Michael Triska
AMARO
Apr 8, 2020 · 7 min read



Session-Based Fashion Item Recommendation with AWS Personalize — Part 2

Mallorca Aesthetics by © Michael Triska

“I think about how we can use Vogue as a platform for change, a platform for activism, a platform, very importantly, for the fashion industry.” — Anna Wintour

Technologies should complement, not replace, the human face a company turns towards its customers. Developing a deep understanding of your customers plays a critical role in the maintenance of their trust and your ability to engage them with personalized messages through the channels that they prefer. At AMARO we’ve started to send email campaigns with product recommendations from AWS Personalize. The first release of our new recommendation architecture took us 2 weeks of development and increased our click-through rate on emails with product recommendations by 20%.

Normally, these kinds of machine learning processes still tend to be sluggish, unautomated, and difficult to scale. That's why in this blog post we'll show how we implemented a fully automated, serverless machine learning lifecycle, using personalized email campaigns as the example. This series of blog posts is structured as follows:

Use Cases

To achieve personalization with product recommendations in emails, we need different strategies for different campaign use cases. You can, for example, send event-triggered emails for products that are back in stock and recommend a similar product, or you can show other recommendations in the newsletter. Questions then arise: Do we favor the latest popular trends from our assortment (no personalization needed) or the long-term preferences of a user (highly personalized)? How do we ensure that a variety of products is displayed with good aesthetics, without human supervision? What recommendation throughput does the API require? Answering these questions requires different datasets, algorithms, retraining schedules, and architectures.

“It’s fascinating to see that there is no single path to success in fashion.” — Emily Bode

AMARO email campaign example with product recommendations.

AWS Step Functions for Machine Learning Pipeline Automation

“If you aren’t careful you can end up with data scientists literally emailing Python notebooks and models to engineers for production deployment.” — Kyle Gallatin

In order to scale fast, reduce the maintenance cost of production machine learning pipelines, improve engineering productivity, and increase the experimentation rate, the data team at AMARO uses AWS Step Functions for machine learning pipeline automation. AWS Step Functions is an orchestration service that allows you to build resilient serverless workflows at scale, integrating multiple AWS services. Combined with AWS Personalize, this gives you a powerful service that orchestrates an automated pipeline to pre-process data, then train models and publish API endpoints.

Image 1: Step Function Machine Learning Workflow.

As we can see in the following architecture diagram, we deployed an AWS Step Functions Workflow that synchronously calls a Fargate task and contains AWS Lambda functions to call Amazon Personalize or Slack. Workflows are made up of a series of steps, with the output of one step acting as the input into the next. The high-level process looks like this:

1. Trigger and Input Processing. The Step Function execution receives a JSON string as input from an event trigger and passes that input over to the first state in the workflow. The input string flows from state to state and depending on the input parameter, we can create different models backing a campaign. In this way, the workflow becomes scalable and generic enough for a variety of use cases even outside of email campaigns. For example, if you change the strategy input, the backend code will use a different dataset or algorithm parameter.
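To make this concrete, the routing on the input can be sketched like this. The keys and the strategy-to-recipe mapping are illustrative assumptions, not our production schema; the ARNs shown are the public SIMS and HRNN recipes:

```python
import json

# Hypothetical execution input, passed as a JSON string to the
# Step Function execution by the event trigger.
execution_input = json.dumps({
    "strategy": "newsletter",          # selects dataset and algorithm
    "dataset_group": "email-campaigns",
})

def pick_recipe(event):
    """Map the campaign strategy to a Personalize recipe ARN."""
    recipes = {
        "back_in_stock": "arn:aws:personalize:::recipe/aws-sims",
        "newsletter": "arn:aws:personalize:::recipe/aws-hrnn",
    }
    return recipes[event["strategy"]]

# Each state receives the (parsed) input and can branch on it.
event = json.loads(execution_input)
```

Because every state sees the same input document, adding a new campaign use case mostly means adding a new entry to mappings like this one.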

2. Preprocessing, Dataset Creation and Testing. We first use Step Functions to execute an ETL job with an ECS Fargate container task in the step “Unload and Upload Dataset”. Amazon Personalize can make recommendations based purely on historical user interaction data as well as on real-time event data. For automated email campaigns, we use historical data.

In the ETL pipeline, we create a user interaction dataset from click events and eliminate data debt by testing for any obvious issues before unloading data from our data warehouse to S3. Such debt includes heavy-tail or head-heavy user and item distributions, as well as repeated user-item activities and high duplication rates. Personalize uses the concept of “dataset groups” to isolate your datasets, schemas, trained models, and campaign endpoints for different use cases. To reduce the complexity of the Step Function state machine, we create the dataset resources and import the dataset from S3 to Personalize within the container task rather than with additional Step Function states.
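Inside the container task, the import boils down to a boto3 call against the Personalize API. A minimal sketch, where the resource names, ARNs, and S3 path are placeholders, and where in practice the dataset group and dataset must already be ACTIVE before the import succeeds:

```python
def import_job_payload(name, dataset_arn, s3_path, role_arn):
    """Build the request payload for create_dataset_import_job."""
    return {
        "jobName": f"{name}-import",
        "datasetArn": dataset_arn,
        "dataSource": {"dataLocation": s3_path},
        "roleArn": role_arn,
    }

def run_import(name, dataset_arn, s3_path, role_arn):
    """Kick off the S3-to-Personalize import inside the container task.
    Requires boto3 and AWS credentials to actually execute."""
    import boto3  # imported here so the sketch loads without boto3
    personalize = boto3.client("personalize")
    job = personalize.create_dataset_import_job(
        **import_job_payload(name, dataset_arn, s3_path, role_arn)
    )
    return job["datasetImportJobArn"]
```

Keeping these calls in the container task, as described above, keeps the state machine itself small.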

3. Create a Solution and Campaign Endpoint. We use AWS Lambda functions to map the rest of the workflow that you would normally perform manually in the AWS console: selecting a recommendation algorithm, training a model, extracting experiment results, and creating the campaign endpoint.
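Those Lambda functions reduce to three asynchronous boto3 calls. A hedged sketch, where the naming convention and minProvisionedTPS value are illustrative; each create_* call returns immediately, so a real workflow has to poll for ACTIVE status between the calls, which is a natural fit for Step Functions wait and retry states:

```python
def solution_name(campaign_name):
    """Derive a solution name from the campaign name (convention ours)."""
    return f"{campaign_name}-solution"

def create_model_and_campaign(dataset_group_arn, recipe_arn, campaign_name):
    """Train a solution version and expose it as a campaign endpoint.
    Requires boto3 and AWS credentials to actually execute."""
    import boto3  # imported here so the sketch loads without boto3
    personalize = boto3.client("personalize")
    solution = personalize.create_solution(
        name=solution_name(campaign_name),
        datasetGroupArn=dataset_group_arn,
        recipeArn=recipe_arn,
    )
    version = personalize.create_solution_version(
        solutionArn=solution["solutionArn"]
    )
    campaign = personalize.create_campaign(
        name=campaign_name,
        solutionVersionArn=version["solutionVersionArn"],
        minProvisionedTPS=1,
    )
    return campaign["campaignArn"]
```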

And VoilĂ . You have successfully trained a Personalize model and now can get recommendations for your users leveraging your campaign endpoint.
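Querying the endpoint is then a single call to the Personalize runtime. A minimal sketch, with the campaign ARN left as a caller-supplied placeholder:

```python
def item_ids(response):
    """Extract the ranked item IDs from a GetRecommendations response."""
    return [item["itemId"] for item in response["itemList"]]

def recommend(campaign_arn, user_id, num_results=12):
    """Fetch recommendations for one user from the campaign endpoint.
    Requires boto3 and AWS credentials to actually execute."""
    import boto3  # imported here so the sketch loads without boto3
    runtime = boto3.client("personalize-runtime")
    response = runtime.get_recommendations(
        campaignArn=campaign_arn,
        userId=str(user_id),
        numResults=num_results,
    )
    return item_ids(response)
```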

Tips to Overcome Pitfalls Working with AWS Personalize

Carol Beer — Little Britain 2003–2006.

AWS Personalize is still a very new service and comes with some limitations. The following difficulties arose while adapting to the new service:

  • There is no monitoring of the training itself (training epochs, current training hours). Monitoring could help avoid overfitting the model. An engineer from AWS even told us to be careful with the HRNN-metadata recipe, as it tends to overfit.
  • Don’t use AutoML. It takes too long and costs much more than a normal training run, yet might not improve your experiment results at all. You could use the SIMS recipe as a baseline and experiment with the HRNN recipe.
  • As with all machine learning products, you should know your dataset and run basic statistics on it. In our first trial, a single training run consumed 1,500 training hours on a relatively small dataset because we had included many anonymous users without a user_id. The algorithm tried to learn patterns from a flawed dataset.
  • Once you start model training, you cannot stop a solution early, that is, while it is in the “Create in progress” status. This is especially bothersome in development, when you just want to test the workflow and stay agile.
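The "know your dataset" point above can start as simply as a few counters run before every import. A sketch of the kind of checks we mean; the metrics and thresholds you act on are up to you:

```python
from collections import Counter

def interaction_stats(events):
    """Basic sanity statistics over an interactions dataset,
    given as a list of (user_id, item_id) click events."""
    users = Counter(user for user, _ in events)
    pairs = Counter(events)
    n = len(events)
    return {
        "events": n,
        "unique_users": len(users),
        # share of rows that are exact user-item repeats
        "duplicate_rate": sum(c - 1 for c in pairs.values()) / n,
        # share of events from the single heaviest user; a very high
        # value hints at bots or at anonymous traffic sharing one ID
        "top_user_share": max(users.values()) / n,
    }

stats = interaction_stats(
    [("u1", "i1"), ("u1", "i1"), ("u2", "i2"), ("u3", "i1")]
)
```

Running checks like these before the S3 unload is what catches the anonymous-user problem described above before it burns training hours.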

Additional Business Logic with a Lambda Proxy Integration

After we created the endpoint hosted by AWS Personalize with Step Functions, we needed to add business logic on top of the recommendations to avoid a bad user experience caused by, for example, showing products that are out of stock. Image 2 shows the high-level architecture for this business request.

Image 2: High-Level Architecture Solution.

We set up an API Gateway endpoint that is requested by Braze, our customer relationship tool, to get recommendations for a user and send them in an email. A Lambda proxy contains the business logic. The following GitHub gists can serve as a reference for setting up a Lambda proxy integration with the Serverless Framework, together with a simplified version of our Lambda function.

serverless.yml for the Lambda-Proxy Setup in the Serverless Framework.
Python Personalize Campaign Lambda Function.
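In outline, such a proxy handler does three things: read the user from the API Gateway event, fetch recommendations, and apply the stock filter. A self-contained sketch, where the campaign ARN is a placeholder and get_in_stock_skus is a hypothetical stock-lookup helper, not a real API:

```python
import json

def filter_in_stock(recommended_skus, in_stock_skus, limit=4):
    """Business logic: keep only in-stock products, preserving rank."""
    return [sku for sku in recommended_skus if sku in in_stock_skus][:limit]

def handler(event, context):
    """Lambda proxy handler behind API Gateway, called by Braze.
    Requires boto3 and AWS credentials to actually execute."""
    user_id = event["queryStringParameters"]["user_id"]
    import boto3  # imported here so the sketch loads without boto3
    runtime = boto3.client("personalize-runtime")
    response = runtime.get_recommendations(
        campaignArn="CAMPAIGN_ARN_PLACEHOLDER",
        userId=user_id,
        numResults=25,  # over-fetch so filtering still leaves enough items
    )
    skus = [item["itemId"] for item in response["itemList"]]
    skus = filter_in_stock(skus, get_in_stock_skus())  # hypothetical helper
    return {"statusCode": 200, "body": json.dumps({"recommendations": skus})}
```

Over-fetching from Personalize and filtering down is the simple way to guarantee the email template always has enough in-stock products to render.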

Conclusion

Machine learning is more than writing Jupyter notebooks if you want to have an impact on your users, whether through email campaigns or direct customer-facing products in your front end. Perhaps the most serious concern with much work in the machine learning field, especially in research, is that it does not provide detailed insights into the datasets. This lack of transparency makes it difficult to assess trust, causality, transferability, and informativeness. Services like AWS Personalize, SageMaker, or Step Functions help you move away from the burden of maintaining servers and slow development cycles and focus on what matters: “getting dirty with the data”. You should always remember that, despite the impressive results of machine learning techniques, Zhang et al. (2016) observed:

“Deep neural networks easily fit random labels. More precisely, when trained on a completely random labeling of the true data, neural networks achieve 0 training error. The test error, of course, is no better than random chance as there is no correlation between the training labels and the test labels.”

Stay tuned to follow up on our next blog post where we will analyze our dataset.

Dame Sally Markham — Little Britain 2003–2006.


Machine Learning Architect at AMARO. German 🇩🇪 based in São Paulo 🇧🇷. Information Science Master at Humboldt-Universität zu Berlin. Get Dirty with the Data.