Forecasting Weekly Department Sales using DeepAR in AWS

Evan Schaeffer
Slalom Data & AI
10 min read · Jun 19, 2020

Automated Forecast Generation — by Evan Schaeffer, Xiaona Hu, and Matt Collins

Knowing what you need and when you need it would be a great superpower. Imagine having the foresight to stock up on toilet paper before a pandemic. While predicting when hand sanitizer will become an official form of currency is nearly impossible, far larger and more important challenges are already being solved with the help of forecasting.

Forecasting has been used all over the world to combat food shortages, provide pharmaceuticals to hospitals and pharmacies, and manage the manufacturing of essential goods. Retail companies use forecasting to keep just enough supply on hand for the customers who will purchase it. Airlines use forecasting to plan their routes and keep the population mobile. The list goes on and on.

Each application of forecasting comes with its own challenges; two of the most common are:

  1. Creating individual forecast models for every item, location, or situation is a massive undertaking
  2. Having the compute power to generate forecasts at a massive scale can be limiting and expensive

To take on these challenges, we will walk you through an end-to-end solution leveraging Amazon Web Services (AWS) to train and deploy a forecasting model in the cloud.

Amazon SageMaker provides a large catalog of pre-built machine learning algorithms that make model training and deployment a breeze. Some initial configuration is required to make sure that your data conforms to the proper format, but once that is done, these algorithms handle most of the remaining work.

Results from this demo will be generated by DeepAR, one of AWS’s machine learning algorithms for time-series forecasting.

SageMaker DeepAR

Before we get into the architecture and output of our solution, knowing the benefits and drawbacks of using DeepAR will help you decide if this is the right solution for your business case. Check out Failing Fast with DeepAR Neural Networks for Time-Series if you’re considering the time and resource investment involved.

The most popular forecasting methods that data scientists typically reach for are ARIMA models (Autoregressive Integrated Moving Average) and ETS models (Error, Trend, Seasonality). These methods are great for individual forecasts because they allow for a lot of fine-tuning and experimentation to build a strong model.

However, when you want to create 100 different forecasts, DeepAR generally outperforms these methods and is much easier to implement. DeepAR can learn from other related time series: every series feeds one model, rather than 100 different models.

Below are additional benefits and drawbacks of using DeepAR that we observed while developing our forecasting solution.

Benefits

Automation
• Forecast thousands of separate entities using one process
• Model training, tuning, and selection

What-If Analysis
• Change variables during prediction
• Simulate the effect without needing to retrain

Brand New Time Series
• Make new forecasts using only feature variables, no historical target data required
• Note: You can’t include a feature set that the model hasn’t seen before. If new variables are being introduced, use a trial period to collect target data first

Drawbacks

Data Format
• Data must be formatted in a particular way. Learn more about the required format here: https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html#deepar-inputoutput
• Categorical variables must be integer-encoded, which requires remapping after prediction

ETL Efficiencies
• DeepAR requires the full set of data when making new predictions
• As the dataset grows, the formatting and loading steps need built-in efficiencies to stay manageable

Model Performance
• All entities are tuned together
• No model tuning for individual time series
• Overall performance is generally good, but some individual entities may suffer

Generating a DeepAR model in SageMaker is a three-step process.

Format Data

The data used for this demo represents weekly retail sales for 45 different stores with varying numbers of departments, totaling 2,660 individual time series. There are additional dynamic features and categorical features that feed our model as input variables. This dataset is a flat file with one row for every weekly observation.

In order to train a DeepAR model on this data, we must convert it from flat format to JSON Lines. This process creates an object for every store-department combination, encodes categorical features as 0 to N-1 (where N is the number of levels of that feature), and tags each object with the appropriate categorical and dynamic features. These JSON objects are output as three separate files: training, testing, and validation. Training and testing will be used by DeepAR. Validation was set aside for the purposes of this demo.
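As a rough sketch of this conversion, the snippet below groups a flat pandas DataFrame into one JSON Lines object per store-department combination in the format DeepAR expects. The column names (week_start, store, dept, weekly_sales, is_holiday) and file names are illustrative assumptions, and the train/test/validation split is omitted for brevity.

```python
import json

import pandas as pd

# Hypothetical flat file: one row per store, department, and week.
df = pd.read_csv("weekly_sales.csv", parse_dates=["week_start"])

# Encode each categorical feature as integers 0..N-1, as DeepAR requires.
df["store_cat"] = df["store"].astype("category").cat.codes
df["dept_cat"] = df["dept"].astype("category").cat.codes

with open("train.json", "w") as f:
    for _, grp in df.sort_values("week_start").groupby(["store", "dept"]):
        obj = {
            # Timestamp of the first observation in this series
            "start": grp["week_start"].iloc[0].strftime("%Y-%m-%d %H:%M:%S"),
            # The weekly sales values to forecast
            "target": grp["weekly_sales"].tolist(),
            # Integer-encoded categorical features (store, department)
            "cat": [int(grp["store_cat"].iloc[0]), int(grp["dept_cat"].iloc[0])],
            # Dynamic features must cover every time step of the target
            "dynamic_feat": [grp["is_holiday"].astype(int).tolist()],
        }
        # DeepAR expects JSON Lines: one object per line
        f.write(json.dumps(obj) + "\n")
```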

Hyperparameter Tuning Job

Once the JSON files are loaded into Amazon S3, we are ready to start training the DeepAR model. This can be done either with pre-specified hyperparameters or by running a hyperparameter tuning job. Hyperparameter tuning will be more expensive, as it trains the model multiple times in hopes of producing a more accurate result, but is very often worth it.

Hyperparameter tuning is accomplished in SageMaker by specifying static values for unchanging hyperparameters (such as the time series frequency or the prediction length) and ranges of values for tunable hyperparameters (such as learning rate or epochs). These values and some basic configuration options become the definition of your tuning job. The time this process takes will vary depending on the compute size of your training instances, the number of iterations specified in the tuning job, and the size of your dataset. For reference, 20 iterations on 3 ml.c4.2xlarge instances with our sample dataset took 3 hours.
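For illustration, a tuning job along these lines can be defined with the SageMaker Python SDK. The role ARN, S3 paths, prediction length, and hyperparameter ranges below are placeholders rather than the exact settings we used:

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Built-in DeepAR training image for the current region
image = image_uris.retrieve("forecasting-deepar", session.boto_region_name)

estimator = Estimator(
    image_uri=image,
    role=role,
    instance_count=3,
    instance_type="ml.c4.2xlarge",
    output_path="s3://my-bucket/deepar/output",  # placeholder bucket
    sagemaker_session=session,
)

# Static hyperparameters stay fixed for every training iteration.
estimator.set_hyperparameters(time_freq="W", prediction_length=12, context_length=12)

# Ranges the tuning job is allowed to explore.
hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(1e-4, 1e-1),
    "epochs": IntegerParameter(10, 100),
    "num_cells": IntegerParameter(30, 100),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="test:RMSE",
    objective_type="Minimize",
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,          # number of iterations
    max_parallel_jobs=3,  # concurrent training instances
)

# Train and test channels point at the JSON Lines files in S3.
tuner.fit({"train": "s3://my-bucket/deepar/train/", "test": "s3://my-bucket/deepar/test/"})
```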

The number of iterations determines your coverage of potential hyperparameter values. One way to home in on optimal hyperparameters without increasing the total runtime is to tune on a random subset of stores while increasing the number of iterations.

Export Best Model

When hyperparameter tuning is complete, SageMaker will return every iteration as its own potential model. From these results we can collect the best-performing iteration and create a SageMaker Model for later inference.
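A minimal sketch of that step with boto3, assuming a hypothetical tuning job name, model name, and role ARN, might look like this:

```python
import boto3

sm = boto3.client("sagemaker")

# Find the best-performing iteration from the completed tuning job.
tuning = sm.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName="deepar-weekly-sales-tuning"  # hypothetical name
)
best_job_name = tuning["BestTrainingJob"]["TrainingJobName"]
best_job = sm.describe_training_job(TrainingJobName=best_job_name)

# Register the winning artifacts as a SageMaker Model for batch transform later.
sm.create_model(
    ModelName="deepar-weekly-sales",  # hypothetical name
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    PrimaryContainer={
        "Image": best_job["AlgorithmSpecification"]["TrainingImage"],
        "ModelDataUrl": best_job["ModelArtifacts"]["S3ModelArtifacts"],
    },
)
```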

Architecture Overview

Services used to develop our forecasting pipeline

Now that we have a working model, the next step is model deployment: integrating the trained model into a workflow that generates predictions for the business. To do that, the main AWS services we use are S3, Lambda, Glue, and Athena. The Model Deployment flow in the above architecture diagram shows how this process will run once it is set up and automated.

S3 is our primary storage service, where we store all of the historical data that we want to forecast on. As updates to the dataset are collected over time, they can easily be uploaded to the source bucket for this flow, and the rest of the process runs fully automatically.

We use Lambda to run a batch transform job that generates predictions and post-processes the results into a format we can easily consume. The Lambda function is triggered when new data is received in an S3 bucket. Here, the assumption is that in a production environment there will be another ETL process that writes clean data into this bucket on a weekly basis. Once the Lambda is triggered and runs, it writes prediction results into CSV files in S3, which are then accessed by Glue and Athena. Finally, Tableau reads the data from Athena and shows the predictions in a dashboard.

Lambda

AWS Lambda is one of the most widely used AWS services. It is an event-driven, serverless computing platform that lets you run code without provisioning or managing servers. A function only runs when triggered by predefined events and scales automatically, and you only pay for the time your code is running. The service supports code written in several languages, including Java and Python.

This diagram shows how AWS Lambda works in our workflow: an object (our prediction input data) is created in our S3 bucket. Once S3 detects the object-created event, it triggers the Lambda function and executes the code with the creation event as a parameter. For S3 to trigger Lambda, we set up the permissions between our S3 input bucket and the Lambda function using our IAM role.
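The same trigger and permissions can also be wired up programmatically; the sketch below is one way to do it, with hypothetical bucket and function names:

```python
import boto3

lambda_client = boto3.client("lambda")
s3 = boto3.client("s3")

# Allow S3 to invoke the function.
lambda_client.add_permission(
    FunctionName="deepar-batch-predict",          # hypothetical function name
    StatementId="AllowS3Invoke",
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn="arn:aws:s3:::weekly-sales-input",  # hypothetical bucket
)

# Invoke the function whenever new prediction input lands in the bucket.
s3.put_bucket_notification_configuration(
    Bucket="weekly-sales-input",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:deepar-batch-predict",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```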

Inside the console for AWS Lambda, the designer shows the components of the service. The core components are the function and the trigger. A trigger is a service or resource that you have configured to invoke your function. Here, our trigger is configured to look for prediction data created in our S3 bucket. One Lambda function can support multiple triggers, so we can easily add another one here.

This is the Lambda function that creates our predictions and cleans the output. A function consists of the code and the runtime that processes events. The runtime sets the language for the code; in our case, it is Python. The handler is the method that Lambda runs when your function is invoked. Its name has two components: the first is the file name and the second is the function name inside that file. Our handler is named lambda_function.lambda_handler. Lambda passes the triggering event and the runtime context to the handler when the function is invoked.
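A stripped-down sketch of what such a handler might look like is shown below. The bucket names, instance type, and job naming are illustrative assumptions, and the pandas/NumPy post-processing of the results is omitted:

```python
import os

import boto3

sm = boto3.client("sagemaker")


def lambda_handler(event, context):
    """Triggered by S3 when new prediction input arrives; starts a DeepAR batch transform."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    sm.create_transform_job(
        TransformJobName="deepar-weekly-" + context.aws_request_id[:8],
        ModelName=os.environ["model_name"],  # DeepAR model name passed in as an env var
        TransformInput={
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": f"s3://{bucket}/{key}"}
            },
            "ContentType": "application/jsonlines",
            "SplitType": "Line",
        },
        TransformOutput={"S3OutputPath": "s3://weekly-sales-output/predictions/"},  # hypothetical
        TransformResources={"InstanceType": "ml.m5.large", "InstanceCount": 1},
    )

    # Post-processing of the raw quantile output into clean CSVs would follow here.
    return {"statusCode": 200}
```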

There are different ways to supply your code. You can write code directly in the editor if you don’t need any libraries. If you need libraries and dependencies other than the AWS SDK, or you use the Lambda API, you need to create a deployment package, which is a ZIP archive that contains the function code and its dependencies. In our case, because we needed pandas and NumPy to process the output, we created a deployment package and uploaded it directly. Another option is to upload the package to S3 first and then point the Lambda function to it there.

AWS-recommended best practices are to separate the Lambda handler from your core logic and to use environment variables to pass operational parameters to your function. In our demo, we define a model_name environment variable to pass the name of our DeepAR model. In the future, if we have a better-performing model that we’d like to use, we can simply change the environment variable without editing the function script.

You can also configure layers and destinations for your Lambda function, though neither is used in our demo. Layers pull in additional libraries and dependencies; a layer is a ZIP archive that contains those libraries or other dependencies. With layers, you can efficiently reuse external libraries across functions without including them in your deployment package every time. A destination is an AWS resource that receives details about a function’s invocation results, such as event details, invocation status, and the invocation response.

Glue and Athena

Once our Lambda function finishes running and writes cleaned prediction results back to S3, Glue and Athena access that data and prepare it for further analysis.

AWS Glue is a fully managed ETL service. We set up a Glue Crawler to automatically extract the table schema of our prediction results and store that metadata in the Glue Data Catalog. The Data Catalog is natively integrated with Amazon Athena, an interactive query service, so Athena can conveniently access the prediction results in S3 through the Data Catalog. At this point, we have a table in Athena that we can query with SQL to view the predictions. We could stop here, or, as presented in the next section, connect Athena to Tableau and build a live dashboard for visualization.
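For illustration, once the crawler has populated the Data Catalog, the predictions can be queried through Athena with boto3. The database, table, and column names below are hypothetical:

```python
import boto3

athena = boto3.client("athena")

# Run a SQL query against the table the Glue Crawler created.
response = athena.start_query_execution(
    QueryString="""
        SELECT store, dept, week_start, p50, p90
        FROM deepar_predictions
        ORDER BY store, dept, week_start
    """,
    QueryExecutionContext={"Database": "forecasts"},  # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://weekly-sales-output/athena-results/"},
)
print(response["QueryExecutionId"])
```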

Live Forecasting Dashboard

Many data visualization tools offer live connections to AWS services. We’ve decided to use Tableau to build a live forecasting dashboard that visualizes the predictions from our DeepAR model once they are available in Amazon Athena. Establishing this connection is a simple process; follow the steps provided by Tableau here.

Once connected, you can easily build any visualization that suits your needs. The dashboard below shows new predictions in blue for both stores and departments.

Historical data in gray, DeepAR forecast in blue

Given that this is a live connection, as soon as updated store data lands in S3, the model and subsequent ETL processes are triggered and the resulting predictions automatically appear in Tableau.

While this process demonstrates a weekly batch inference job, the pipeline could also support more frequent prediction intervals; the primary constraint is the time required for model inference and post-inference data processing.

Concluding Thoughts

The AWS suite offers every service required for quick and easy forecasting at a large scale. Amazon’s pre-built algorithms and deployment services require little configuration to create an end-to-end pipeline that is fully automated. There can be a bit of a learning curve when using SageMaker models, since the documentation, although comprehensive, can be difficult to search through. All things considered, we found this combination of S3, SageMaker (DeepAR), Lambda, Glue, and Athena to be a great solution to a problem faced countless times. If your forecasting problem involves generating predictions for many different entities, we recommend considering this approach as one of your best options.
