Autoencoders Applications in Store Returns Anomaly Detection

Annie Ma
Walmart Global Tech Blog
5 min read · Aug 19, 2021

Store returns in retail

Retail giants that process numerous sales every day across the U.S. also generate a considerable number of merchandise returns, even though returns represent only a small percentage of sales. To further improve the customer shopping and return experience, it is imperative to analyze the underlying customer return patterns, and particularly anomalous returns, using state-of-the-art machine learning techniques.

However, there are a few challenges that need to be overcome when performing anomaly detection on store returns:

  1. As mentioned previously, for large retailers the in-store return volume can be huge because the sales volume is substantial. As a result, store return datasets are extremely large.
  2. It is not feasible to have human experts review every store return and identify return anomalies. Therefore, there are seldom predefined labels for this kind of prediction problem.

As you might already know, supervised learning uses an algorithm to learn the mapping function ƒ(·) from the input x to the output Y:

Y = ƒ(x)

Without the output Y, we have to apply unsupervised or semi-supervised learning approaches to characterize or uncover the underlying distribution and structure of the input x.

3. The data is enormous; however, return anomalies are exceedingly rare. Identifying and investigating return anomalies can help improve merchandise quality, optimize the return process, and, most importantly, deliver a world-class customer experience.

4. There is always seasonality in store sales as well as returns. As such, return anomalies are moving targets and change quickly across regions. To cope with this volatility and uncertainty, an unsupervised or semi-supervised learning approach is superior to supervised learning when it comes to store returns anomaly detection.

Why autoencoders?

An autoencoder is an unsupervised neural network that compresses the input data into a lower-dimensional representation/embedding, and then reconstructs the original input from that representation/embedding.

Specifically, an autoencoder consists of three components:

· Encoder: this part comprises fully connected feedforward neural network layers, which compress the input data into a reduced-dimension latent representation.

· Code (also called the bottleneck or compressed representation): this part captures the important features and representations of the input data, which allow the decoder to recover the input as closely as possible.

· Decoder: this part also consists of fully connected feedforward neural network layers, which reconstruct the output from the latent representation.

An example of an autoencoder architecture

Autoencoders have a symmetric architecture and learn the nonlinear relationships between the input and the embedding. The objective of an autoencoder is to compress the data into latent features and reconstruct the original input as accurately as possible, and it mainly employs one type of loss function:

Mean squared error (MSE), also known as reconstruction error, is defined as the mean squared difference between the input and the output:

MSE = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} (x_{ij} - \hat{x}_{ij})^2

where M is the feature dimension, N is the number of data points in the dataset, x_{ij} is the j-th feature of the i-th input, and \hat{x}_{ij} is its reconstruction.
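
To make the formula concrete, here is a minimal NumPy sketch of the reconstruction error; the arrays x and x_hat are illustrative placeholders for a batch of inputs and their reconstructions, not variables from our pipeline.

```python
import numpy as np

def reconstruction_mse(x: np.ndarray, x_hat: np.ndarray) -> float:
    """Mean squared error over all N data points and M features."""
    return float(np.mean((x - x_hat) ** 2))

def per_return_mse(x: np.ndarray, x_hat: np.ndarray) -> np.ndarray:
    """Per-return reconstruction error, averaged over the M features."""
    return np.mean((x - x_hat) ** 2, axis=1)
```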

We choose autoencoders as the preferred technique for returns anomaly detection because:

· An autoencoder does not require predefined labels as it compresses the input and reconstructs the output through the latent features.

· As a deep learning-based approach, an autoencoder outperforms traditional machine learning methods as datasets grow larger.

· An autoencoder is trained on store returns with common patterns, so it learns the patterns of most return transactions and recognizes their latent features. When a return anomaly comes in, the model is expected to produce a high reconstruction error because the anomaly behaves differently from the majority of returns (see the short sketch after this list).
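
As a rough illustration of how a reconstruction-error threshold could flag anomalous returns, here is a hypothetical sketch; the per-return errors and the top 0.1% cut-off are assumptions for illustration, not our production logic.

```python
import numpy as np

def flag_anomalies(errors: np.ndarray, top_pct: float = 0.1) -> np.ndarray:
    """Flag returns whose reconstruction error falls in the top `top_pct`% of errors."""
    threshold = np.percentile(errors, 100.0 - top_pct)
    return errors >= threshold
```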

Applications in store returns anomaly detection and performance

Data

We trained an autoencoder deep learning model on returns data with common patterns, and we tested the model's performance on a separate out-of-time returns dataset.

Features describe the current return visit, such as the item categories and the time of day of the return, along with other valuable signals such as store- and region-level return patterns.

Model architecture

The autoencoder model we implemented has 6 layers in a simple symmetric design, with layer sizes of 279-128-64-64-128-279, where 279 is the input and output dimension, 128 is the encoding and decoding dimension, and 64 is the hidden/latent dimension.

The activation function is ReLU, and Adam is used as the optimizer. Lastly, the model is trained for 200 epochs.
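
For readers who want to see the architecture in code, below is a sketch of a comparable model in Keras, following the 279-128-64-64-128-279 layout, ReLU activations, the Adam optimizer, and the MSE loss described above; the layer names and the linear output activation are assumptions rather than details from our actual implementation.

```python
from tensorflow.keras import layers, models

input_dim, encoding_dim, latent_dim = 279, 128, 64

# Encoder: 279 -> 128 -> 64
inputs = layers.Input(shape=(input_dim,), name="return_features")
encoded = layers.Dense(encoding_dim, activation="relu")(inputs)
latent = layers.Dense(latent_dim, activation="relu")(encoded)

# Decoder: 64 -> 64 -> 128 -> 279
decoded = layers.Dense(latent_dim, activation="relu")(latent)
decoded = layers.Dense(encoding_dim, activation="relu")(decoded)
outputs = layers.Dense(input_dim, activation="linear")(decoded)

autoencoder = models.Model(inputs, outputs, name="returns_autoencoder")
autoencoder.compile(optimizer="adam", loss="mse")

# Training sketch: x_train would hold store returns with common patterns.
# autoencoder.fit(x_train, x_train, epochs=200, batch_size=256)
```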

Model performance and comparison

We compared the autoencoder model against the gradient boosting machine (GBM) model already in production that supports store returns anomaly detection. Here is how we evaluated the autoencoder model's performance:

  • Ran the test dataset through the autoencoder model as well as the GBM model and recorded the reconstruction error and the GBM model score for each store return input.
  • Based on the autoencoder reconstruction error distribution and the GBM score band, we divided store returns into buckets. For instance, a store return input is classified into the top X% (X = 0.01%, 0.05%, 0.1%, etc.) bucket of autoencoder reconstruction errors (in descending order). Likewise, it is mapped to the top X% (X = 0.01%, 0.05%, 0.1%, etc.) bucket of GBM scores (in descending order).
  • In each of those buckets for the autoencoder model and the GBM model, a domain expert examined each store return thoroughly and decided whether it was an anomaly. We recruited several domain experts to review all test cases.
  • Finally, we calculated the percentage of return anomalies based on the domain experts' decisions, and the percentage of overlap between the top X% bucket for the autoencoder model and the top X% bucket for the GBM model (a bucket/overlap sketch follows this list).
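
The bucket and overlap computation could look roughly like the following sketch; the array names ae_errors and gbm_scores are hypothetical stand-ins for the per-return reconstruction errors and GBM scores on the test set.

```python
import numpy as np

def top_bucket(values: np.ndarray, top_pct: float) -> set:
    """Indices of the returns in the top `top_pct`% of `values` (descending order)."""
    k = max(1, int(round(len(values) * top_pct / 100.0)))
    return set(np.argsort(values)[::-1][:k])

def bucket_overlap(ae_errors: np.ndarray, gbm_scores: np.ndarray, top_pct: float) -> float:
    """Share of returns that land in the top bucket of both models."""
    ae_bucket = top_bucket(ae_errors, top_pct)
    gbm_bucket = top_bucket(gbm_scores, top_pct)
    return len(ae_bucket & gbm_bucket) / len(ae_bucket)

# for pct in (0.01, 0.05, 0.1):
#     print(pct, bucket_overlap(ae_errors, gbm_scores, pct))
```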

Important findings and further steps

  1. In the top buckets (i.e., 0.01%, 0.05%, and 0.1%), where the return anomaly rate is over 80% for both the autoencoder model and the GBM model, there is very little overlap in the store return inputs. This suggests that the autoencoder model can capture different return anomalies than the GBM model does, and thus provides promising incremental lift.
  2. In the top buckets from the autoencoder model, we will investigate the returns and behaviors that were never on our radar until now, derive data and business insights, and produce actionable items.

Conclusion

In summary, autoencoders are especially useful unsupervised deep learning models for anomaly detection, and they can provide promising incremental lift when combined with traditional supervised learning models.
