Predicting the daily number of e-commerce orders for logistic success

Published in Decathlon Digital · 8 min read · Jan 24, 2024

What if it were possible to accurately forecast e-commerce order volumes to optimize warehouse staff planning? Good news, it’s possible thanks to our new e-commerce order forecast solution! 🙌 🔥 Find out in this article how this new algorithm guides you in your decision-making, planning, and resource allocation for short-, medium-, and long-term logistical success.

Estimating the number of e-commerce orders in advance is a critical task for logistics

In order to enable e-commerce logistics, piloting experts need to accurately predict how many resources (people, space, equipment) to allocate on a daily basis for the picking, packing, and delivery phases to effectively meet the demand of sports enthusiasts. Each country has its own piloting expert and its own process to manually estimate the number of orders placed on its e-commerce sites. While they put a lot of effort into this, it is an extremely difficult task to do given the highly uncertain times that the e-commerce industry is going through.

Image 1 illustrates the outcomes resulting from various forecasting scenarios

⚠️ Until now, manual e-commerce order forecasts have been too far off the actual number of orders placed by our customers over a given period. These large deviations lead to poorly adapted staffing in logistics as well as additional internal and external transportation costs, which directly affect the customer promise.

Data Science solution

This article presents an innovative Top-Down forecasting methodology tailored to multi-country e-commerce orders. The approach combines SARIMAX with Fourier terms for short-term accuracy and integrates TBATS for robust long-term strategic planning. The distinctive feature of this methodology is the “Top” step, which aggregates the time series of all warehouses in a country into a single country-level time series by summing them. After training our different models on the aggregated time series of each country, the process concludes with a “Down” disaggregation: we calculate the weight of each warehouse within its country and derive the warehouse forecast by multiplying that weight by the country-level forecast.

These weights are calculated by summing the number of e-commerce orders for each warehouse over a 28-day horizon and then dividing by the total number of orders for the entire country during the same period.
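The weighting and disaggregation steps can be sketched in a few lines of Python. This is a minimal illustration rather than the production code: the function names, the warehouse names, and the order counts are hypothetical, and the 28-day window is taken over the most recent history.

```python
def warehouse_weights(history, window=28):
    """Share of the country's orders handled by each warehouse.

    history maps a warehouse name to its daily order counts (oldest first).
    Only the last `window` days are used.
    """
    totals = {name: sum(series[-window:]) for name, series in history.items()}
    country_total = sum(totals.values())
    if country_total == 0:
        raise ValueError("no orders in the window; weights are undefined")
    return {name: total / country_total for name, total in totals.items()}


def disaggregate(country_forecast, weights):
    """Split a country-level daily forecast into per-warehouse forecasts."""
    return {
        name: [share * day for day in country_forecast]
        for name, share in weights.items()
    }


# Hypothetical data: two warehouses with 28 days of history each.
history = {"warehouse_a": [300] * 28, "warehouse_b": [100] * 28}
weights = warehouse_weights(history)

# Hypothetical country-level forecast for the next two days.
warehouse_forecast = disaggregate([400, 440], weights)
```

Because the weights sum to 1, the per-warehouse forecasts always add back up to the country-level forecast.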

Image 2 illustrates the Top-Down forecasting methodology

This approach has the advantage of solving the problem known as “cold start”, i.e. predicting the number of e-commerce orders for new warehouses that don’t yet have a substantial track record. Because it relies on data aggregated at the country level, the top-down approach can still produce accurate, strategically useful forecasts for these new warehouses.

Modeling

Metrics

The modeling phase typically commences by establishing a metric for comparing existing solutions or a basic baseline against the developed algorithms. There are numerous prediction metrics available in the literature, and for our purposes, we choose two different metrics: the Weighted Average Percentage Error (WAPE), a data science metric, and the Mean Absolute Percentage Error (MAPE), which is more tailored to the business side.

These metrics provide comprehensive insights into the performance of our developed algorithms, addressing both technical and business-oriented considerations.
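The article does not spell out the formulas, so the sketch below uses the common definitions as an assumption: WAPE divides the sum of absolute errors by the sum of actual values, while MAPE averages the per-day percentage errors. The function names and the sample numbers are illustrative only.

```python
def wape(actual, forecast):
    """Sum of absolute errors divided by the sum of absolute actual values."""
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total_actual


def mape(actual, forecast):
    """Mean of the per-day absolute percentage errors."""
    if not actual or any(a == 0 for a in actual):
        raise ValueError("MAPE needs non-empty, non-zero actual values")
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical daily order counts and forecasts.
actual = [100, 200, 50]
forecast = [110, 180, 60]
wape_value = wape(actual, forecast)  # 40 / 350
mape_value = mape(actual, forecast)  # (0.10 + 0.10 + 0.20) / 3
```

With these definitions, WAPE weights each day by its volume, whereas MAPE gives a low-volume day the same influence as a peak day, which is why the two metrics can rank models differently.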

Models

First, we implemented the business statistical model (Baseline), which is based on a growth rate measuring the evolution of the number of e-commerce orders over a specific period. The forecast for a given date is obtained by multiplying the number of orders on the corresponding day of the previous year (n-1) by the growth rate computed from the orders of the last 4 weeks.

With:

(y_date)n: the forecast number of orders for a given date in the current year n
(y_date)n-1: the number of orders on the corresponding day of the previous year n-1
(y_j)n-1: the number of orders on day j of the previous year n-1, for the days j of the 4-week period used to compute the growth rate
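Since the formula itself is not reproduced in the text, the following sketch reconstructs the baseline from the prose description only. It assumes that the growth rate is the ratio between the orders of the last 28 days and the orders of the same 28 days one year earlier. All names and numbers are hypothetical.

```python
def growth_rate(last_28_days, same_28_days_last_year):
    """Ratio of recent orders (year n) to the same window one year earlier (n-1)."""
    previous_total = sum(same_28_days_last_year)
    if previous_total == 0:
        raise ValueError("no orders in the previous-year window")
    return sum(last_28_days) / previous_total


def baseline_forecast(previous_year_orders, rate):
    """Scale each previous-year daily value by the (assumed constant) growth rate."""
    return [rate * orders for orders in previous_year_orders]


# Hypothetical data: orders grew by 20% compared with last year.
rate = growth_rate([120] * 28, [100] * 28)

# Orders observed last year on the two days we want to forecast.
baseline = baseline_forecast([200, 250], rate)
```

The sketch makes the two weaknesses of the baseline visible: it cannot run without the previous-year window, and a single rate is applied to every day of the forecast horizon.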

However, this model has not delivered good results, particularly in countries with strong variation in online sales. Calculating the growth rate requires at least one year of historical data, which is not available for certain Decathlon warehouses, and we have to assume that the growth rate calculated over the last 28 days remains constant over the period we want to predict.

In what follows, we’ll compare this approach with the combination of the SARIMAX and TBATS models.

SARIMAX with Fourier Terms

The Seasonal AutoRegressive Integrated Moving Average with eXogenous variables (SARIMAX) model is an extension of the traditional SARIMA model that incorporates exogenous variables to improve forecasting accuracy.

Autoregression (AR): refers to a model in which a changing variable is regressed on its own lagged, or prior, values.
Integrated (I): represents the differencing of raw observations to allow the time series to become stationary.
Moving Average (MA): incorporates the dependency between an observation and the residual errors of a moving average model applied to lagged observations.

SARIMAX has proven to provide state-of-the-art solutions, particularly for time series data with seasonality and external factors that may influence the variable of interest such as:

Time index,
Week number,
Month,
Public holidays,
Binary feature for lockdown periods,
Black Friday…
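As an illustration of how such features can be derived from a date, here is a minimal sketch using only the Python standard library. The holiday and lockdown dates are illustrative placeholders, the feature names are hypothetical, and Black Friday is assumed to be the day after the fourth Thursday of November.

```python
from datetime import date, timedelta


def black_friday(year):
    """Day after the fourth Thursday of November (assumed definition)."""
    nov1 = date(year, 11, 1)
    # weekday(): Monday is 0, so Thursday is 3.
    first_thursday = nov1 + timedelta(days=(3 - nov1.weekday()) % 7)
    return first_thursday + timedelta(days=22)


def exogenous_row(day, series_start, holidays, lockdowns):
    """Build the exogenous features for a single day."""
    return {
        "time_index": (day - series_start).days,
        "week_number": day.isocalendar()[1],
        "month": day.month,
        "is_public_holiday": int(day in holidays),
        "is_lockdown": int(any(first <= day <= last for first, last in lockdowns)),
        "is_black_friday": int(day == black_friday(day.year)),
    }


# Illustrative placeholder calendars, not the ones used in the project.
holidays = {date(2022, 11, 1)}
lockdowns = [(date(2020, 3, 17), date(2020, 5, 10))]

row = exogenous_row(date(2022, 11, 25), date(2022, 1, 1), holidays, lockdowns)
```

Each day of the training history and of the forecast horizon gets one such row, so all of these features must be known in advance for future dates.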

Using the auto_arima() function from the pmdarima package, we transform the historical data through differencing (I) to achieve stationarity and automatically select the optimal values for the ARIMA parameters based on the Akaike Information Criterion (AIC).

Yet SARIMAX has a notable limitation: it can only accommodate a single, predefined seasonality, which in our case was the weekly seasonality. To capture the annual seasonality of our data as well, we added Fourier terms with annual frequency as exogenous variables. Fourier terms are a mathematical representation of periodic functions as sums of sine and cosine functions.
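A minimal sketch of how annual Fourier terms can be generated for daily data is shown below. The period of 365.25 days and the number of harmonics (order=2) are assumptions made for illustration; the article does not state which values were used.

```python
import math


def fourier_terms(time_index, period=365.25, order=2):
    """Sine/cosine pairs for harmonics 1..order of the given period."""
    features = {}
    for k in range(1, order + 1):
        angle = 2 * math.pi * k * time_index / period
        features[f"sin_{k}"] = math.sin(angle)
        features[f"cos_{k}"] = math.cos(angle)
    return features


# One row of Fourier features per day of a 56-day horizon.
fourier_features = [fourier_terms(t) for t in range(56)]
```

These columns are then added to the other exogenous variables, so the model keeps its weekly seasonal component while the regression part absorbs the smooth yearly pattern. A higher order captures sharper annual shapes at the cost of more parameters.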

TBATS Model

TBATS (Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components) is a time series forecasting model designed to handle complex and diverse patterns in time series. It was introduced as a flexible and robust approach, specifically designed to effectively capture multiple seasonality, long-term trends, and various irregular patterns.

Because the SARIMAX model showed limitations in long-term forecasting while the TBATS model proved remarkably effective there, we adopted a hybrid approach. Specifically, the SARIMAX model is used for the first two weeks and the TBATS model for the subsequent weeks (from the third to the eighth). This combination leverages the strengths of each model to enhance overall forecast accuracy.
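The hand-off between the two models can be sketched as a simple stitching of two forecast vectors. The function name and the constant placeholder forecasts are hypothetical; in practice the inputs would be the outputs of the fitted SARIMAX and TBATS models over the same 56-day horizon.

```python
def hybrid_forecast(short_term, long_term, switch_day=14):
    """Use the short-term model before switch_day and the long-term model after."""
    if len(short_term) < switch_day:
        raise ValueError("short-term forecast must cover the first switch_day days")
    if len(long_term) <= switch_day:
        raise ValueError("long-term forecast must extend beyond switch_day")
    return list(short_term[:switch_day]) + list(long_term[switch_day:])


# Placeholder forecasts standing in for the two models' 56-day outputs.
sarimax_like = [100.0] * 56
tbats_like = [120.0] * 56

combined = hybrid_forecast(sarimax_like, tbats_like)
```

Days 1 to 14 come from the short-term model and days 15 to 56 from the long-term model, so the combined forecast keeps the full 8-week length.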

Figure 1: Example of a time series where the predictions are generated using the SARIMAX model. WAPE = 30.79%

Figure 2: Example of the same time series where the predictions are generated using the hybrid approach (SARIMAX + TBATS). WAPE = 7.94%

Here, a MAPE was calculated separately for each country (France, Spain, and Italy) for different periods of the year. This allows us to compare our machine-learning solution against the business solution (manual forecasting for the countries).

Table 1 shows the global comparison between the two methods based on the first 7 days

The 7-day MAPE comparison (manual vs AI prediction) for all three countries over the periods analyzed for the 2022 prototype shows that, on average, we were able to decrease Italy’s 7-day MAPE by 27.82%. For Spain, the MAPE decreased by 12.56%, while for France, which had the best manual predictions of the three countries, we were able to match the manual forecast (from 8.15% to 7.9%).

Table 2 shows the global comparison between the two methods based on the whole 56 days (8 weeks) during the same periods as Table 1

According to the value analysis conducted with the logistics teams, this has the potential to result in cost savings and a significant reduction in carbon footprint for the three countries combined, thanks to a better allocation of human resources, transportation, and contact center costs.

Technical architecture

Describing industrialization is a complex task beyond the scope of this article. Instead, we provide an overview of the solution architecture, starting with the data sources in Decathlon’s data lake and ending with the predictions being exposed in the Glue catalog.

The final step involves using an SQL endpoint to display the predictions on a Tableau dashboard. We emphasize the importance of data management to ensure the reliability and accessibility of information, and model management to ensure the proper functioning and evolution of AI models.

Security measures are in place to ensure data integrity and confidentiality. Additional insights into the industrialization of this solution will be explored in a future post.

Exposition

The predictions are exposed via a Tableau dashboard that allows users to:

Access predictions of the daily number of e-commerce orders received for each country over a 9-week horizon.
Get an overview of the distribution of activity linked to the preparation of these future orders by the logistics warehouse.
Monitor, after the fact, the quality and error rate of these predictions.

Check out this review by one of the users:

“The new automatic forecast is a precious and highly desired tool for online logistics. In fact, thanks to it we are able to predict and better manage our resources dedicated to online activities, as well as to offer a better service for preparing our orders and satisfying our customers even better.” (LOG Digital Project Leader)

Conclusion

While the solution already improves the metric for Spain and Italy, it could still be improved. Currently, it contains few external regressors (variables for prediction) to keep it simple and more scalable to other countries. For this reason, it doesn’t include weather data, one-off commercial campaigns, or similar signals, so there is still room for improvement. Potential future releases could include forecasts for bulky items, long-term forecasts (6–12 months), and so on.

In an upcoming blog post, we will share how we deployed this model at scale. Stay tuned!

A very warm thank you to the Decathlon AI-Lab and united e-commerce project team.