Pillars of Walmart’s Demand Forecasting

Ramu Malur
Walmart Global Tech Blog
4 min read · Aug 1, 2019


When we started our journey to build a demand forecasting product (a.k.a. Smart Forecasting), we had the unique opportunity to build a system that influences how our Business manages demand across 500 million Store-Item combinations (in US stores alone). Add to this the 11,000+ stores Walmart operates worldwide, across multiple markets & channels.

Generating forecasts is just the start of the journey. The Nirvana state of our platform is forecasts so good that they need no manual intervention (we hope to reach that state soon 🙂).

Until then, we need an application that allows users to adjust the forecasts.

It is not that the forecasts are bad; the models may simply not have considered data points that are not in the system yet, such as:

  • Local weather fluctuations
  • Customer demographics
  • Events near a store
  • Any promotions planned by the business

This is as demanding (pun intended 🙂) an Engineering problem as it is a Machine Learning one.

The success of the platform depends on Business, Product, Engineering & Data Science talent coming together and working in tandem.

Before delving into the technical aspects of the solution (in subsequent posts), I want to highlight how we organize ourselves into different teams and how we focus on excelling in each area individually.

Pillars

In any Data Science product, it is not just the Data Science algorithms that make the product successful; the functions below are equally important:

  • Data Engineering
  • ML Engineering
  • Application Development
  • User Experience
  • Product Management

There is no denying that Data Science Key Performance Indicators (KPIs), accuracy in our case, are the most important needle movers, but we cannot discount the importance of the other areas.

Ex: An awesome algorithm that runs only on a laptop and does not scale, or an algorithm that scales but does not meet the needed SLAs, is good only in theory and can never make it into production.

Data Science

  • The Brain behind the demand we forecast
  • The freedom to develop an algorithm using any library or language is very important.
  • It is important to note that there is no one algorithm that fits all needs
    Ex: No single time series algorithm predicts all time horizons accurately (see the sketch after this list)
  • How the underlying infrastructure enables Data Scientists to experiment (at scale) more frequently is key to agility
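To make the point about horizons concrete, here is a toy sketch of switching model families by forecast lead time. The two models and the 13-week cutoff are illustrative assumptions, not Smart Forecasting's actual algorithm:

```python
import numpy as np

def exponential_smoothing(history: np.ndarray, horizon: int, alpha: float = 0.3) -> np.ndarray:
    """Recency-weighted level carried forward; tracks short horizons well."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return np.full(horizon, level)

def seasonal_naive(history: np.ndarray, horizon: int, season_length: int = 52) -> np.ndarray:
    """Repeat the last full season; often steadier at long horizons.
    Assumes at least one full season of history."""
    season = history[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(season, reps)[:horizon]

def forecast(history: np.ndarray, horizon_weeks: int) -> np.ndarray:
    # Hypothetical cutoff: smooth for the near term, go seasonal beyond it.
    if horizon_weeks <= 13:
        return exponential_smoothing(history, horizon_weeks)
    return seasonal_naive(history, horizon_weeks)
```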

Data Engineering

This is mandatory, especially when we have large-scale, diverse data to deal with.

We have close to 100 TB of data & tens of data sources to tap from day one (and growing every day).

  • Interact with Walmart’s ocean of data spread across various systems, and provision it for better collaboration
  • Foundational guarantees on the robustness of pipelines
  • Workflows are always time-bound, with strict SLAs to downstream systems
  • Guaranteed freshness & quality of data (see the sketch below)

This indeed is the Backbone of the system and is pivotal in ensuring stability & accuracy of forecasts.
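As an illustration of what "guaranteed freshness & quality" can mean in practice, here is a minimal, hypothetical gate a pipeline could run before publishing data downstream. The thresholds and names are assumptions, not our actual checks:

```python
from datetime import datetime, timedelta, timezone

MAX_STALENESS = timedelta(hours=6)  # assumed freshness SLA
MIN_ROW_COUNT = 1_000_000           # assumed sanity floor for a daily load

def quality_gate(last_loaded_at: datetime, row_count: int) -> None:
    """Fail the workflow loudly instead of forwarding stale or thin data."""
    age = datetime.now(timezone.utc) - last_loaded_at
    if age > MAX_STALENESS:
        raise RuntimeError(f"Stale data: last load finished {age} ago")
    if row_count < MIN_ROW_COUNT:
        raise RuntimeError(f"Suspiciously small load: {row_count} rows")
```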

Demand Workbench (REST APIs & UI)

This is the Face of Smart Forecasting. A forecast that is not visible to its end users will not add much value.

This function becomes even more critical when we have 500 million Store-Items to provide visibility into, across both history & future.

  • Get visibility into the various metrics influencing demand, like historical sales, waste, inventory, etc.
  • A metrics space comprising a minimum of 75 billion historical & 25 billion future data points.
  • Personalized to each user, showing what really matters at a given time.
  • Allow users to manage demand by adjusting the forecasts (see the API sketch after this list).
  • These APIs also power some of the critical on-demand descriptive analytics applications in Supply Chain.
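Purely for illustration, the endpoints below sketch what a workbench-style API for viewing and adjusting a forecast could look like. The paths, payload fields, and the choice of FastAPI are my assumptions, not the actual Smart Forecasting API:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Adjustment(BaseModel):
    week: str          # e.g. "2019-W32"
    new_units: float   # the user's overridden demand
    reason: str        # e.g. "local promotion not yet in the data"

@app.get("/forecasts/{store_id}/{item_id}")
def get_forecast(store_id: int, item_id: int):
    """Return history + forecast for one Store-Item (stubbed here)."""
    return {"store_id": store_id, "item_id": item_id, "weeks": []}

@app.post("/forecasts/{store_id}/{item_id}/adjustments")
def adjust_forecast(store_id: int, item_id: int, adjustment: Adjustment):
    """Record a user adjustment so downstream systems pick it up."""
    return {"status": "accepted", "adjustment": adjustment}
```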

MLOps / ML Engineering

This is the Central Nervous System that controls the Brain (ML Algorithms) & Backbone (Data) of our product, taking care of many production aspects of our models.

  • Sits at the confluence of two pillars: Data Science and Data Engineering.
  • Automate the promotion of machine learning models from development to production environments
  • Optimal scheduling of modeling workloads.
  • High availability & failure handling in every part of the pipeline run
  • Build the ability to perform A/B testing of new models (see the sketch after this list)
  • Visibility into how any new feature in ML/Data affects the overall KPIs of the project (Accuracy, Bias, etc.)
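As a hypothetical sketch of A/B testing models: assign each Store-Item deterministically to a champion or challenger arm, so KPIs like accuracy and bias can be compared on stable populations. The split and hashing scheme are assumptions:

```python
import hashlib

CHALLENGER_SHARE = 0.10  # assumed: 10% of Store-Items try the new model

def assign_variant(store_id: int, item_id: int) -> str:
    """Stable hash-based assignment: the same key always lands in the same arm."""
    key = f"{store_id}:{item_id}".encode()
    bucket = int(hashlib.md5(key).hexdigest(), 16) % 100
    return "challenger" if bucket < CHALLENGER_SHARE * 100 else "champion"
```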

DevOps

They instill a blend of Service Engineering & Release Engineering best practices into our engineering teams.

  • Ensure the Infrastructure is up, running & monitored 24x7
  • High-priority alerts are routed to the appropriate teams (a toy routing rule follows this list)
  • Ensure we have proper quality checks in our build automation
  • Every part of the product is continuously tested at different levels, so that every commit meets the needed quality bar
  • Not just ensure we can develop & deploy features fast, but also keep the Mean Time To Recover (MTTR) low when failures happen
  • Look out for constant cost optimizations, especially on public cloud.
  • Infrastructure built around agreed measures for Recovery Time Objective (RTO) & Recovery Point Objective (RPO).
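And, purely as a toy example of routing high-priority alerts, a rule like the one below maps severity to a notification target; the thresholds and targets are invented for illustration:

```python
def route_alert(service: str, error_rate: float) -> str:
    """Map an observed error rate to a notification target."""
    if error_rate >= 0.05:
        return f"page the on-call for {service}"        # high priority
    if error_rate >= 0.01:
        return f"post to the {service} alerts channel"  # medium priority
    return "log only"                                   # low priority
```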

Product Management

Product Management is where it all comes together.

  • Always ensure that we keep our Business at the center of everything we do in the product
  • Work with the Business to identify the need of the hour
  • Ensure every feature added to the product provides incremental value
  • Voice of the Business & Champions of the product
  • Seek constant feedback, both from Business & other pillars

I hope you now have a better understanding of the teams that constitute the basic building blocks of a successful ML product.

Through this post & the series of others I intend to write, I will highlight the key technical concepts that went a long way toward making the product successful.
