Machine Learning for Inventory Optimization

by Larry Snyder

The Opex Analytics Blog
6 min read · Mar 6, 2018


There is currently a lot of buzz about using machine learning (ML) techniques for predicting the future state of a supply chain (demand forecasting being the most popular use case). ML algorithms predict future behavior based on past occurrences and their associated environment. In this blog post, we aim to start a new kind of buzz by talking about using ML for prescribing how supply chains should operate (in order to achieve an optimal state). In short, we’ll use ML for prescriptive, rather than predictive, analytics.

We will use a case to highlight the application of ML for supply chain optimization. Imagine a supermarket that sells a certain brand of potato chips called Super Crispies. The demand for Super Crispies on a given day depends on a lot of factors, like the day of the week (demand is higher near the weekend), weather (demand is higher on hotter, sunnier days), and even macroeconomic factors like the stock market (Super Crispies are a high-end item, so demand is higher when the market is doing well). These are called features.
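To make this concrete, here’s a hypothetical sketch (in Python) of what a single day’s record might look like; the field names and values are purely illustrative, not taken from the actual data:

```python
# One day's record: the features describe the selling environment,
# and the demand is the quantity we ultimately want to plan for.
# (Hypothetical field names and values, for illustration only.)
day = {
    "day_of_week": "Thursday",
    "high_temp_f": 82,           # degrees Fahrenheit
    "weather": "sun",
    "market_change_pct": -0.7,   # yesterday's stock market change
    "upcoming_holiday": False,
    "promotion": True,
    "demand_cases": 11,          # observed demand (in cases) that day
}
```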

The supermarket has kept terrific data on historical sales of Super Crispies, including not only the demand, but also the values of the features, on each day. Here’s a snippet of the database:

Each case of unsold Super Crispies costs the supermarket $0.05 per day in holding costs, and each case of demand for Super Crispies that cannot be met because the supermarket has run out of inventory costs $0.70 in stockout costs.

Let’s say today is a Thursday with a high temperature of 80–84 degrees and sun, yesterday the stock market was down 0.5–1%, there’s no upcoming holiday, and there is a promotion currently running.

So how many cases of Super Crispies should the supermarket hold in inventory?

If we knew the probability distribution for the demand on a day with these particular values of the features — for example, a normal distribution with a mean of 9.4 and a standard deviation of 2.1 — then we could simply solve the newsvendor problem to determine the optimal safety-stock level.
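As a quick sketch of that calculation (in Python, using SciPy): the newsvendor solution stocks up to the critical-ratio quantile of the demand distribution. The numbers below plug in the hypothetical normal distribution just mentioned:

```python
from scipy.stats import norm

h, p = 0.05, 0.70      # holding and stockout cost per case, per day
mu, sigma = 9.4, 2.1   # hypothetical normal demand parameters from above

critical_ratio = p / (p + h)    # ≈ 0.933: fraction of demand to cover
z = norm.ppf(critical_ratio)    # ≈ 1.50 (standard normal quantile)
order_up_to = mu + z * sigma    # ≈ 12.6 cases
safety_stock = z * sigma        # ≈ 3.2 cases above mean demand
```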

Unfortunately, we don’t know this probability distribution. One way around this is to fit a distribution to the historical demands for days with the same features as today. For example, here is a histogram of demands on days with 80–84 degrees, sun, and so on:

The demands appear to follow a lognormal distribution; the best fit has parameters 𝜇 = 2.41 and 𝜎 = 0.29. This distribution is drawn on the plot, as well. Using this information, the newsvendor problem then tells us that the optimal safety stock level is 5.7 cases, for which the supermarket can expect to incur an average holding and stockout cost of $0.41 per day.
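The same logic carries over to the lognormal fit. Here is a minimal sketch of the calculation, which lands close to the 5.7 cases of safety stock quoted above:

```python
import numpy as np
from scipy.stats import norm

h, p = 0.05, 0.70        # holding and stockout cost per case, per day
mu, sigma = 2.41, 0.29   # fitted lognormal parameters from above

critical_ratio = p / (p + h)                # ≈ 0.933
z = norm.ppf(critical_ratio)                # ≈ 1.50
order_up_to = np.exp(mu + sigma * z)        # lognormal quantile: ≈ 17.2 cases
mean_demand = np.exp(mu + sigma**2 / 2)     # lognormal mean: ≈ 11.6 cases
safety_stock = order_up_to - mean_demand    # ≈ 5.6 cases
```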

However, the lognormal distribution is not a great fit. This is not surprising, given that we have only 14 data points matching these values of the features.

Nevertheless, this is an intuitive and reasonable way to solve this problem. We call it separated estimation and optimization (SEO) — we first estimate the demand distribution, and then use that in an optimization model (the newsvendor problem) to solve the problem.

Even though the SEO approach is common and effective, we recently asked ourselves whether the growing availability of rich data sources might make machine learning a useful tool for optimizing inventory — of Super Crispies or any other product.

My colleague Martin Takáč, my Ph.D. student Afshin Oroojlooy, and I recently wrote a paper about the use of a specific branch of machine learning called deep neural networks (DNN), also known as deep learning, to solve problems like the one discussed above.

DNN tries to build a model that relates inputs (like temperature, day of the week, etc.) to outputs (like demand). Suppose the demand were a linear function of the temperature, like D = 0.1T. In this case, there’s no role for DNN — something much simpler, like linear regression, is all we need. But the demand for Super Crispies has no such simple relationship to the features; we can’t write down a function that tells us D when we know the features. DNN has proven very effective at teasing out complicated relationships like this one. We won’t go into further detail about deep learning here; there’s lots of information about it on the web, and here’s a good place to start.

Deep learning assesses the quality of a solution using a loss function, which measures how far the output is from its target. If the output is a demand prediction, then the loss function would measure how close the prediction was to the actual demands — sort of like a forecast error.
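For example, a demand-forecasting network might be compiled with mean squared error, a standard distance-from-target loss. Here is a minimal sketch in Keras; the architecture and feature count are illustrative, not the ones from our paper:

```python
import tensorflow as tf

n_features = 20  # illustrative: length of the encoded feature vector

# A small feedforward network mapping encoded features to a demand prediction.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),                # predicted demand
])
model.compile(optimizer="adam", loss="mse")  # forecast-error loss
```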

While DNN is well suited for predictive analytics tasks such as demand forecasting, it is less commonly used for prescriptive analytics — optimization — which is what we aimed to make it do in our paper. To do this, we moved away from the typical loss functions that measure distance from a target (because in optimization we don’t usually know what the target is), and instead used a loss function that measures the cost of a solution. In fact, our loss function is very similar to the newsvendor objective function.
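As a rough sketch of the idea (not the exact formulation from the paper), a newsvendor-style loss takes only a few lines, and training the network with it turns its output into an order quantity rather than a demand forecast:

```python
import tensorflow as tf

h, p = 0.05, 0.70   # holding and stockout cost per case, as in the example above

def newsvendor_loss(d_true, q_pred):
    # Cost of stocking q_pred cases when demand turns out to be d_true:
    # h per unsold case, p per case of unmet demand, averaged over the batch.
    overage = tf.maximum(q_pred - d_true, 0.0)
    underage = tf.maximum(d_true - q_pred, 0.0)
    return tf.reduce_mean(h * overage + p * underage)

# The same kind of network as before, but trained to minimize cost rather
# than forecast error; its output is now an order quantity.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="relu"),  # order quantity, nonnegative
])
model.compile(optimizer="adam", loss=newsvendor_loss)
```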

In our paper, we test our DNN method for optimizing inventory levels by using a data set containing 13,170 demand records for 23 product categories from a supermarket retailer in 1997 and 1998 (available here). This data set isn’t too different from the fictional Super Crispies data set described above. The features in this data set are department, day of week, and month of year. We used our DNN method to optimize the inventory level for a range of holding cost values, keeping the shortage cost fixed. We also implemented several other ML algorithms that do not use deep learning. Here are the results:

The red curve plots the cost of the DNN method. The yellow curve is for the SEO method discussed above. The other curves represent other ML methods. As you can see, our method beats the other methods. It saves anywhere from 18% to 30% over the SEO method, depending on the holding/stockout cost ratio.

One downside of DNN is that it requires the network to be “trained” using lots of historical data. Even if that data is available (a big “if”), training takes lots of computational power — over 12 hours of computation time in our case, using the state-of-the-art deep-learning library TensorFlow. On the other hand, once the model is trained, it can recommend new inventory levels in real time.

If your data comes from a single probability distribution, and you know that distribution and its parameters, then SEO or another simple approach is probably your best choice. But if your data set is noisy, as many real-world data sets are, and you have a rich history to work from, then DNN may be a great choice for optimizing your inventory. We think DNN and other ML methods will find more and more uses in the prescriptive analytics space in the near future.

(Our technical paper is available here.)

_________________________________________________________________

If you liked this blog post, check out more of our work, follow us on social media (Twitter, LinkedIn, and Facebook), or join us for our free monthly Academy webinars.
