Making it Count: Probabilistic Inventory in Grocery

Published in Afresh Engineering · 11 min read · Apr 12, 2024

(Afresh patent pending technology)

We’ve all been to the grocery store, but how many of us know what it’s like to live a day in the life of a Department Manager?

These hard-working employees are the keystones of store operations. They start their shift long before dawn to complete a laundry list of tasks to get their department ready for the day: unloading delivery trucks, running product from the walk-in cooler to the sales floor, cutting fruit, vegetables, and meat for in-store prepared items, and inputting precise order quantities for each item in tomorrow’s delivery.

Traditional ordering is a tightrope balancing act between fully stocked shelves and wasted product — an act that can take years to perfect. And with grocery departments more short-staffed than ever before, ordering has become even more frenzied.

As a store approaches its opening hours, Department Managers rarely have time left to take stock of the current inventory levels for the items in their department. And yet, it’s critical for grocers to understand the inventory for multiple reasons:

Ordering: knowing inventory on hand is fundamental to deciding the order quantity, which is becoming increasingly automated through new methods such as computer-generated ordering (CGO).
Performance: inventory position is required to calculate key performance metrics such as shelf fullness levels and shrink (food waste).
Visibility: knowing inventory improves the online shopping experience by providing an accurate view of product availability.

Many other retail industries have addressed this problem with so-called “perpetual inventory” (PI) systems. However, perpetual inventory is ill-equipped to accurately estimate inventory in the grocery setting. In fresh grocery departments, in particular, the most important items can often be the hardest to track. Factors such as spoilage, mis-scans at the register, random weight item complexities, and in-store transformation (e.g. cutting tomatoes from Produce for sandwiches in Deli) contribute to high inventory variability and uncertainty. A perpetual inventory system relies either on the data around these factors being perfect or on Fresh Department Managers manually keeping track of all of them every day, and neither is realistic. As a result, perpetual inventory estimates in fresh grocery are perpetually incorrect.

In this blog post, we’ll share how we at Afresh built a new inventory estimator (Inventory Hidden Markov Model, or InvHMM for short) to address these challenges of grocery inventory. We’ll illustrate that InvHMM not only improves inventory accuracy but also enables us to direct store teams to focus their attention on items where inventory updates matter most. We’ll also explain how InvHMM allows visibility into store conditions (e.g., shelf fullness) with less labor and more granularity.

Through streamlining labor, improving order accuracy, and increasing visibility into performance, InvHMM is a pathway to reliable automation tools for grocery.

Background: Perpetual Inventory (PI)

PI is a simple, industry-standard algorithm for estimating the current inventory. To estimate the current inventory:

Find the last inventory measurement, inv.
Find the total inflows (e.g. shipments) of the product since inv.
Find the total outflows (e.g. sales) of the product since inv.
Compute the estimate as est = inv + inflows - outflows.

Example: My associate reports that 5 bottles of ranch dressing were in the store Sunday morning. Since then, 3 bottles have been sold, and we received a new shipment of 6 bottles. My estimate for the current inventory is 5 + 6 - 3 = 8 bottles.
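The four steps above can be sketched in a few lines of Python. The function name and argument names are illustrative, not part of any Afresh system.

```python
def perpetual_inventory(last_count, inflows, outflows):
    """PI estimate: last measurement, plus inflows, minus outflows since then."""
    return last_count + sum(inflows) - sum(outflows)


# Ranch dressing example from the text: 5 counted, 6 received, 3 sold.
estimate = perpetual_inventory(last_count=5, inflows=[6], outflows=[3])
print(estimate)  # 8
```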

Overall, calculating PI is intuitive, and if all the inputs (inventory measurements, inflows, and outflows) are accurate, the result will be correct. But in grocery, these data inputs are frequently corrupted by factors such as unrecorded shrink and measurement error.

Unrecorded shrink

Shrink (the portion of inventory that is discarded, stolen, or otherwise lost) in many grocery departments is often as high as 10–15% of sales. Examples of shrink could be a mushy avocado discarded due to spoilage, or a stolen pack of organic ribeye steaks. Although some stores will track discarded items through “scanouts,” this practice of record keeping is neither thorough nor accurate enough to be the single source of truth for shrink. Most shrink is not captured by scanouts, and this will cause PI to overestimate the true inventory.
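To see why unrecorded shrink pushes PI above the true inventory, here is a small deterministic sketch. All quantities are made-up illustrative numbers (shrink set to 10% of sales, no shipments), and the function name is hypothetical.

```python
def simulate_drift(start, daily_sales, daily_unrecorded_shrink, days):
    """Track a PI estimate and the true inventory side by side."""
    pi_estimate = start
    true_inventory = start
    for _ in range(days):
        # PI only sees recorded outflows (sales).
        pi_estimate -= daily_sales
        # The shelf also loses product that nobody scanned out.
        true_inventory -= daily_sales + daily_unrecorded_shrink
    return pi_estimate, true_inventory


pi, truth = simulate_drift(start=100, daily_sales=10,
                           daily_unrecorded_shrink=1, days=5)
print(pi, truth)  # 50 45
```

The gap between the two numbers grows every day until someone takes a fresh count.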

Measurement error

If inventory measurements are off, that will be fully reflected in the PI estimates. And in grocery, there are abundant sources of measurement error when it comes to inventory data:

Coarse-grainedness: measuring apple inventory as a whole number of cases implicitly assumes a fixed case size (say, ~100 apples) and discards the remainder. If there are actually 123 apples in stock and the count is rounded to 1 case, the estimate is already off by 23 apples, and over time such errors become substantial.
Human error: a Produce Manager might forget to count inventory in the backroom, mix up two similar SKUs, fat-finger an entry (e.g., “11” vs. “1”), or not see inventory hidden at the back of the shelf.
Data quality: inventory data can have ambiguous units of measurement (e.g., 1 bottle of salad dressing vs. 1 case (6 bottles) of salad dressing) and unclear timing (was the inventory measured at the start of the day or the end of the day?).

Example of measurement errors corrupting PI estimates (blue lines). Dashed line is the true inventory in the store. Blue dots are user inventory measurements.

With these challenges in mind, we set out to build an inventory estimation model tailored specifically for perishable grocery products.

Inventory hidden Markov model (InvHMM)

Note: this section provides methodological details and is not required to follow the results and conclusions.

Markov models are random processes that evolve from one state to another, where the next state only depends on the current state. For example, how much inventory is in the store tomorrow should depend only on today’s inventory (e.g. the quantity of each item, their expiration dates, etc.), sales, and shipments.

A hidden Markov model (HMM) is a Markov model that is partially and/or noisily observed. HMMs are widely used in applications where the goal is to infer how some hidden state changed over time; e.g., inferring the coordinates of a Roomba moving around your living room, based on measurements taken by its sensors (learn more about HMMs here). In the Afresh model, we assume that user-reported inventory measurements are distributed around latent (i.e. ground truth, unobserved) inventory, with some noise.

Framed as an HMM, the inventory estimation problem is to infer the latent inventory X_t at some time t, given all the previous user inventory measurements Y_1, …, Y_t, as well as the sales and shipments history of the item.

A high-level schematic of the InvHMM probabilistic model. Gray nodes are observed (user-reported) inventory, white nodes are unobserved (ground-truth) inventory. Latent state transitions are random due to unobserved shrink and user inventories have random error.

Formal model definition

Latent inventory model: Let X_t be the latent inventory for an item on day t. We assume that X_t is determined by several factors — some deterministic and some random:

The daily outflows s_1, s_2, …, s_{t-1} on days leading up to t. These are primarily daily sales, but optionally, we can include scanouts (recorded shrink), if that data is available.
The daily shipments h_1, h_2, …, h_{t-1}
The (random) expiration dates of each daily shipment D_1, …, D_{t-1}
The (random) shrink “fraction” of each daily shipment F_1, …, F_{t-1}

Given these factors, the latent inventory is

where I is an indicator function. In the limiting case of no shrink (i.e., F = 0), the model is equivalent to perpetual inventory: cumulative shipments minus outflows (sales + recorded shrink).
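As a sketch of what this description implies (an assumed form, not necessarily the exact formula used in InvHMM), suppose a fraction F_i of shipment i is lost once its expiration date D_i has passed. The latent inventory could then be written as:

```latex
X_t = \sum_{i=1}^{t-1} h_i \left( 1 - F_i \, I(D_i \le t) \right) - \sum_{i=1}^{t-1} s_i
```

Setting every F_i = 0 removes the shrink term and leaves cumulative shipments minus cumulative outflows, which is the PI estimate.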

User inventory model: We need to connect our latent variable (the ground-truth inventory) to data using a likelihood (aka emission probability) to estimate its posterior. We model the way the user takes inventory using a symmetrical distribution (e.g. Normal, Laplace) centered on the latent inventory; e.g.

where σ and ϵ can be interpreted as the relative standard deviation of the user counts and a minimum absolute standard deviation of the counts, respectively.
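One parameterization consistent with that interpretation, shown here with a Normal distribution and offered as an assumption rather than the exact form used in InvHMM, is:

```latex
Y_t \mid X_t \sim \mathrm{Normal}\!\left( X_t,\ (\sigma X_t + \epsilon)^2 \right)
```

Under this form the noise scales with the inventory level but never falls below ϵ, so even a count of an empty shelf carries a small amount of uncertainty. Other choices, such as a standard deviation of \max(\sigma X_t, \epsilon), would fit the same description.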

Algorithmic implementation

To solve this model, we use a stochastic algorithm called a particle filter. At a high level, we randomly sample the latent variables (D and F) starting from the first day (1) and moving forward until the target date (t). A particular sampling of D and F determines a unique trajectory of the inventory over time. In practice, we might take n = 100 trajectories.

On each day 1, 2, …, t, we update the weight of each trajectory using the user inventory model. Weights are proportional to the probability of the observed inventory measurements given that trajectory. Once the weights become too imbalanced, we resample trajectories, duplicating those with higher weight and pruning those with lower weight. As a result, on each day t, the weighted trajectories give us an estimate of the posterior distribution P(X_t | Y_1, …, Y_t).
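The procedure described above can be sketched as a small bootstrap particle filter in NumPy. This is a minimal illustration under stated assumptions, not Afresh's implementation: the function name, the priors on expiration dates and shrink fractions, the Normal noise model with scale `sigma * x + eps`, and the resampling threshold of n / 2 are all hypothetical choices.

```python
import numpy as np


def inv_particle_filter(shipments, outflows, counts, n=100,
                        sigma=0.2, eps=1.0, seed=0):
    """Posterior-mean inventory at the start of days 1..T.

    shipments[i], outflows[i]: quantities on day i (0-indexed, length T).
    counts: dict mapping a day t to the user count taken at the start of t.
    """
    rng = np.random.default_rng(seed)
    shipments = np.asarray(shipments, dtype=float)
    outflows = np.asarray(outflows, dtype=float)
    T = len(shipments)
    expiry = np.zeros((n, T))  # D: expiration day of each shipment, per particle
    frac = np.zeros((n, T))    # F: shrink fraction of each shipment, per particle
    log_w = np.zeros(n)
    posterior_mean = np.full(T + 1, np.nan)

    for t in range(1, T + 1):
        i = t - 1
        # Sample latent variables for the shipment of day i (assumed priors).
        expiry[:, i] = i + rng.integers(2, 8, size=n)
        frac[:, i] = rng.beta(2.0, 8.0, size=n)

        # Each particle's inventory: shipments kept after expiry-driven
        # shrink, minus cumulative outflows, floored at zero.
        expired = expiry[:, :t] <= t
        kept = shipments[:t] * (1.0 - frac[:, :t] * expired)
        x = np.maximum(kept.sum(axis=1) - outflows[:t].sum(), 0.0)

        # Reweight by the (assumed Normal) likelihood of today's count.
        y = counts.get(t)
        if y is not None:
            scale = sigma * x + eps
            log_w += -0.5 * ((y - x) / scale) ** 2 - np.log(scale)

        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        posterior_mean[t] = np.sum(w * x)

        # Resample when the effective sample size drops below n / 2.
        if 1.0 / np.sum(w ** 2) < n / 2:
            idx = rng.choice(n, size=n, p=w)
            expiry, frac = expiry[idx], frac[idx]
            log_w = np.zeros(n)

    return posterior_mean


est = inv_particle_filter(shipments=[20, 0, 0, 0, 0],
                          outflows=[2, 2, 2, 2, 2],
                          counts={3: 13})
print(est[5])  # never above 10.0, the PI estimate, since shrink only removes stock
```

Sampling D and F day by day, rather than all at once, lets resampled copies of a trajectory share a history but diverge on future shipments.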

Results: Improved inventory understanding

When we apply InvHMM to real datasets from our customers, we find that it improves accuracy over perpetual inventory (PI) by 10–40%. (In other words, the mean squared error of InvHMM point estimates is 10–40% smaller than that of PI estimates, benchmarked on a subset of reliable inventory measurements.)

Perishable inventory is more accurately estimated by InvHMM

By incorporating perishability (expiration dates & shrink fractions) into InvHMM, our inventory estimates don’t drift due to cumulative error like a PI estimate. This means that InvHMM estimates remain accurate for much longer than PI, especially on perishable items.

While PI overestimates perishable inventory, InvHMM estimates remain well-calibrated over time.

InvHMM is robust to inventory measurement errors

Another major source of improvement over PI is InvHMM’s robustness to errors in the user inventory measurements themselves. Perpetual inventory is extremely vulnerable to such errors because its estimates snap to the last measurement taken. InvHMM instead treats each measurement as a noisy observation rather than as ground truth. For example, if the user has reported an inventory of 10 units several days in a row and then reports 20 units (despite no shipments or sales since the previous count), InvHMM can interpret that count as a likely error (and alert the user to correct it), whereas a PI system would simply accept the 20-unit input as truth.
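One simple way to flag such a count, sketched here under assumptions, is to ask whether it sits far from every inventory level the model currently considers plausible. The function name, the noise parameters, and the 3-standard-deviation threshold are hypothetical, and the particle values are made up to mirror the 10-then-20 example.

```python
import numpy as np


def is_likely_miscount(particles, count, sigma=0.1, eps=1.0, z_max=3.0):
    """True if `count` is more than z_max noise standard deviations
    away from every particle's inventory level."""
    particles = np.asarray(particles, dtype=float)
    scale = sigma * particles + eps           # assumed count-noise scale
    z = np.abs(count - particles) / scale     # distance in noise units
    return bool(z.min() > z_max)


plausible_levels = [9, 10, 10, 11]  # model believes inventory is about 10
print(is_likely_miscount(plausible_levels, count=20))  # True
print(is_likely_miscount(plausible_levels, count=10))  # False
```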

While PI estimates are vulnerable to measurement errors (indicated with arrows), InvHMM is robust to these errors. Arrows are meant as a guide to the eye.

Quantifying uncertainty

InvHMM has an entirely new output compared to PI: confidence bounds on the inventory (see the shaded blue bands in the previous plots). These represent our model’s uncertainty in the true inventory value.

We find that the “uncertainty” of InvHMM estimates (i.e., the width of the confidence bands) provides a well-calibrated measure of the actual error of our estimates. That is to say, when we predict an estimate to have an expected error of ±1 unit, it actually has, on average, ±1 unit of error. So, in addition to giving improved inventory accuracy over PI, InvHMM has the unique ability to anticipate which of its estimates are most risky. This is useful because one application of InvHMM is to determine when order recommendations based on inventory estimates become excessively risky; see our later section on “Automated Ordering” for more.
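A calibration check of this kind can be sketched by grouping estimates by their predicted uncertainty and comparing each group's average prediction with its average realized error. The function name and the toy numbers below are illustrative only and are not Afresh data.

```python
import numpy as np


def calibration_table(pred_uncertainty, abs_error, n_bins=3):
    """Return (mean predicted uncertainty, mean realized error) per bin,
    with bins formed by sorting on predicted uncertainty."""
    pred = np.asarray(pred_uncertainty, dtype=float)
    err = np.asarray(abs_error, dtype=float)
    order = np.argsort(pred, kind="stable")
    return [(float(pred[b].mean()), float(err[b].mean()))
            for b in np.array_split(order, n_bins)]


table = calibration_table(pred_uncertainty=[1, 1, 2, 2, 3, 3],
                          abs_error=[0.5, 1.5, 2.0, 2.0, 2.5, 3.5])
print(table)  # [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
```

When the two numbers in each pair track each other, as in this toy data, the uncertainty is well calibrated.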

The confidence band half-width (“uncertainty”) is a well-calibrated predictor of the InvHMM estimate accuracy, as measured by the difference between estimate vs. user inventory.

Current applications of InvHMM at Afresh

Beyond the transformative benefits of increased inventory accuracy, here are a couple of current applications of probabilistic inventory estimation by Afresh.

Automated ordering

To produce an accurate order recommendation, the ordering policy needs to know the current inventory to decide how much additional inventory is needed to reach the optimal inventory level. However, as previously discussed, taking inventory manually is a laborious process. InvHMM helps maintain accurate inventory over longer cycles for perishable products, allowing users to save time by taking inventory less frequently.

Additionally, what if we could target inventory measurements on a small number of items with inventory estimates that pose a risk to order accuracy? Afresh does this by measuring how inventory uncertainty propagates to uncertainty in the optimal order recommendation. For example:

Suppose I currently have between 50 and 150 pears. My ordering policy tells me I need 140 pears to satisfy demand. At the inventory lower bound, my order should be 140 - 50 = 90 pears; at the inventory upper bound, my order should be 0 pears (since current inventory exceeds 140). Thus, our recommendation for pears is at risk: ordering too little because we overestimated inventory is liable to cause a stockout. We mitigate this risk by prompting the Produce Manager to count inventory on these pears, which increases our confidence in the inventory level and reduces recommendation uncertainty.
By contrast, suppose I have between 50 and 80 bags of pistachios, and I need 10 bags to satisfy demand. Pistachio inventory is uncertain too, as pear inventory was in the previous example. However, even if our pistachio inventory is at the lower bound (50 bags), we will recommend ordering 0. This suggests our recommendation for pistachios is not at risk.
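The two examples above amount to propagating the inventory bounds through the ordering rule and checking how far apart the resulting orders are. A minimal sketch, assuming a simple order-up-to policy; the function names and the `tolerance` parameter are hypothetical.

```python
def order_range(inv_low, inv_high, target):
    """Smallest and largest order implied by the inventory bounds,
    under an order-up-to-target policy."""
    smallest = max(target - inv_high, 0)
    largest = max(target - inv_low, 0)
    return smallest, largest


def needs_count(inv_low, inv_high, target, tolerance=0):
    """Prompt for a count when inventory uncertainty changes the order."""
    smallest, largest = order_range(inv_low, inv_high, target)
    return largest - smallest > tolerance


print(order_range(50, 150, 140), needs_count(50, 150, 140))  # (0, 90) True
print(order_range(50, 80, 10), needs_count(50, 80, 10))      # (0, 0) False
```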

Inventory uncertainty thus gives us a clear way to prioritize a subset of items that should receive inventory measurement to prevent stockouts and shrink.

Shelf fullness and stockout probability

Lastly, probabilistic estimates of the inventory can give us the probability that an item will go out-of-stock (OOS) by the end of the day, or the expected fullness level (between 0% and 100%) of a given display. InvHMM estimates of OOS and shelf fullness reveal cyclicality in shelf fullness, and we can use these metrics to help grocers optimize merchandising. They can also improve the e-commerce customer experience: customers find it frustrating when they order an item online only to find out at delivery that it was out of stock, and InvHMM can help, e.g., by alerting customers when an item is predicted to have a high probability of OOS.
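Given a set of equally weighted inventory samples from the model, both metrics reduce to simple averages. The sketch below assumes a known expected sales figure for the rest of the day and a known display capacity; the function names and numbers are illustrative.

```python
import numpy as np


def oos_probability(particles, expected_sales):
    """Share of inventory samples that would hit zero by end of day."""
    particles = np.asarray(particles, dtype=float)
    return float(np.mean(particles - expected_sales <= 0))


def expected_fullness(particles, capacity):
    """Average display fullness, clipped to the 0%..100% range."""
    particles = np.asarray(particles, dtype=float)
    return float(np.mean(np.clip(particles / capacity, 0.0, 1.0)))


samples = [0, 5, 10, 20]
print(oos_probability(samples, expected_sales=5))  # 0.5
print(expected_fullness(samples, capacity=10))     # 0.625
```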

InvHMM-inferred shelf fullness levels across stores at one of our partners. Color bands illustrate shelf fullness levels, from out-of-stock (dark red) to full (pink).

Conclusion

In grocery, an accurate live understanding of inventory is critical for financial performance but is notoriously difficult to achieve. This is particularly true in fresh departments, where items are affected by additional shrink drivers like rapid perishability and in-transit damage, and by a higher likelihood of measurement errors.

Using InvHMM over traditional perpetual inventory leads to more accurate inventory estimates, increased resilience to measurement errors, and the ability to quantify inventory uncertainty. At Afresh, InvHMM allows us to streamline the ordering process and get newfound visibility into store performance.

Aaron Stern is an Applied Scientist and manager of the Ordering Efficiency team at Afresh. His team develops algorithms for estimating & targeting inventory on items throughout the grocery store.

To read more technical articles from the Afresh team, please visit our Medium page. Or, if you’re interested in joining the team, read about open roles here.