Unlocking Warehouse Efficiency: A Data-Driven Journey through Operational Statistics — Pt 1

Macklin Fluehr
5 min read · Feb 22, 2024


This is part of a 4-part series on Operational Statistics:

  1. Pt 1: Problem Background
  2. Pt 2: Coefficient Estimation and Sample Size
  3. Pt 3: Coefficient Estimation and Information Density
  4. Pt 4: Coefficient Estimation with Constrained Optimization

Problem Background

In my current role I work a lot on operational analytics and algorithms. I was recently handed a problem: the business wanted to better forecast how long it takes to process particular items in our warehouse. This would help us better anticipate future hiring needs as well as help warehouse managers assess performance.

Processing tasks in our company span many activities — washing, photographing, disassembling, wrapping, transporting, etc. We effectively wanted to know the answer to the following question: how long would it take on average to wash a couch vs an armchair?

The problem was, we couldn’t measure it. The business currently tracked two things:

  1. How long users were clocked into their stations (e.g. the Wash station)
  2. The items they worked on during that time period (3 sofas, 2 nightstands, etc.)

This sort of tracking setup is not uncommon in operations. You can imagine a similar setup in other applications, such as call centers (total operator working time / number of calls fielded) or restaurant kitchens (shift time / number of dishes made). Really, anything where you have a shift time and a discrete number of items or tasks accomplished during that shift works here.

Getting back to time tracking, we had actually tried in the past to time-track each item an operator worked on — from when the user started working on an item to when they finished. However, for a myriad of reasons, the data we collected was fairly unintelligible. Operators had a hard time following standard practice for starting and stopping each item; frequent scanning was cumbersome and slowed down their work (so they avoided doing it); operators might multitask, which muddied the start and stop times. For all sorts of reasons, tracking at this level of granularity wasn't working.

However, that didn’t stop the business from wanting to know how long it took to process each item type.

The Old School Solution

Traditionally, to figure this out, a business might do a one-time exercise, timing a couple operators in the warehouse. They would pick operators that were well trained (important for consistent results), stand next to them with stopwatches, and record times for every item they worked on for a couple of hours.

This strategy is pretty intuitive and probably a good solution for many businesses. Relatively cheaply and quickly, you can get rough estimates of the relative time it takes to handle different item types.

Unfortunately, this result suffers from a couple of drawbacks.

  1. You may not have the resources to do a timing trial
  2. Well-trained operators (which would be required to get good relative estimates) don’t necessarily represent the average operator
  3. You probably only have time/resources to collect a small sample size
  4. Operators, when their managers are watching them with a stopwatch, probably don’t act the same as when they are not supervised directly by their bosses

For the above reasons, a team implementing this strategy might fail to get good estimates, if they’re able to collect any at all.

A Data-Driven Solution

Another method, assuming you have some data, is to estimate the processing times using econometric methods, or really just a linear regression model.

You can break the scenario we're talking about into a pretty simple linear equation, which is exactly the form of a linear regression:

T = α₁·I₁ + α₂·I₂ + … + αₙ·Iₙ + ε

The idea above is that the total time someone spent at the wash station (T) is the sum, across item categories, of the number of items washed in each category (Iᵢ) multiplied by the time it takes to wash one item of that category (αᵢ).

From our tracking setup I have collected data on all of my I’s and T’s. With this information, I can use a machine learning model to help me estimate all of the 𝛼’s. If this doesn’t quite make sense yet, don’t worry, I will flesh this out soon!
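To make the notation concrete, here is a tiny sketch with made-up numbers (the item types and times are purely hypothetical): if washing a sofa takes 20 minutes and a nightstand takes 5, a shift that covered 3 sofas and 2 nightstands should clock in at roughly 70 minutes.

# Hypothetical per-item wash times in minutes (the alphas we want to estimate)
alpha = {"sofa": 20.0, "nightstand": 5.0}

# Item counts for one shift (the I's, which we do observe)
counts = {"sofa": 3, "nightstand": 2}

# The total shift time T we observe is the sum of counts multiplied by per-item times
total_time = sum(counts[item] * alpha[item] for item in counts)
print(total_time)  # 70.0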

Advantages over the Old School Method:

  1. I don’t have to run a warehouse time trial at all. No disruption to warehouse activities
  2. I can capture the times for the average operator rather than just my best one, making my results more generalizable to the business
  3. Sample Size / Statistical Power is much greater — results are more accurate
  4. I don’t inadvertently motivate my operators to work differently, since I’m not there in person

Linear Regression — The Intro to the Solution

As we saw in the equation I shared earlier, we can model our problem as a linear regression, where the model's coefficients are the per-category item times we're interested in estimating. This is a more econometric angle on model fitting than most data scientists are used to (see this article on the difference between data science and econometrics).

Let’s make this concrete with some code. Later articles will get much more in depth.

Let’s say I have a dataframe with the following format:

Worker Station Training Data (table not reproduced here)

Each row is a worker shift: total_time is the total duration of that shift, and the remaining numeric columns count how many items of each category the worker processed during it. In this example there are 20 item categories.
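Since the actual dataframe isn't reproduced here, below is a minimal stand-in sketch that simulates shift data in the same shape. Everything in it is made up: the columns are category IDs "0" through "19" (which lines up with the int cast in the snippet further down), the per-item times are drawn at random, and total_time is built from those times plus noise.

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_shifts, n_categories = 500, 20

# Made-up "true" per-item processing times (minutes) for category IDs 0..19
true_times = rng.uniform(2, 30, size=n_categories)

# How many items of each category were processed on each shift
counts = rng.poisson(lam=3, size=(n_shifts, n_categories))

# Total shift time is item counts times per-item times, plus some noise
df = pd.DataFrame(counts, columns=[str(i) for i in range(n_categories)])
df["total_time"] = counts @ true_times + rng.normal(0, 10, size=n_shifts)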

from sklearn.linear_model import LinearRegression

df = <that_dataframe_above>

# Fit Model: item counts are the features, total shift time is the target
X = df.drop("total_time", axis=1)
y = df["total_time"]

model = LinearRegression()
model.fit(X, y)

# Get Coefficients: map each item category to its estimated per-item time
estimated_coefficients = {
    k: v
    for k, v in zip(
        [int(c) for c in model.feature_names_in_],
        model.coef_,
    )
}

print(estimated_coefficients)

The above code gets us an estimated processing time for each item category.

Questions and Next Steps

If this sounds too simple, that's because it kind of is. Notably, just because we have coefficients doesn't mean they are any good. Furthermore, we aren't doing standard data science here; we're doing more econometric work — i.e. we're evaluating coefficient accuracy, not model accuracy. We don't have a validation set to tell us whether our coefficients are good, since we have no ground truth for them. We are relying entirely on the model to get the coefficients, period.
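As a quick preview of what evaluating coefficient accuracy can look like (the later parts of this series go much deeper), one standard econometric tool is ordinary least squares in statsmodels, which reports a standard error and confidence interval for every coefficient. This is just a sketch, assuming the same df as above:

import statsmodels.api as sm

X = df.drop("total_time", axis=1)
y = df["total_time"]

# The OLS summary includes a standard error, t-statistic, and 95% confidence
# interval for each coefficient, i.e. for each estimated per-item time
results = sm.OLS(y, sm.add_constant(X)).fit()
print(results.summary())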

How can we figure out how good our estimates are? Ops data is usually fairly small compared to e-commerce data, so how do we even know we have enough of it to try this method? Are there ways to mitigate not having as much data as we would like? These are the questions for the next articles.

  1. Pt 2: Coefficient Estimation and Sample Size
  2. Pt 3: Coefficient Estimation and Information Density
  3. Pt 4: Coefficient Estimation with Constrained Optimization

Stay tuned for the next articles, which will dive much more into the technical weeds!


Macklin Fluehr is an engineer, machine learning specialist, and designer working to transform sustainability with a cross-disciplinary approach