TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Reduce Warehouse Space with the Pareto Principle using Python

8 min read · Apr 25, 2021

A warehouse storage system showing multiple racks filled with uniformly stacked pallets. The pallets are arranged in a vertical, three-level shelving unit, each containing boxes that are tightly packed. The shelving structure is made of metal frames, with green beams supporting the loads. This image represents an efficient use of warehouse space, visualizing the layout optimization process that could be improved using the Pareto Principle to focus on high-demand items.
Warehouse Racking Layout — (Image by Author)

Warehouse space is the largest fixed cost in logistics, making its optimization crucial for efficient operations.

Have you heard about the Pareto Principle?

Developed by Vilfredo Pareto to describe the distribution of wealth, this principle can be generalized to various applications, including logistics management.

You can significantly impact your operations by focusing on the top 20% of your references and picking locations.

As a data scientist, how can you use Python to optimize a warehouse layout?

In this article, we’ll walk you through a real operational example of how to use Python to apply the Pareto Principle.

📫 For business inquiries, contact me: Samir Saci

Summary
I. Introduction of the Pareto Principle
II. Visualization of the Pareto Principle with Python
1. Data Processing using Pandas
2. Add Markers for 80/20
III. How to optimize warehouse space?
1. Grouping High Rotation SKU in Dedicated Picking Zones
2. Densify Picking Locations for Very Low rotations
3. Quick Example
IV. Conclusion
1. Generative AI: "The Supply Chain Analyst"
2. Product Segmentation for Retail

Introduction of the Pareto Principle

In 1906, an Italian economist named Vilfredo Pareto developed a mathematical formula to describe the distribution of wealth in Italy.

He discovered that 80% of the wealth belonged to 20% of the population.

Black-and-white portrait of Vilfredo Pareto, an Italian economist known for developing the Pareto Principle. He has a full beard and mustache, neatly groomed, and wears a formal jacket over a high-collared shirt. Pareto’s contributions to economics are significant, with the 80/20 rule being widely applied in various fields, including logistics and warehouse optimization, where his principle helps improve operational efficiency by focusing on the most impactful items.
Vilfredo Pareto — Wikipedia (Link)

A few decades later, this rule was generalized to many other applications, including Supply Chain and Logistics Management.

This principle, called the “Pareto Principle”, “the 80–20 rule”, or “The Law of Trivial Many and Critical Few”, can be translated for Logistics Practitioners as follows:

  • 80% of your company revenue is made from 20% of your references
  • 80% of your volume is picked in 20% of your picking locations
  • 80% of your replenishment volume will be performed on 20% of your picking locations

In this article, we will explore how to apply the Pareto Principle using a real operational example:

  • 1 month of picking orders
  • 144,339 order lines
  • 59,372 orders
  • 4,864 active references

You can find the complete code in this Github repository 👇

If you prefer watching, have a look at the video version of this article

Visualization of the Pareto Principle with Python

Data Processing using Pandas

Let's assume we have a dataset of outbound orders with these four columns.

a. Import Libraries and Dataset

b. Calculate Volume Prepared per SKU (BOX)

To plot the Pareto graph, we need to

  • Sum the number of boxes picked per SKU
  • Sort your data frame by descending order on BOX quantity
  • Calculate the cumulative sum of BOX
  • Calculate the cumulative number of SKU
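The four steps above can be sketched with pandas as follows. This is a minimal illustration rather than the original notebook: the column names `SKU` and `BOX`, the helper `pareto_table`, and the tiny sample data are assumptions made for this example.

```python
import pandas as pd

# Hypothetical order lines: the column names (SKU, BOX) and the values
# are assumptions for illustration, not the original dataset.
df = pd.DataFrame({
    "SKU": ["A", "B", "A", "C", "D", "B", "A", "E"],
    "BOX": [50, 20, 30, 5, 3, 10, 20, 2],
})


def pareto_table(order_lines: pd.DataFrame) -> pd.DataFrame:
    """Return one row per SKU with cumulative % of boxes and % of SKUs."""
    # 1. Sum the number of boxes picked per SKU
    df_par = order_lines.groupby("SKU", as_index=False)["BOX"].sum()
    # 2. Sort by descending BOX quantity
    df_par = df_par.sort_values("BOX", ascending=False, ignore_index=True)
    # 3. Cumulative sum of boxes, also as a percentage of the total
    df_par["CumSum"] = df_par["BOX"].cumsum()
    df_par["%CumSum"] = 100 * df_par["CumSum"] / df_par["BOX"].sum()
    # 4. Cumulative number of SKUs as a percentage of all SKUs
    df_par["%SKU"] = 100 * (df_par.index + 1) / len(df_par)
    return df_par


df_par = pareto_table(df)
print(df_par)
```

Each row of `df_par` then reads as "the top x% of SKUs represent y% of the boxes picked", which is the pair of values needed for the Pareto curve.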

Results

In line 5 of the results, you can see that 0.1% of your SKUs represent 12.7% of the total volume (20,987 boxes).

What can we do with these insights?

🏫 Discover 70+ case studies using data analytics for supply chain continuous improvement 🚚 and business optimization 🏪 in this Cheat Sheet

Visualization of the Pareto Principle

Visualization

Using your processed data frame, let us plot (%BOX) = f(%SKU) to show the Pareto principle.

A line graph representing the cumulative sum of boxes picked as a percentage of total SKUs. The curve follows a steep initial increase before gradually flattening, illustrating the Pareto Principle with Python. The graph demonstrates how a small portion of SKUs accounts for a large percentage of total boxes, aligning with the 80/20 rule. No additional markers or thresholds are present in this graph, providing a clean visualization of cumulative data for logistics and supply chain analysis.
Visualization of Pareto Principle (I) — (Image by Author)

Add Markers for 80/20

Marker 1: x = 20% of SKU (blue)
Marker 2: y = 80% of Boxes (red)

A line graph illustrating the cumulative sum of boxes as a percentage of SKUs, with two added markers showing the 80/20 rule. A vertical blue line marks 20% of SKUs, while a horizontal red line indicates 80% of boxes picked. The graph visually highlights that 80% of the volume is reached with less than 20% of the SKUs, reinforcing the Pareto Principle in warehouse space optimization. The chart offers a visual insight into SKU distribution efficiency using Python for logistics management.
Visualization of Pareto Principle (II) — (Image by Author)
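A chart like the one above can be sketched with matplotlib. The cumulative percentages below are made-up values for illustration (in practice they would come from your processed data frame), and the colors simply follow the two markers described above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt

# Hypothetical cumulative percentages (assumed values, not the article's data)
pct_sku = [0, 5, 10, 20, 40, 70, 100]   # cumulative % of SKUs
pct_box = [0, 45, 68, 88, 96, 99, 100]  # cumulative % of boxes picked

fig, ax = plt.subplots(figsize=(8, 5))
# Pareto curve: (%BOX) = f(%SKU)
ax.plot(pct_sku, pct_box, color="black", label="Cumulative % of boxes")
# Marker 1: x = 20% of SKU (blue)
ax.axvline(20, color="blue", linestyle="--", label="20% of SKU")
# Marker 2: y = 80% of boxes (red)
ax.axhline(80, color="red", linestyle="--", label="80% of boxes")
ax.set_xlabel("% SKU")
ax.set_ylabel("% Boxes (cumulative)")
ax.set_title("Pareto analysis of picked volume")
ax.legend()
```

Call `fig.savefig("pareto.png")` to export the chart, or drop the `matplotlib.use("Agg")` line and call `plt.show()` to display it interactively.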

Insights

We can see that the 80% volume threshold is reached with fewer than 20% of the SKUs (sku_80 = 12.55%).

Try it yourself: what share of the boxes (%BOX) do the top 10% of SKUs represent?
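One way to read both values off the curve is linear interpolation between the points of the cumulative table. The sketch below uses a hypothetical helper `interpolate` and made-up percentages; it assumes both lists are strictly increasing, which holds for cumulative percentages when every SKU has at least one box picked.

```python
from bisect import bisect_left


def interpolate(xs, ys, x):
    """Linearly interpolate y at x; xs must be sorted in ascending order."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)  # first index with xs[i] >= x
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical cumulative percentages (assumed values, not the article's data)
pct_sku = [0, 5, 10, 20, 40, 70, 100]
pct_box = [0, 45, 68, 88, 96, 99, 100]

# % of SKUs needed to reach 80% of the boxes (read the curve from y to x)
sku_80 = interpolate(pct_box, pct_sku, 80)
# % of boxes represented by the top 10% of SKUs (read the curve from x to y)
box_10 = interpolate(pct_sku, pct_box, 10)

print(f"sku_80 = {sku_80:.2f}% | box_10 = {box_10:.2f}%")
```

With these sample values, 80% of the boxes are reached at 16% of the SKUs, and the top 10% of SKUs represent 68% of the boxes.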

Now that we have a Pareto plot, what can we do with it?

How to optimize warehouse space?

How can we use these insights to increase your picking productivity and reduce space usage?

This approach aims to minimize the walking distance of picking operators by grouping items that may be ordered together.

Grouping High Rotation SKU

The heatmap below is a 2D representation of the Pareto Principle that links each SKU with its picking location.

A heatmap displaying warehouse SKUs (A01 to A16) across various picking locations. The color gradient, from green to red, indicates the percentage of items picked per location, with darker green representing high-pick locations and red indicating low-pick locations. The heatmap visualizes the Pareto Principle with Python for logistics, where a small number of picking locations accounts for a large portion of picks, allowing warehouse operators to optimize routes and improve efficiency.
Example of a Warehouse Heatmap based on % quantity picked — (Image by Author)

As the heatmap shows, by grouping fewer than 10 locations, we can accumulate nearly 20% of the volume.

This would reduce the length of the picking routes.

What about replenishment?

Densify Picking Locations for Very Low rotations

What is a replenishment task?

Level 1 is the Picking Location on the ground level, where the Warehouse Picker takes boxes to prepare orders.

A vertical warehouse rack containing five levels, with pallets of boxes stored at each level. The bottom two levels are designated as picking locations, while the top levels are for storage. Arrows indicate a flow from storage to picking locations, illustrating the replenishment process in the warehouse. This setup visualizes the space optimization by using multiple levels for pallet storage and retrieval in an efficient manner.
Example of Full Pallet Picking Location with 4 levels of storage — (Image by Author)

When the quantity level in your picking location is below a certain threshold, your WMS will trigger a Replenishment Task.

  • Take a pallet from the storage level (level 3) and put it in the picking location (level 1).

How does the Pareto Principle impact your picking location layout?

A warehouse storage rack showing multiple levels of neatly stacked pallets, with boxes placed on all five levels of shelving. Each pallet contains identical boxes, representing the storage and picking arrangement for different SKUs. This image demonstrates the efficient use of vertical space in the warehouse and highlights how pallet locations can be optimized based on SKU rotation and demand using Python.
Full Pallet (Left) | Half Pallet (Middle) | Shelves (Right) Picking Location with 3 levels of storage — (Image by Author)

The Full Pallet Location type uses one floor pallet location per SKU.

However, we can increase the density of locations by using

  • Half Pallet Locations: 2 SKU per floor pallet location
  • Shelves Locations: 9 SKUs per 2 floor pallet locations (4.5 SKUs per floor pallet location)

We need to compromise: Surface Optimizations vs. Number of Replenishment Moves

A significant issue with half pallets and shelves is the limited storage capacity vs. full pallets.

For the same quantity picked per month, a half pallet requires twice as many replenishment moves as a full pallet, and shelves require even more.

Can we automatically find the optimal balance?

Using the Pareto principle and SKU rotations analysis, we can find the best compromise when choosing the location type using the rules below.

  • Full Pallet/Half Pallet Locations: only for high-runners (Top 20%)
  • Shelves Locations: for the 80% low runners making only 20% of your volume
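As a rough illustration of these rules, the sketch below ranks SKUs by monthly volume and assigns a location family to each one. The function name `assign_location_type`, the sample volumes, and the 20% cut-off parameter are assumptions for this example; the threshold should be tuned to your own operations.

```python
def assign_location_type(boxes_per_sku, top_share=0.20):
    """Assign a picking location family to each SKU based on its volume rank."""
    # Rank SKUs from highest to lowest monthly volume
    ranked = sorted(boxes_per_sku, key=boxes_per_sku.get, reverse=True)
    # Number of SKUs treated as high runners (at least one)
    n_top = max(1, round(top_share * len(ranked)))
    return {
        sku: "Full/Half Pallet" if rank < n_top else "Shelves"
        for rank, sku in enumerate(ranked)
    }


# Hypothetical monthly volumes in boxes (assumed values)
boxes_per_sku = {
    "S01": 500, "S02": 300, "S03": 60, "S04": 40, "S05": 30,
    "S06": 25, "S07": 20, "S08": 15, "S09": 7, "S10": 3,
}

layout = assign_location_type(boxes_per_sku)
for sku, location in layout.items():
    print(sku, boxes_per_sku[sku], location)
```

In this sample, the two high runners (20% of the SKUs) carry 800 of the 1,000 boxes, so only they keep a pallet location.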

These thresholds have to be adapted to the specificities of your warehousing operations:

  • Workforce hourly costs (Euros/Hour) and your productivity for Picking (Lines/Hour) and Replenishment (Moves/Hour)
  • Warehouse rental costs (Euros/Sqm/Month)
  • Dimensions of your boxes (Width (mm) x Height (mm) x Length (mm)) that will drive your different picking locations' storage capacity

The target is to find the best compromise between high replenishment productivity (Full Pallets) and reduced ground surface occupation (Shelves).

Let’s have a look at a simple example.

Pallet dimensions: 0.8 x 1.2 (m x m)
Alley width: 3.05 (m)
Dx: 0.1 (m), distance between two pallets
Dy: 0.15 (m), clearance added to the pallet length

Ground surface occupied (including half of the alley)
Full Pallet = (0.8 + 0.1) x (1.2 + 0.15 + 3.05/2) = 2.5875 (m2)
Half Pallet = 2.5875 / 2 = 1.29375 (m2)
Shelves = 2 x 2.5875 / 9 = 0.575 (m2)

Warehouse Rental Cost (Jiaxing, China)
C_rent = 26.66 (Rmb/Sqm/Month) = 3.45 (Euros/Sqm/Month)

Forklift Driver Hourly Cost
C_driv = 29 (Rmb/Hour) = 3.76 (Euros/Hour)

Replenishment Productivities
Full Pallet: 15 (Moves/Hour)
Half Pallet: 13 (Moves/Hour)
Shelves: 3.2 (Moves/Hour)

Picking Location Capacity
Full Pallet: 30 (Boxes)
Half Pallet: 15 (Boxes)
Shelves: 4 (Boxes)
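To make the trade-off concrete, here is a minimal sketch of a monthly cost model built from the parameters above. The cost formula (replenishment workforce cost plus rental cost of the floor surface), the rounding of moves up to whole pallets, and the three monthly volumes are assumptions for illustration; they are not the exact figures of the comparison tables shown in the examples.

```python
import math

C_RENT = 3.45  # warehouse rental cost (Euros/Sqm/Month)
C_DRIV = 3.76  # forklift driver hourly cost (Euros/Hour)

# Ground surface of a full pallet location, including half of the alley (m2)
FULL_SURFACE = (0.8 + 0.1) * (1.2 + 0.15 + 3.05 / 2)

LOCATIONS = {
    "Full Pallet": {"surface": FULL_SURFACE, "moves_per_hour": 15, "capacity": 30},
    "Half Pallet": {"surface": FULL_SURFACE / 2, "moves_per_hour": 13, "capacity": 15},
    "Shelves": {"surface": 2 * FULL_SURFACE / 9, "moves_per_hour": 3.2, "capacity": 4},
}


def monthly_cost(boxes_per_month, location):
    """Assumed model: replenishment workforce cost + rental cost (Euros/Month)."""
    p = LOCATIONS[location]
    # One replenishment move refills the location up to its capacity
    moves = math.ceil(boxes_per_month / p["capacity"])
    hours = moves / p["moves_per_hour"]
    return hours * C_DRIV + p["surface"] * C_RENT


def best_location(boxes_per_month):
    """Return the location type with the lowest monthly cost."""
    return min(LOCATIONS, key=lambda loc: monthly_cost(boxes_per_month, loc))


# Hypothetical monthly volumes: very high, high and low rotation
for boxes in (1000, 150, 4):
    costs = {loc: round(monthly_cost(boxes, loc), 2) for loc in LOCATIONS}
    print(boxes, "boxes/month:", costs, "->", best_location(boxes))
```

Under these assumptions, rent dominates for slow movers and replenishment labour dominates for fast movers, which is why the cheapest location type changes with the rotation of the SKU.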

We compare the three picking location types using very high, high and low rotation SKUs.

Example 1: Very High Rotation

A cost comparison table for SKU 359803, showing the differences in cost for full pallet, half pallet, and shelf storage methods. The table compares the number of boxes, moves, workload time, workforce costs, rental costs, and total cost per month. Full pallets have the lowest total cost, while shelves incur significantly higher costs due to increased workforce and replenishment needs, with a 2698% increase compared to full pallets.
Layout Design for Very High Rotation SKU — (Image by Author)

Conclusion
Full Pallet Location is the best solution

Example 2: High Rotation

A cost comparison table for SKU 363345, displaying the storage cost differences between full pallet, half pallet, and shelf locations. The table compares boxes, moves, workload time, workforce costs, rental costs, and total cost per month. Full pallets are 41% cheaper than shelves, while half pallets offer a middle ground with reduced total costs. Shelves are less efficient for high-rotation SKUs due to their higher workforce and replenishment needs.
Layout Design for High Rotation SKU — (Image by Author)

Conclusion
Half Pallet Location is the best solution

Example 3: Low Rotation

A cost comparison table for SKU 319476, comparing full pallet, half pallet, and shelf storage options for very low rotation SKUs. The table lists boxes, moves, workload time, workforce costs, rental costs, and total monthly costs. Shelves are 65% cheaper than full pallets for low-rotation SKUs, as they require minimal moves and workforce effort, making them the most cost-effective solution for storing low-demand items.
Layout Design for Low Rotation SKU — (Image by Author)

Conclusion
Shelf Location is the best solution

These three cases give an empirical example of how a Pareto analysis can help us optimize a warehouse layout.

Conclusion

Here, we present a simple methodology for visualising and applying the Pareto Principle to your warehouse picking order profile to estimate the potential for optimisation.

What’s next? Order grouping.

Based on your optimized layout, you can build a simulation model to estimate the impact of several Single Picker Routing Problem strategies on your Picking Productivity.

A simple warehouse aisle layout for order picking. The aisles are labeled from A01 to A19, with racks on both sides. There is a start point marked at the bottom left. The image focuses on a simulation of an order picking process, demonstrating a route taken by a picker as they navigate through the aisles to complete their task. The visual aims to help analyze the effectiveness of different picking routes within a warehouse.
Scenario 1: Picking routes with one order picked per wave — (Image by Author)

This is precisely what I did in this series of articles in which I tested several order batching strategies to minimize operators’ walking distance.

Have a look here 👇

Are there other applications of the Pareto principle?

Beyond warehouse layout optimization, this principle can categorize items based on importance.

Product Segmentation for Retail

Let’s assume that you have 50,000 unique references in your portfolio.

Can you allocate the same amount of resources to manage each of them?

Of course not.

An item requires demand forecasting, inventory management, and delivery monitoring.

Product segmentation refers to the activity of grouping products that have similar characteristics and serve a similar market.

A matrix for statistical product segmentation, which categorizes products based on demand variability (vertical axis) and economic value classification (horizontal axis). Products are classified into three groups: “High Importance” with high demand variability and high economic value (A category), “Stable Demand” with low demand variability (B category), and Low Importance with low economic value and low demand variability (C category). It highlights the need to focus on high-importance products
Example of Product Segmentation for Retail — (Image by Author)

In the example above, we apply ABC analysis to categorize your items based on

  • Their contribution to the turnover (A, B and C)
  • The variability of their demand
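The two criteria can be combined in a few lines of Python. This is only a sketch: the 80% and 95% cumulative turnover cut-offs, the coefficient of variation threshold of 1.0, and the sample data are common conventions assumed here, not values taken from the example above. It also assumes every item has a non-zero average demand.

```python
from statistics import mean, pstdev


def abc_class(cumulative_share):
    """ABC class from the cumulative share of turnover (assumed cut-offs)."""
    if cumulative_share <= 0.80:
        return "A"
    if cumulative_share <= 0.95:
        return "B"
    return "C"


def segment(turnover, demand_history, cv_threshold=1.0):
    """Return {sku: (ABC class, demand stability)} for each item."""
    total = sum(turnover.values())
    ranked = sorted(turnover, key=turnover.get, reverse=True)
    result, cumulative = {}, 0.0
    for sku in ranked:
        cumulative += turnover[sku]
        demand = demand_history[sku]
        # Coefficient of variation = standard deviation / mean of the demand
        cv = pstdev(demand) / mean(demand)
        stability = "unstable" if cv > cv_threshold else "stable"
        result[sku] = (abc_class(cumulative / total), stability)
    return result


# Hypothetical turnover (Euros) and demand per period (units): assumed values
turnover = {"S1": 700, "S2": 200, "S3": 70, "S4": 30}
demand_history = {
    "S1": [10, 12, 11, 9],
    "S2": [0, 40, 0, 0],
    "S3": [5, 5, 5, 5],
    "S4": [0, 0, 8, 0],
}

segments = segment(turnover, demand_history)
print(segments)
```

Items that combine a high turnover class with unstable demand are the ones that deserve the most monitoring effort.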

The objective is to allocate resources to implementing advanced monitoring for important items with unstable demand.

What are the applications for this?

  • Inventory Management and Demand Forecasting
  • Customer service and retail operations

If you want to implement this segmentation, look at this article.

About Me

Let’s connect on Linkedin and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.

For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

If you are interested in Data Analytics and Supply Chain, look at my website.

💌 New articles straight in your inbox for free: Newsletter
📘 Your complete guide for Supply Chain Analytics: Analytics Cheat Sheet

Published in TDS Archive

Written by Samir Saci

Top Supply Chain Analytics Writer — Case studies using Data Science for Supply Chain Sustainability 🌳 and Productivity: https://bit.ly/supply-chain-cheat