TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Lean Six Sigma with Python — Logistic Regression

Perform a Logistic Regression to estimate the minimum bonus needed to reach 75% of a productivity target.

Samir Saci
6 min read · Aug 31, 2021


A diagram illustrating the “Minimum Bonus Problem.” The title asks, “What is the minimum bonus level needed to reach 75% of your productivity target?” Three rows show warehouse operators being randomly selected, each row with a worker icon holding a money bag. Arrows point from these workers toward stopwatch icons labeled “Productivity Target?”. Use of Kruskal Wallis Test with Python to evaluate how different bonus levels impact operator productivity, specifically aiming to achieve the target.
Minimum Bonus Problem — (Image by Author)

Lean Six Sigma is a stepwise approach to process improvements that uses statistical methods to validate hypotheses.

In a previous article, we used the Kruskal-Wallis Test to verify the hypothesis that specific training positively impacts operators' Inbound VAS productivity.

Can we use Python to prove that bonuses can improve operators’ productivity?

In this article, we will implement Logistic Regression with Python to estimate the impact of a daily productivity bonus on your warehouse operators' picking productivity.

SUMMARY
I. Improve Warehouse Productivity with Lean Six Sigma
What is the minimum daily incentive needed for 75% of workers to reach their productivity target?
II. Data Analysis
1. Exploratory Data Analysis
Analysis with Python of the sample data from the experiment
2. Fitted Line Plot of your Logistic Regression
What is the probability of reaching the target for each value of daily incentive?
3. Validation with the p-value
Validate that your results are significant and not due to random fluctuation
III. Conclusion
1. Other Lean Six Sigma Tools
2. Uncovering Inefficiencies with Process Mining
3. Automate Productivity Data Collection and Processing

Improve Warehouse Productivity with Lean Six Sigma

Scenario

You are working for the Regional Director of a Logistics Company (3PL) and have 22 warehouses in your scope.

A 3D warehouse layout model showing a logistics fulfillment center. The scene includes shelves stacked with boxes, a conveyor belt system transporting yellow parcels, and two warehouse workers operating nearby. The workspace has packing stations and desks in the foreground. The purpose is to demonstrate a typical workflow in a distribution center, focusing on picking and packing tasks. This image supports discussions on improving warehouse productivity using incentives and Lean Six Sigma methods
Fulfilment Centers with Picking, VAS and Packing — (CAD Model by Author)

In each warehouse, the site manager has fixed a picking productivity target for the operators.

The objective is to find the right incentive policy so that 75% of the operators reach this target.

As a data scientist, can we help her to find it?

Find the right incentive policy.

Currently, productive operators (operators that reach their daily productivity target) receive 5 euros per day in addition to their daily salary of 64 euros.

Is it an efficient incentive? What’s the ROI?

However, in 2 warehouses, this incentive policy is ineffective.

Only 20% of the operators are reaching this target.

What is the minimum daily bonus needed for 75% of the operators to reach their picking productivity target?

Experiment

  1. Randomly select operators in your 22 warehouses
  2. Implement a daily incentive amount varying between 1 and 20 euros
  3. Check if the operators reached their target
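
The three steps above can be sketched in a few lines of Python. This is only an illustration of the experimental design, not the code used for the article: the operator IDs, the sample size, the seed and the helper names `design_experiment` and `reached_target` are all assumptions.

```python
import random

def design_experiment(operator_ids, sample_size, seed=0):
    """Steps 1 and 2: draw a random sample of operators and give each one
    a daily incentive between 1 and 20 euros (both bounds included)."""
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    selected = rng.sample(operator_ids, sample_size)
    return {operator: rng.randint(1, 20) for operator in selected}

def reached_target(units_picked, daily_target):
    """Step 3: boolean outcome recorded for each operator and day."""
    return units_picked >= daily_target

# Hypothetical pool of 200 operators spread over the warehouses
operators = [f"OP-{i:03d}" for i in range(1, 201)]
plan = design_experiment(operators, sample_size=50)

for operator, incentive in list(plan.items())[:3]:
    print(operator, incentive, "euros/day")
```

Drawing both the operators and the incentive amounts at random helps keep the incentive level from being confounded with the site or the team.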

Let’s have a look at the data now.

If you prefer to watch, have a look at the video version of this article

Impact of Incentives on Operators’ Productivity

Exploratory Data Analysis

This dataset shows the incentive amount and a boolean value that informs whether the operator reached the target.

Can we plot this?

Box plot of the sample distribution

Box plot of Incentive distribution by target — (Image by Author)

The median incentive on the days the target is reached is more than twice the median incentive on the days the target is missed.
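
As a minimal sketch, the comparison of medians behind the box plot can be reproduced with the standard library. The ten records below are hypothetical values chosen for illustration; they are not the experiment's dataset.

```python
from statistics import median

# Hypothetical sample: (daily incentive in euros, target reached?)
records = [
    (2, False), (4, False), (5, False), (6, False), (8, False),
    (7, True), (12, True), (14, True), (16, True), (18, True),
]

def median_by_outcome(rows):
    """Return (median incentive when the target is reached,
    median incentive when the target is missed)."""
    reached = [amount for amount, ok in rows if ok]
    missed = [amount for amount, ok in rows if not ok]
    return median(reached), median(missed)

median_reached, median_missed = median_by_outcome(records)
print(f"Median incentive, target reached: {median_reached} euros")
print(f"Median incentive, target missed:  {median_missed} euros")
```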

Have you heard about Logistic Regression?

Fitted Line Plot of your Logistic Regression

Logistic Regression will provide us with a probability plot.

We can estimate the probability of reaching the target for each value of the daily incentive.

Fitted line plot of your sample data — (Image by Author)
  • Confirmation of the current trend
    At 5 euros, the probability of reaching the productivity target is about 20%.
  • We need at least 15 euros incentive per day to ensure a 75% probability of reaching the target.
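
To make the reading of the fitted line concrete, here is a sketch that fits P(target reached) = 1 / (1 + exp(-(b0 + b1 * incentive))) by Newton-Raphson and then inverts the curve to find the incentive that gives a 75% probability. It uses only the standard library, which is not necessarily how the article's plot was produced, and the grouped counts in `sample` are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, iterations=25):
    """Maximum-likelihood fit of P(y=1) = sigmoid(b0 + b1 * x)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p           # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w              # information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step: inverse(X'WX) * gradient
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical results: incentive -> (operators on target, operators observed)
sample = {1: (2, 20), 5: (4, 20), 10: (9, 20), 15: (15, 20), 20: (18, 20)}
xs, ys = [], []
for incentive, (hits, total) in sample.items():
    xs += [incentive] * total
    ys += [1] * hits + [0] * (total - hits)

b0, b1 = fit_logistic(xs, ys)

# Invert the curve: the incentive at which the fitted probability equals 75%
target_probability = 0.75
log_odds = math.log(target_probability / (1 - target_probability))
min_incentive = (log_odds - b0) / b1

print(f"P(target | 5 euros) = {sigmoid(b0 + b1 * 5):.2f}")
print(f"Minimum incentive for 75%: {min_incentive:.1f} euros")
```

The inversion step is the useful part: once b0 and b1 are estimated, any target probability can be converted into a minimum incentive without reading it off the plot.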

Code

Minitab
Menu Stat > Binary Fitted Line Plot

Is it statistically significant?

Validation with the p-value

We need to compute the p-value to check that these results are significant based on sample data.

p-value: 2.1327739857133364e-141
p-value < 5%

The p-value is below 5%, so the effect of the incentive on the probability of reaching the target is statistically significant.
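
One way to obtain such a p-value without a statistics library is a score test of the logistic regression slope (null hypothesis: the incentive has no effect). This is a swapped-in technique chosen because it needs only the standard library; the article's p-value may have been computed differently, and the data below are hypothetical.

```python
import math

def score_test_pvalue(xs, ys):
    """Score test of H0: slope = 0 in a logistic regression of y on x.
    Under H0 the statistic follows a chi-square law with 1 degree of freedom."""
    n = len(xs)
    y_bar = sum(ys) / n   # success rate under the null model
    x_bar = sum(xs) / n
    score = sum(x * (y - y_bar) for x, y in zip(xs, ys))
    variance = y_bar * (1 - y_bar) * sum((x - x_bar) ** 2 for x in xs)
    statistic = score ** 2 / variance
    # Survival function of a chi-square(1) variable: erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical results: incentive -> (operators on target, operators observed)
sample = {1: (2, 20), 5: (4, 20), 10: (9, 20), 15: (15, 20), 20: (18, 20)}
xs, ys = [], []
for incentive, (hits, total) in sample.items():
    xs += [incentive] * total
    ys += [1] * hits + [0] * (total - hits)

statistic, p_value = score_test_pvalue(xs, ys)
print(f"statistic = {statistic:.2f}, p-value = {p_value:.2e}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```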

Conclusion
If you set the incentive at 15 euros per day, you can expect about 75% of the operators to reach their target.


You can find the complete code in this GitHub repository


Conclusion

Based on this experiment, we set the bonus at a minimum of 15 euros per day so that about 75% of the operators reach their productivity target.

What is the Return on Investment?

Before implementing this new incentive policy, you need to check that you have a positive return on investment:

  • What is the total cost to the company (CTC)?
  • How many hours are gained thanks to the productivity increase? (Hours)

After answering these questions, you can estimate the return on investment of this new incentive policy.

Do we have other statistical tools in our Python toolbox?

You can look at the articles below if you are interested in other Lean Six Sigma Methodology applications using Python.

Are we sure that this incentive is the only way to improve productivity?

Uncovering Inefficiencies with Process Mining

While incentives certainly play a role, they may not address underlying inefficiencies in your processes.

This is where understanding and analyzing your processes become crucial.

What if the bottlenecks lie within the processes themselves?

In a related article, “What is Process Mining?” we discover the power of process mining.

This technique uncovers hidden inefficiencies and opportunities for improvement within business processes.

A simplified diagram visualizing different components involved in a supply chain process, including transportation, warehousing, and order management. The icons represent key systems such as a warehouse management system, transportation system, and data analytics tools used for process mining. The image highlights how different operational systems interact to optimize supply chain efficiency.
Example of Supply Chain Information Systems for Process Mining — (Image by Author)

It uses the data generated by systems to gain valuable insights into your operations.

Can we find the root cause of productivity slowdowns?

This can help streamline workflows and enhance productivity without solely relying on incentives.


How do you collect productivity data?

Automate Data Collection from Excel files using Python

During my four years as a supply chain solution manager, I have spent much time collecting and processing unstructured data.

For instance, productivity reports are usually in multiple Excel files, requiring manual collection and processing to generate a productivity report.

Can we automate this process? Yes!

In a separate article, I introduce a Python workflow that automatically extracts and processes accounting data from Excel files.

A screenshot of a detailed cost breakdown table in an Excel format, titled “Operations Costs,” showing monthly costs from 01/08/2017 to 31/08/2017. The table lists different equipment types such as trucks, batteries, and electronic devices, along with their associated costs for renting, investments, depreciation, and purchasing. Columns include rental units, unit cost, depreciation, and VAT percentage. The purpose of this data is to support the automation of Excel data extraction using Python.
Examples of input files — (Image by Author)

These files respect a specific format, like your productivity files, which the script uses to extract the correct information.

The workflow is completely automated, and an executable file is used to share the tool with users who can’t use Python.

For more information about this automation tool, check this article.

About Me

Let’s connect on LinkedIn and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.

For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

If you are interested in Data Analytics and Supply Chain, look at my website.

