TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial…

Supply Chain Sustainability Reporting with Python

Samir Saci
TDS Archive
Published in
11 min readDec 3, 2021

--

The image shows a supply chain sustainability report illustrating CO2 emissions from different transportation methods. A distribution center supplies goods to customers through three routes: (1) via road (120 km), air (1450 km), and road again (700 km); (2) directly by road for 200 km; and (3) through road, sea, and road again. Each mode of transportation highlights the distances, representing the CO2 emissions linked to the transportation network’s sustainability calculated with Python.
Supply Chain Sustainability Reporting — (Image by Author)

The demand for transparency in sustainable development from investors and customers has grown.

Investors have increasingly emphasised business sustainability when assessing an organization's value and resiliency.

How can you support the sustainable transformation of your company with Data Science?

Therefore, more organizations are investing resources to build capabilities for sustainability reporting and determine the best strategies for a sustainable supply chain.

We can automate Sustainability Reporting with Python.

In this article, we will introduce a simple methodology to report the CO2 emissions of your Distribution Network using Python and PowerBI.

SUMMARYI. How to create a Supply Chain Sustainability report?
Calculate the CO2 emissions of multi-modal transportation ?
II. Data Processing with Python
List of the datasets needed to build your report
1. Import Shipped Order Lines
All the purchased orders of your customers that have been shipped
2. Add Unit of Measure Conversions
Merge with Master Data to convert quantities in weight
3. Distances Collections
Collect the distances by transportation mode
III. CO2 Calculation
1. Final Calculation of the CO2 Emissions
2. What can you do if you have missing data?
VI. Example of visualization using PowerBI
V. Conclusion
What the next step?
1. Transportation Route Visualization with Python
Provide insights to support the decarbonization of transportation.
2. Simulate different initiatives with a Digital Twin
Answer questions like: "What if you deliver only large customers?"
3. Have you heard about ESG Reporting?
What are the benefits for the company in terms of compliance?
4. Expand to Warehouse Operations
Cover the entire distribution chain with initiative to reduce packaging material usage
5. Automatically Generate PowerPoint Report with Python
Create PowerPoint slides with visuals and comments

How do you create a Supply Chain Sustainability report?

Definition of CO2 emissions by Scope

Based on the GHG Protocol corporate standard, greenhouse gas emissions are classified into three scopes:

  • Scope 1: direct emissions released into the atmosphere because of the company’s activities (Company’s facilities like manufacturing plant/warehouses, company’s vehicles)
  • Scope 2: indirect emissions from the generation of purchased energy
    (purchased electricity, gas, ..)
  • Scope 3: all indirect emissions (out of scope 2) occurring in the value chain of the company (Transportation, Waste of Operations, Business Travels, …)

In this article, we will focus on the Scope 3 calculations related to downstream transportation.

What is the environmental impact of your distribution network?

The Formula of CO2 emissions using Emissions Factors

A mathematical formula to calculate CO2 emissions based on emissions factors. The formula is structured as follows: “CO2 Emissions = Distance × Weight × Emission Factor.” This equation calculates the carbon dioxide emissions by multiplying the distance traveled by the weight of the goods transported and the emission factor (representing the rate of emissions per unit of weight and distance). The formula is used in the context of transportation-related emissions calculations.
Formula using Emission Factor — (Image by Author)
With,E_CO2: emissions in kilograms of CO2 equivalent (kgCO2eq)
W_goods: weight of the goods (Ton)
D: distance from your warehouse to the final destination(km)
F_mode: emissions factor for each transportation mode (kgCO2eq/t.km)

This formula provides a gross estimation of the CO2 emissions without requiring a high granularity of transportation data.

A more accurate approach would be to estimate the CO2 emissions of each delivery, considering the vehicle model (truck, container carrier, plane, or train) and the filling rate.

Calculating Supply Chain Carbon Footprint with Python

We can now collect data to calculate the emissions based on this formula.

The image shows a data model for calculating supply chain CO2 emissions. “Master Data” includes item details like net weight. “Shipped Order Lines” contains shipment info (order number, warehouse, customer). “Business Units” holds warehouse data, while “Address Book” lists customer locations. “Distance by Mode” records transport distances (road, sea, air, rail) between warehouses and customers, used for CO2 emission calculations based on shipment and distance data.
Data Collection — (Image by Author)

Import Shipped Order Lines

Let‘s start by extracting from our ERP (or WMS) the shipped order lines: all purchased orders of your customers that have been shipped from your warehouses.

This dataset includes

  • Order information: Order Number, Line Number
  • Item Code: Identification number of the item in the master data
  • Warehouse Code: the business unit from where these orders are shipped (Distribution Center, Factory)
  • Customer Code: the final destination where the orders are delivered (Store, Distribution Center)

Code

Result

The image shows a table displaying order data for calculating CO2 emissions. It lists 5,288 order lines with columns for Date, Month-Year, Warehouse Code, Customer Code, Order Number, Order Line Number, Item Code, Units, and Euros. This data is used in the context of supply chain sustainability reporting to analyze emissions based on shipped orders from various warehouses to customers.
Shipped order lines data frame — (Image by Author)

Add Unit of Measure Conversions

The next step is to convert the quantities ordered into weight (kg).

Net Weight vs. Total Weight
Before discussing the calculation's details, we must explain the difference between gross and net weight.

A diagram shows a watch next to its packaging and their combined weight. It represents the need to calculate both the product’s and its packaging’s weight for shipping. The image demonstrates that the total weight equals the sum of the product’s net weight and the packaging weight. This concept is vital for accurate logistics calculations when converting units into transportation weight for shipment.
Total Weight including packaging and the product — (Image by Author)

Packaging is the container used to cover your finished product. In some ERP master data, you may find the net weight (without packaging) and the gross weight (with packaging).

For this report, we need to take the gross weight to estimate the total weight, including packaging.

Handling Unit (Cartons, Pallet)
Depending on the order quantity, your customer can order by unit, carton (grouping several units), or pallet (grouping several cartons).

This diagram shows the lifecycle of a product (watch) from individual units to carton packaging and then palletization for shipping. It highlights different weights calculated at each stage: unit weight, carton weight, and pallet weight. It also shows how full pallets and mixed cartons are palletized for shipment, with weight references for each step: units, cartons, and full pallets. This is crucial for supply chain weight conversion in logistics.
Total Weight including cartons and pallets — (Image by Author)

If you are lucky enough to have the weights of cartons or pallets, you can use them if your customer is ordering full cases or full pallets.

Assumption of Mixed Cartons
For certain logistic operations where you must perform piece picking (Luxury, Cosmetics, E-Commerce), the quantity per order line is so low that you rarely ship full cartons.

A flowchart shows three types of order packaging: full pallets, cartons, and individual units. Each order type follows a distinct path for packaging and palletization. For full pallets, the weight reference is the pallet; for cartons, it’s the carton, and for individual units, it’s converted into weight after being packed and palletized. This diagram visualizes how different order types are handled in supply chain processes, with weight reference at each stage of transportation.
Weight Reference based on the handling unit for the shipping order— (Image by Author)

In this case, there is no point in using full carton weight.

We can only rely on the total weight per unit.

For our example, we will assume that we are in this situation.

Code

Results

A table with columns such as Date, Warehouse Code, Customer Code, Order Number, Item Code, Units, Euros, and Conversion Ratio. The data represents 5,268 order lines, showcasing how each unit conversion is processed in a supply chain context. The conversion ratio calculates the weight of each unit in relation to the product ordered, essential for managing logistics weight and costs. The dataset includes information such as warehouse, customer, and order details.
Shipped order lines with conversion ratios — (Image by Author)

Distances Collections and GPS Locations

We need to collect the distance by mode:

  • Air Freight
  • Sea Freight
  • Road Transportation: Trucks
  • Rail Transportation: Trains
Formula using Emission Factor — (Image by Author)

We will also add the GPS locations of the destination for our PowerBI reporting.

A table showing a dataset of 5,208 shipped order lines. The table includes columns for warehouse code, customer code, order number, item code, transportation modes (road, rail, sea, air), and delivery location (with GPS coordinates). Each row represents an individual order line, including key data such as item weight in kilograms (KG). The dataset provides insights into shipping operations and delivery details, including the mode of transportation used for each order.
Final DataFrame after processing — (Image by Author)

Code

🏫 Discover 70+ case studies using data analytics for supply chain sustainability🌳and business optimization 🏪 in this: Cheat Sheet

CO2 Emissions Calculations using Emissions Factors

Final Calculation of the CO2 Emissions

We now have all the information to be gathered in a single data frame.

We can start calculating the CO2 emissions using emissions factors associated with your transportation network.

Sum the weight by Order.
For reporting purposes, let us calculate the CO2 emissions for each order number (linked with a customer and a date).

Final Results

A dataset of shipped orders displayed in a table, with columns including warehouse code, customer code, order number, item code, transportation mode (road, rail, sea, air), and GPS coordinates of the delivery location. The table highlights data used for sustainability calculations, such as item weight and the transportation method. The image shows a slice of the data that helps in calculating carbon emissions for transportation activities.
Final Report of CO2 Emissions by Shipping Line — (Image by Author)

What if you don’t have 100% of the distance?

A significant challenge here is to get the distances if you have several thousand delivery locations.

If you are not able to collect 100% of the distance from your carriers, you can:

  • Get the road distances using Google Maps API / OR
  • Estimate the distance using the Haversine formula / OR
  • Compute the average distance of your top customers (in Euros) and apply the average to the locations without distances

What if you don’t have 100% of the weight conversions?

Sometimes, the master data is not updated, and you cannot get the unit-of-measure conversions for all the items.

In that case, you can

  1. Focus your data collection efforts on the high runners (turnover)
  2. Estimate the weight per euro for these items (kg/euros)
  3. Apply the average ratio on the c remaining items without conversions to get their weight

Please find the complete code with dummy Data in my GitHub repository:

💡 Follow me on Medium for more articles related to 🏭 Supply Chain Analytics, 🌳 Sustainability and 🕜 Productivity.

Example of visualizations using PowerBI

Bubble map with size = f(CO2 Total)

A map visualization showing various cities in Europe, marked with blue bubbles. The size of the bubbles represents the total CO2 emissions. Locations include cities in the UK, France, and Germany, illustrating CO2 emissions across different regions of Europe.
Bubble Map of the CO2 emissions by locations— (Image by Author)

Visual Insights
You can observe where you have the majority of CO2 emissions (large bubbles) with a colour coding by transportation mode.

Split by Country Destination and Item Code

A bar chart with horizontal bars representing the total CO2 emissions by customer country. Germany has the highest emissions, followed by the United Kingdom, France, Bulgaria, and Mauritania.
Bar Chart — (Image by Author)

Product Portfolio Insights
For each market, which item has the highest environmental impact?

CO2 = f(Turnover) by City Destination

A scatter plot comparing total CO2 emissions (Y-axis) with revenue in Euros (X-axis) across various cities. Cities such as Moissy-Cramayel, Belleville, Metz, and Bristol are highlighted, showing the relationship between revenue and CO2 emissions, with Moissy-Cramayel standing out with high revenue and emissions.
Scatter Plot — (Image by Author)

Financial Insights
Your future efforts to reduce CO2 emissions will probably have a higher impact on the profitability of the PEINE-WOLTORF customers.

Next Steps

You have now generated a simple dashboard to visualize the emissions of your distribution network with different granularities.

You’re going to be audited; be prepared!

Your results will be used for non-financial reporting, so your data extraction and processing process may be audited internally or by external actors.

What’s next?

Provide insights to support the decarbonization of your logistic operations.

Visualization of your Transportation Network

After measuring your emissions, a potential next step could be to analyze your Distribution Network.

A map of Eastern China displaying transportation routes as part of a supply chain distribution network. The map shows 14 routes dated 2016–09–01, represented by lines connecting various cities including Hangzhou, Shanghai, and Nanjing. The purpose is to visualize transportation paths within the network, providing insights for optimizing and decarbonizing logistic operations. The lines indicate different transportation modes across key urban hubs.
Transportation Routes from a Warehouse in Shanghai — (Image by Author)

You want to look for opportunities to

  • Maximize the filling rate of trucks (% of the volume occupied)
  • Combine shipments to reduce the number of trucks in your fleet
  • Minimize the empty run distance (backhaul)

You can support this process with Python.

For more details, have a look at this article 👇

The reengineering team has drafted multiple initiatives to reduce the carbon footprint.

They require your support to estimate their impact.

Let’s simulate several ‘What if’ scenarios.

Simulate different initiatives with a Digital Twin

A digital twin is a digital replica of a physical object or system.

A Supply Chain digital twin is a computer model representing various components and processes involved in the supply chain.

“A supply chain digital twin simulation flowchart depicting various supply chain components. Icons of factories, trucks, warehouses, and retail stores are connected. Python symbols are associated with each step, representing different stages from production, warehousing, transportation, to store replenishment. Historical sales data flows from stores to generate replenishment orders, simulating different scenarios and their impact on CO2 emissions, service levels, and costs.
Supply Chain Digital Twin Model with Python — (Image by Author)

After you have measured your baseline of CO2 emissions and fixed your reduction target, you can start building a roadmap.

What do we want to achieve?

For each initiative, you can use this model to estimate the impact on the

  • CO2 emissions: % of reduction
  • Service Level: delivery lead time(s)
  • Costs: additional CAPEX and operational costs
Illustration of different production initiatives in a manufacturing setting, showing a factory icon on the left and icons representing questions, raw material supply, cost-saving measures, packaging, and quality control on the right. These initiatives are part of a broader supply chain sustainability strategy aimed at reducing CO2 emissions, improving service levels, and optimizing costs through digital twin simulations.
Example of component of your digital twin — (Image by Author)

And validate the initiatives that reduce emissions without impacting the service level while respecting budget constraints.

What is the most effective initiative?

Initiative 1: you want to set up local warehouses and use electric vehicles for the last-mile delivery

  • What would be the impact on the transportation costs?
  • What would be the impact of warehousing costs (with more locations)?
  • How much CO2 emissions reduction can we reach?

Initiative 2: you would like to stop air freight to reduce your CO2 emissions

  • What would be the impact on the stores’ replenishment lead times?
  • How far ahead do the distribution planners need to create replenishment orders?
  • What would the impact be on the reduction of CO2 emissions?

Initiative 3: you want to set up additional factories to produce locally for all markets

  • What would be the impact on the production costs?
  • What would be the impact of transportation costs (with factories close to the warehouse locations)?
  • How much CO2 emissions reduction can we reach?

Adapt your model to include each initiative and see how much your service level is impacted.

Then, you can adapt the other metric (e.g., improving warehouse capacity, adding trucks) until you restore the service level.

For more details on how to implement it, 👇

The top management is ready to invest in your initiatives, but they want to know the company's return on investment.

Have you heard about ESG reporting?

Expand to ESG Reporting

Environmental, Social, and Governance (ESG) reporting can be defined as a method by which corporations disclose their governance structures, societal impacts, and ecological footprint to shareholders.

An illustration of ESG (Environmental, Social, Governance) principles displayed in three house-shaped sections. The Environmental section includes carbon footprint reduction, climate change strategy, waste reduction, and energy efficiency. The Social section focuses on fair wages, equal job opportunity, health and safety, responsible suppliers, and labor law compliance. The Governance section highlights corporate governance, risk management, ethical business practices, and transparency.
ESG Pillars — (Image by Author)

CO2 emissions reports are included in environmental reporting (E).

This can be completed with a whole life cycle assessment, which includes natural resource usage and waste generation.

ESG also considers the social impact and governance, tackling topics like diversity, inclusion and CSR.

How can we use data analytics to improve ESG reporting?

💡 For more information about ESG Reporting,

Now that you can estimate the impact on compliance and ESG scoring, you can expand the scope of re-engineering to the whole distribution chain.

We focused on transportation. What about warehousing?

Consider local initiatives

This article's example focused on reducing the CO2 emissions of your transportation network by optimizing flows and warehouse locations.

However, smaller local initiatives, such as managing waste and cutting consumable usage, can be implemented to reduce the footprint of warehouse operations.

An illustration of two pallets being handled in a warehouse setting. On the left, a stacked pallet is shown, and on the right, a similar pallet is placed on a wrapping machine platform. The pallets are stacked with uniform boxes, and a manual pallet jack is used for transport. The image illustrates warehouse processes such as pallet wrapping for securing goods before transportation. This highlights the optimization of warehousing operations to reduce waste and improve efficiency.
Reduce Wrapping Film Usage — (Image by Author)

Your digital twin can also help you estimate the impact of these small initiatives on the overall network (if applied to all warehouses).

  1. Calculate the current consumption using actual operational data
  2. Include these parameters in your digital twin
  3. After a Proof Of Concept estimate, the savings target
  4. Extrapolate the savings (per warehouse) to the whole network

All these initiatives align with the efforts of cost reduction and operational excellence.

You can adapt your continuous improvement methodologies to track consumables usage and find alternatives that will not impact your productivity.

For more details about it,

How would you like to share your results?

Even if the adoption of visualization tools like PowerBI has improved, people still rely on PowerPoint slides to communicate.

Automate PowerPoint Slide Creation with Python

Do you want to spend more time creating PowerPoint presentations than analyzing the results of this analysis?

I guess no!

Don’t worry.

I have a solution for you, which I shared in another article.

A flowchart representing the process of generating a report from warehouse data. It begins with five weeks of operational activity data stored in a warehouse database. SQL queries are used to extract the prepared order lines. Then, Python scripts process and calculate KPIs, which are inserted into a PowerPoint report. The final report includes activity data per week, such as warehouse workload and order profile.
Automate PowerPoint Slides Creation with Python — (Image by Author)

The idea was to create a completely automated workflow to extract and process data from multiple systems to generate graphs and comments that will be included in PowerPoint slides.

Two slides from a generated PowerPoint report. On the left, the slide shows a bar chart titled “Warehouse Workload (WEEK-1),” with an analysis indicating that 14,583 lines were prepared, with Sunday being the busiest day. On the right, the slide shows a bar chart titled “Order Profile,” displaying the split of orders by the number of lines per order, with 31,874 orders prepared. Both visuals were generated using Python.
Example of Automatically Generated PowerPoint Slides — (Image by Author)

The Python library Python-ppt provides complete freedom to create tailored slides that can be filled with automatically generated data.

You can adapt this workflow for our CO2 emissions reporting tool.

For more details, have a look at the article linked below 👇

If you prefer to watch, have a look at the video version of the article

About Me

Let’s connect on Linkedin and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.

For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

If you are interested in Data Analytics and Supply Chain, look at my website.

💌 New articles straight in your inbox for free: Newsletter
📘 Your complete guide for Supply Chain Analytics: Analytics Cheat Sheet

References

[1] GHG Protocol corporate standard, Greenhouse Gas Protocol, Link

[2] French Environmental Agency Ademe, Bilan GES, Link

[3] GitHub Repository with source code and data, Samir Saci, Link

--

--