TDS Archive

What Greenwashing Is, and How We Can Use Analytics to Detect It

Samir Saci
Published in TDS Archive
9 min read · Aug 10, 2023

The illustration depicts a decision flowchart showing how data analytics can help identify greenwashing. The flow starts with the question, “Is it backed by real data?” If the answer is “Yes,” the flow leads to “It’s trustworthy,” indicated by a green “YES.” If the answer is “No,” the flow leads to “It’s greenwashing,” indicated by a red “NO.” This visual highlights the importance of data validation in distinguishing genuine sustainability efforts from deceptive marketing.
Use Data to Detect Green Washing — (Image by Author)

Greenwashing is the practice of making misleading claims about the environmental benefits of a product or a service to communicate a false image of sustainability.

How can we use analytics to help the world fight greenwashing?

Embellishing claims or hiding falsehoods in this way has become a common problem as companies compete for the attention of environmentally conscious consumers.

The graphic displays five “sins” of greenwashing, represented by icons and corresponding labels: “Lies” (a figure with a lying gesture), “Vagueness” (a question mark over a dollar symbol), “Proof-less” (a magnifying glass over a document), “Irrelevance” (a hand holding an unrelated puzzle piece), and “Trade-off” (a hand exchanging money for goods). These elements illustrate common misleading practices companies use to falsely appear environmentally responsible.
Five sins of greenwashing — (Image by Author)

In this article, we will delve into greenwashing to explain its manifestations.

We will use case studies to show how to use data analytics to detect and prevent these unethical practices.

Summary
I. Understanding Greenwashing
1. What is Greenwashing?
2. Examples of Greenwashing
3. Greenwashing x Data Analytics
II. Data Analytics for Greenwashing Detection
1. The difficult task of detection
2. Natural Language Processing (NLP)
3. Change Point Analysis
4. Regression Analysis
5. Network Analysis
III. Conclusion

Understanding Greenwashing

I discovered greenwashing while conducting my first supply chain sustainability project.

As a supply chain solution manager, my task was to estimate the environmental footprint of our customers' logistics operations.

How can a company selling disposable plastic products claim to be carbon neutral?

It was surprising to see the claims of some of their competitors, considering that they were producing and selling similar products.

This article aims to show you how analytics tools can help you detect this kind of false claim.

What is Greenwashing?

Greenwashing is a portmanteau of ‘green’ and ‘whitewashing’.

Organizations use this dishonest practice to create a false impression of environmental responsibility.

The objective is to capitalize on customers' and investors' growing demand for eco-friendly products.

The most common forms of greenwashing include:

  • Vagueness: undefined terms such as ‘eco-friendly’ or ‘all-natural’ used without clear definitions or evidence.
    For example, a company labels a product as ‘100% natural’ without disclosing that the natural materials were unsustainably sourced.
  • Irrelevance: highlighting an eco-friendly attribute that is either unimportant or unrelated to the product’s environmental impact.
    For example, a company emphasises that its product is ‘CFC-free’ even though chlorofluorocarbons (CFCs) have been banned for decades.
  • Hidden trade-offs: promoting one environmentally friendly aspect of a product while ignoring other significant impacts.
    For example, a paper company promotes its use of recycled paper without mentioning energy consumption and carbon emissions for production and logistics.
What is greenwashing? — (Image by Author)

When you see an advertisement for a naturally sourced recycled t-shirt, consider:

  • The amount of energy, electricity and water used to source these “natural raw materials.”
  • The additional CO2 emissions and waste generated by the recycling process

Life cycle assessment (LCA) is a data-driven method that helps you avoid this trap by evaluating these impacts across the entire product life cycle.

A flowchart illustrating the life cycle of a fast-fashion retail product. It begins with raw material extraction (represented by a cloud), followed by production in a factory, transportation by ship, storage in a warehouse, delivery by truck, and sale in a retail store. The cycle ends with product disposal. The chart represents the entire life cycle of a product from production to disposal, as part of a life cycle assessment (LCA) to evaluate environmental impacts.
Life Cycle Assessment of your ‘100% natural’ recycled T-shirt — (Image by Author)

The idea is to estimate the environmental impact of sourcing, producing and using a specific product or service.

This requires collecting and processing data from multiple sources using Business Intelligence tools.
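
As a minimal sketch of this idea, the snippet below adds up the emissions of each life cycle stage for a single T-shirt and shows which stage weighs the most. The stage names follow the figure above. All emission values are placeholder numbers chosen for illustration, not measured data.

```python
# Illustrative life cycle assessment (LCA) roll-up for one product unit.
# All figures are placeholder assumptions, NOT measured data.
stage_emissions_kg_co2e = {
    "raw_materials": 2.0,
    "production": 3.0,
    "transportation": 0.5,
    "warehousing": 0.2,
    "delivery": 0.3,
    "retail": 0.4,
    "disposal": 0.6,
}


def lca_total(stages):
    """Sum emissions over every life cycle stage (kg CO2e per unit)."""
    return sum(stages.values())


def stage_shares(stages):
    """Return each stage's share of the total footprint (values sum to 1)."""
    total = lca_total(stages)
    return {name: value / total for name, value in stages.items()}


total = lca_total(stage_emissions_kg_co2e)
shares = stage_shares(stage_emissions_kg_co2e)
largest = max(shares, key=shares.get)

print(f"Total footprint: {total:.1f} kg CO2e per unit")
print(f"Largest contributor: {largest} ({shares[largest]:.0%})")
```

A claim that focuses on a single stage (the raw materials, for instance) can then be compared with the share that this stage actually represents in the total.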

Let us analyze some real examples.

Examples of Greenwashing

Several high-profile cases have brought greenwashing into the spotlight.

  • A large car manufacturer was found guilty of using software in its vehicles to cheat emissions tests on cars marketed as “environmentally friendly”.
  • A famous water company advertised its product as “carbon-negative” without acknowledging the environmental cost of transporting bottled water from Fiji to global markets.

The second claim can be challenged with basic supply chain analytics and publicly available data.

World map showing a supply chain route from factories in Asia to retail stores in North America and Europe. Two factory icons in Asia with arrows connecting to store icons in the US and Europe, symbolizing product transportation routes. The red arrow from Asia to the US indicates a longer, less sustainable route, while the green arrow from Asia to Europe suggests a shorter, more sustainable route. The image illustrates the global environmental cost of long-distance supply chains.
Supply Chain Flows Analysis — (Image by Author)

How?

  1. Estimate sales volumes by market using the financial reports
  2. Calculate the emissions per bottle using the GHG Protocol for the transportation from manufacturing plants to markets
  3. Compare the results with the figures published by the company
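
These three steps can be sketched in a few lines of Python. The snippet below uses a distance-based calculation (weight × distance × emission factor) for step 2. The market names, volumes, distances, bottle weight, emission factor and the company's reported figure are all placeholder assumptions, so treat it as a template rather than an actual audit.

```python
# Distance-based transport emissions: E = weight x distance x emission factor.
# Every number below is a placeholder assumption for illustration only.
from dataclasses import dataclass


@dataclass
class Lane:
    market: str
    bottles_sold: int         # step 1: volume estimated from financial reports
    distance_km: float        # plant-to-market distance
    factor_kg_per_tkm: float  # kg CO2e per tonne-km for the transport mode


BOTTLE_WEIGHT_KG = 1.0  # assumed gross weight of one filled bottle


def lane_emissions_kg(lane):
    """Step 2: emissions of one transport lane in kg CO2e."""
    tonnes = lane.bottles_sold * BOTTLE_WEIGHT_KG / 1000
    return tonnes * lane.distance_km * lane.factor_kg_per_tkm


lanes = [
    Lane("Market A", 1_000_000, 9_000, 0.015),
    Lane("Market B", 500_000, 16_000, 0.015),
]

total_kg = sum(lane_emissions_kg(lane) for lane in lanes)
total_bottles = sum(lane.bottles_sold for lane in lanes)
per_bottle_g = 1000 * total_kg / total_bottles

# Step 3: compare with the figure published by the company (hypothetical).
reported_per_bottle_g = 100.0
gap_g = per_bottle_g - reported_per_bottle_g

print(f"Estimated: {per_bottle_g:.0f} g CO2e per bottle")
print(f"Gap vs. reported figure: {gap_g:+.0f} g CO2e per bottle")
```

A large positive gap does not prove fraud, but it tells you which figures to question first.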

Beyond reporting, how can data analytics help us detect this kind of fraud?

Greenwashing x Data Analytics

Understanding the various forms and implications of greenwashing is crucial to implementing proactive measures to tackle this problem.

While regulatory bodies and conscious consumers play a significant role in this fight, data analytics can provide an additional boost by automating fraud detection.

Diagram illustrating the decision process to detect greenwashing using data analytics. It shows a series of interconnected icons, representing various sustainability metrics and analytics tools, such as data analysis, machine learning (NLP), and environmental impact assessment. On the left, ‘YES’ leads to successful detection of greenwashing, and on the right, ‘NO’ leads to questions about sustainability claims. The flow of analysis highlights how data can be used to verify or debunk fakes.
Green Washing Detection using Data Analytics — (Image by Author)

The idea is to use…

  • Publicly available data: financial and sustainability reports, footprint databases, social media
  • Advanced analytics models (NLP, forecasting or statistical models) to detect fraud

The following sections will explore how to use these tools to promote a more transparent and sustainable corporate world.

🏫 Discover 70+ case studies using data analytics for supply chain sustainability🌳and business optimization 🏪 in this: Cheat Sheet

Data Analytics for Greenwashing Detection

The difficult task of detection

Identifying greenwashing is a complex task, given the variety of its manifestations and the overwhelming volume of information available.

Data analytics can provide powerful tools for filtering large datasets, identifying patterns and anomalies, and extracting valuable insights.

Data Analytics for GreenWashing Detection — (Image by Author)

In the following sections, we will apply these solutions to examples of potential fraud.

Let us start with text analysis.

Natural Language Processing (NLP)

A primary application of NLP in greenwashing detection is sentiment analysis.

Let us consider the example of major oil companies.

They regularly publish sustainability reports and press releases highlighting their commitment to environmental protection.

The data at our disposal consists of the PDF documents published on their websites.

A visual comparing the financial growth of a company and its CO2 emissions. The financial reports from 2018 to 2020 show an increase in profitability, but CO2 emissions follow a similar upward trajectory, with a sharp rise in 2020 by 200k tons CO2eq. This reveals a contradiction between financial success and environmental sustainability, highlighting potential greenwashing.
Sentiment Analysis vs. CO2 emissions from reports — (Image by Author)

An NLP sentiment analysis model can evaluate the sentiment behind these statements.

💡 How to detect greenwashing?
If the statements carry overly optimistic sentiments not reflected in the actual environmental performance metrics, it could be a sign of greenwashing.

For instance, in the example above:

  • Total CO2 emissions rose sharply in 2020: +26k tons of CO2eq
  • However, the sentiment score kept increasing

There is a contradiction between the actual sustainability performance and the narrative sold in the report.
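
The check can be sketched with a toy example. The snippet below scores the tone of each yearly report with a tiny word list and flags the years in which the tone improves while emissions rise. This lexicon approach is a deliberately simple stand-in for a trained sentiment model. The word lists, report extracts and emission figures are all made up for illustration.

```python
# Toy lexicon-based tone score compared with reported emissions.
# Word lists, report extracts and emission figures are illustrative assumptions.
import re

POSITIVE = {"sustainable", "green", "clean", "committed", "leader", "responsible"}
NEGATIVE = {"missed", "spill", "delay", "penalty"}


def tone_score(text):
    """(positive - negative) word count divided by total words; 0.0 if empty."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


reports = {
    2018: "We missed our target but remain committed to clean energy.",
    2019: "We are a responsible and committed company investing in clean energy.",
    2020: "We are a green, sustainable, responsible leader committed to clean energy.",
}
emissions_kt = {2018: 100.0, 2019: 98.0, 2020: 124.0}  # placeholder kt CO2eq


def contradictions(reports, emissions):
    """Years where the tone improves while emissions increase."""
    years = sorted(reports)
    flagged = []
    for prev, curr in zip(years, years[1:]):
        tone_up = tone_score(reports[curr]) > tone_score(reports[prev])
        emissions_up = emissions[curr] > emissions[prev]
        if tone_up and emissions_up:
            flagged.append(curr)
    return flagged


flagged_years = contradictions(reports, emissions_kt)
print("Years to investigate:", flagged_years)
```

With real reports, you would first extract the text from the PDF files and replace `tone_score` with the sentiment model of your choice.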

What about sustainability indicators? We can use their trends.

Change Point Analysis

Change point analysis identifies the points in a data sequence where the statistical properties change.

For instance, a major automobile manufacturer reports a sudden decrease in CO2 emissions.

A graph showing CO2 emissions over time, with a sudden drop detected around 2020. The sharp reduction indicates a possible statistical anomaly, which could suggest greenwashing if not backed by consistent, long-term efforts to reduce emissions. The chart uses a change point analysis method to identify when emissions behavior significantly changed.
Example of a potential anomaly in the correlation of manufacturing outputs vs. emission — (Image by Author)

The available data would include a time series of the company’s reported emissions and manufacturing outputs.

💡 How to detect greenwashing?
Change point analysis can help determine whether these reductions correspond to

  • Legitimate and continuous sustainability efforts
  • Temporary drops, which may suggest greenwashing

I have used a CO2 emissions dummy data set and applied the Python library ruptures:

A line chart split into two sections, one representing the period of positive sentiments in sustainability reports and the other showing a drop in actual performance metrics such as CO2 emissions. The increasing sentiment score contradicts a sharp rise in emissions, indicating potential greenwashing through optimistic yet unsupported sustainability claims illustrating change point analysis used to detect greenwashing.
Change Point Detection Example — (Image by Author)

It has detected a major change in the 9th year, which we should investigate.

This is only an initial assessment: the reduction may be due to the impact of actual initiatives.

And you can verify it in the detailed initiatives shared in the sustainability report.
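
To illustrate the principle, here is a simplified, dependency-free sketch that finds a single change point by least squares. It is not the code behind the visual above and it does not use ruptures, which provides more robust algorithms (PELT or binary segmentation, for instance) that can handle several change points. The emissions series is made-up dummy data with a drop in the 9th year.

```python
# Single change point detection by least squares on a piecewise-constant mean.
# Simplified stand-in for a library such as ruptures; the series is dummy data.


def sse(values):
    """Sum of squared deviations from the segment mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_change_point(series, min_size=2):
    """Return the index k that best splits series into series[:k] and series[k:]."""
    candidates = range(min_size, len(series) - min_size + 1)
    return min(candidates, key=lambda k: sse(series[:k]) + sse(series[k:]))


# Dummy yearly emissions (kt CO2eq): stable for 8 years, then a sudden drop.
emissions = [100, 102, 99, 101, 103, 100, 102, 101, 70, 69, 71, 70]

k = best_change_point(emissions)
change_year = k + 1  # 1-based year in which the new regime starts
mean_before = sum(emissions[:k]) / k
mean_after = sum(emissions[k:]) / (len(emissions) - k)

print(f"Change detected in year {change_year}")
print(f"Average before: {mean_before:.1f} kt, after: {mean_after:.1f} kt")
```

The split that minimizes the total squared error is the year to investigate. The size of the shift between the two averages tells you how much the sustainability report has to explain.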

Have you heard about correlation?

Regression Analysis

Regression analysis can help establish relationships between different variables.

For instance, a major fashion brand reports sustainability expenditure (Euros) and waste production levels (Tons).

💡 How to detect greenwashing?
A regression model can test whether increases in sustainability expenditure are associated with proportionate decreases in waste production.

If not, this could be an indication of greenwashing that requires further investigation.

  • This is not a univariate problem, as waste can be impacted by many other parameters (product design, raw materials, …)
  • A product-centred approach (LCA) is preferred to track how expenditures affect the supply chain's environmental footprint.
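
A minimal sketch of this test, assuming yearly figures for a single brand: we fit a simple linear regression of waste against expenditure and look at the slope and the R². The data and the R² threshold of 0.5 are arbitrary assumptions for illustration. A real analysis would include the other drivers listed above.

```python
# Simple linear regression of waste (tons) on sustainability spend (EUR million).
# The yearly figures and the 0.5 threshold are placeholder assumptions.


def ols(x, y):
    """Return (slope, intercept, r_squared) of a least squares line y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy else 0.0
    return slope, intercept, r_squared


spend_meur = [1.0, 2.0, 3.0, 4.0, 5.0]
waste_tons = [500.0, 498.0, 503.0, 499.0, 501.0]

slope, intercept, r2 = ols(spend_meur, waste_tons)

# Heuristic flag: spending rises, but waste does not clearly fall with it.
suspicious = slope >= 0 or r2 < 0.5

print(f"Slope: {slope:+.2f} tons per EUR million, R² = {r2:.2f}")
print("Needs investigation:", suspicious)
```

Here the expenditure is multiplied by five while the waste level barely moves, so the claim deserves a closer look.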

Can we connect companies to polluting suppliers?

Network Analysis

Network analysis helps to understand relationships between entities in a network.

A company in the electronics sector might claim that its products are sourced from sustainable and ethical suppliers.

💡 How to detect greenwashing?
The data here would include the company’s supplier network and third-party reports on supplier practices.

This image is a graph visualization of transportation routes and stores clustered by regions. Different colors indicate stores grouped by province: green for Jiangsu, red for Anhui, and black for Zhejiang. The graph shows connections between stores, indicating common delivery routes. The circled areas with dashed lines represent key clusters of stores, and solid arrows point to corresponding store groups in the legend, helping visualize how the network can be optimized for regional deliveries.
Network Analysis using Python’s Networkx — (Image by Author)

Using network analysis, we can scrutinize the suppliers' sustainability KPIs (ESG scores, for instance) and their connections.

If nodes in the network have questionable sustainability practices, this could imply potential greenwashing.
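
As a minimal sketch, the snippet below stores a supplier network in a plain Python dictionary and walks it tier by tier to flag suppliers with a low ESG score. The company names, scores and threshold are hypothetical. A library such as NetworkX would give you the same traversal plus visualisation on larger graphs.

```python
# Supplier network as a plain adjacency dict (buyer -> direct suppliers).
# Company names, ESG scores and the threshold are hypothetical.
from collections import deque

suppliers = {
    "BrandCo": ["AssemblerA", "AssemblerB"],
    "AssemblerA": ["ChipMaker", "PlasticsInc"],
    "AssemblerB": ["PlasticsInc"],
    "ChipMaker": ["MineCorp"],
    "PlasticsInc": [],
    "MineCorp": [],
}
esg_score = {
    "AssemblerA": 72,
    "AssemblerB": 65,
    "ChipMaker": 58,
    "PlasticsInc": 35,
    "MineCorp": 20,
}


def risky_suppliers(graph, scores, root, threshold=40):
    """Breadth-first search returning {supplier: tier} for scores below threshold.

    A supplier without a known score is treated as risky (score 0).
    """
    tiers = {root: 0}
    queue = deque([root])
    flagged = {}
    while queue:
        node = queue.popleft()
        for supplier in graph.get(node, []):
            if supplier in tiers:
                continue  # already reached through a shorter path
            tiers[supplier] = tiers[node] + 1
            if scores.get(supplier, 0) < threshold:
                flagged[supplier] = tiers[supplier]
            queue.append(supplier)
    return flagged


flagged = risky_suppliers(suppliers, esg_score, "BrandCo")
for name, tier in flagged.items():
    print(f"{name}: tier {tier} supplier, ESG score {esg_score[name]}")
```

In this made-up network, the direct suppliers look fine and the questionable practices only appear at tiers 2 and 3, which is why the connections matter as much as the individual scores.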

This gives you an initial understanding of how advanced analytics can automatically detect greenwashing and fraud from publicly available data.

Conclusion

The End of Greenwashing?

As we consider future ESG regulations, the connection between greenwashing and data analytics is set to deepen significantly.

This image shows a high-level overview of ESG (Environmental, Social, Governance) reporting. Three sections are represented: the environmental section shows an icon representing the planet, the social section with an icon representing people, and the governance section with an icon of a building representing corporate governance. The ESG ($) indicator is shown in the middle, implying financial significance or considerations in relation to ESG factors.
Example of Reporting Categories — (Image by Author)

An ESG report is a non-financial report in which organisations communicate their environmental performance (E), social responsibility (S) and the strength of their governance structures (G) to stakeholders and financial organizations.

Why is it important?

As customer and investor awareness of sustainability grows, corporations will find it risky to hide behind vague or misleading sustainability claims.

Therefore, greenwashing will face significant challenges in an increasingly data-driven world.

Can we use data to actually support the green transformation?

Data Analytics for Footprint Reduction

Instead of building false claims, companies can use advanced analytics to design and implement initiatives that will provide concrete results.

For instance, Sustainable Supply Chain Optimization is a data-driven approach combining cost reductions and footprint reduction.

World map displaying sustainable supply chain optimization, comparing cost efficiency and sustainability. The map highlights monthly demand by country, supply capacity at low and high-capacity manufacturing sites, and sustainability metrics, including CO2 emissions, water usage, and energy consumption for each production country. The image emphasizes the need to meet demand at the lowest cost while limiting environmental impact through strategic site selection.
Sustainable Supply Chain Optimization — (Image by Author)

Let’s imagine that your company is producing and selling items worldwide.

Where should you locate your factories and distribution centers?

This is an optimization model that considers:

  • Demand for each market location in (Units/Month)
  • All the potential manufacturing sites with their production costs, environmental footprint (CO2, resources), ESG scores
  • Constraints on environmental footprint per unit, social and governance scores

What is the most sustainable (and economically viable) combination?

  • To minimize costs if you want to focus on profitability
    Can we respect environmental targets?
  • To minimize CO2 emissions if you want to focus on sustainability
    Can we keep our profitability level?
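
The two questions above can be illustrated with a tiny brute-force sketch: each market is served by one site, and we enumerate every combination. All demand, cost and CO2 figures are placeholder assumptions. A real model would add capacities and use a linear programming solver, as enumeration only works for very small instances.

```python
# Brute-force site selection: one production site per market.
# All demand, cost and CO2 figures are placeholder assumptions.
from itertools import product

demand = {"Europe": 1000, "USA": 2000}  # units/month
sites = {
    # site: {market: (unit cost in EUR, kg CO2 per unit delivered)}
    "Low-cost plant": {"Europe": (8.0, 5.0), "USA": (9.0, 6.0)},
    "Local plant": {"Europe": (11.0, 2.0), "USA": (12.0, 2.5)},
}


def evaluate(plan):
    """Return (total cost, total kg CO2) of a {market: site} plan."""
    cost = sum(sites[s][m][0] * demand[m] for m, s in plan.items())
    co2 = sum(sites[s][m][1] * demand[m] for m, s in plan.items())
    return cost, co2


def best_plan(objective, max_co2_per_unit=None, max_cost=None):
    """Minimize 'cost' or 'co2' over all plans that respect the constraints."""
    markets = list(demand)
    total_units = sum(demand.values())
    best = None
    for choice in product(sites, repeat=len(markets)):
        plan = dict(zip(markets, choice))
        cost, co2 = evaluate(plan)
        if max_co2_per_unit is not None and co2 / total_units > max_co2_per_unit:
            continue  # violates the environmental target
        if max_cost is not None and cost > max_cost:
            continue  # violates the budget
        value = cost if objective == "cost" else co2
        if best is None or value < best[0]:
            best = (value, plan, cost, co2)
    return best


# Focus on profitability while respecting an environmental target.
cost_first = best_plan("cost", max_co2_per_unit=4.0)
# Focus on sustainability while keeping the costs under a budget.
co2_first = best_plan("co2", max_cost=30_000)

print("Min cost:", cost_first[1], f"{cost_first[2]:.0f} EUR, {cost_first[3]:.0f} kg CO2")
print("Min CO2: ", co2_first[1], f"{co2_first[2]:.0f} EUR, {co2_first[3]:.0f} kg CO2")
```

The two objectives lead to different networks, which shows the trade-off that the initiative has to resolve with actual data.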

💡 Follow me on Medium for more articles related to 🏭 Supply Chain Analytics, 🌳 Sustainability and 🕜 Productivity.

About Me

Let’s connect on LinkedIn and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.

For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

If you are interested in Data Analytics and Supply Chain, look at my website.

💌 New articles straight in your inbox for free: Newsletter
📘 Your complete guide for Supply Chain Analytics: Analytics Cheat Sheet