Measuring the Carbon Footprint of Machine Learning: Krallmann’s Carbon Monitor

Jan 16, 2023

This is a blog-post version of a recent live talk by Tibebu Biru in the clean-IT openXchange series. In the talk, Biru introduces the Carbon Monitor for Machine Learning (CMML). The recording can be found in the clean-IT Forum on openHPI.

Introduction

Krallmann AG is passionate about its cleanAI initiatives, and as part of its cleanIT curriculum, it's developing a carbon monitor for machine learning.

In this blog post, we discuss why it is important to measure the carbon footprint of machine learning and which features Krallmann’s carbon monitor will have. We also discuss the three data sources that feed its carbon footprint estimation.

Why Measure Carbon Footprints for Machine Learning?

Machine learning has delivered many benefits, but processing data and training models requires a large amount of energy. To keep machine learning initiatives environmentally sustainable, it is essential to measure and track the carbon footprint of machine learning models. Such measurements make it possible to compare different models by their carbon footprints and to identify optimization opportunities.

Features of Krallmann’s Carbon Monitor

Figure 1: Carbon impact of fine-tuning a BERT model for code classification

Krallmann’s carbon monitor will be Python-based and easy to integrate into existing code bases. It will also be extensible and flexible: users can choose the power measurement tools they want to use and measure the power usage of the hardware in real time. Additionally, the carbon monitor will be able to track energy consumption over time.
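To make the idea of pluggable power measurement and energy tracking over time more concrete, here is a minimal sketch. It is not the CMML API: the class name EnergyMonitor, the read_power_watts callback, and the clock parameter are all hypothetical names chosen for illustration. The sketch assumes that any measurement tool can be wrapped in a function that returns the current power draw in watts, and it integrates those samples into energy with the trapezoidal rule.

```python
import time
from typing import Callable, List, Tuple


class EnergyMonitor:
    """Hypothetical sketch: samples a pluggable power reader and
    integrates the samples into energy. Not the actual CMML interface."""

    def __init__(
        self,
        read_power_watts: Callable[[], float],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # Any power measurement tool can be plugged in, as long as it is
        # wrapped in a function that returns the current draw in watts.
        self._read_power_watts = read_power_watts
        self._clock = clock
        # Each sample is (timestamp in seconds, power in watts).
        self.samples: List[Tuple[float, float]] = []

    def sample(self) -> None:
        """Record one power reading together with its timestamp."""
        self.samples.append((self._clock(), self._read_power_watts()))

    def energy_kwh(self) -> float:
        """Integrate power over time (trapezoidal rule) and return kWh."""
        joules = 0.0
        for (t0, p0), (t1, p1) in zip(self.samples, self.samples[1:]):
            joules += 0.5 * (p0 + p1) * (t1 - t0)
        return joules / 3.6e6  # 1 kWh = 3.6 million joules
```

In a training loop, sample() would be called periodically (for example once per batch or from a background thread), and energy_kwh() would be read at the end. Passing the clock in as a parameter keeps the sketch easy to test with simulated timestamps.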

A similar open-source library that is already available is called CodeCarbon (https://github.com/mlco2/codecarbon).

Carbon Footprint Estimation

The carbon footprint estimation of Krallmann’s Carbon Monitor draws on three data sources:

global energy mix data
country emissions data
cloud emissions data

The global energy mix data and country emissions data will be used to estimate the carbon footprint from hardware energy consumption.
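A rough sketch of how energy consumption might be turned into an emissions estimate is shown below. The talk summary does not say how the two data sets are combined, so the fallback logic here (use the country value when it is known, otherwise the global average) is an assumption, as are the function names. The intensity values are placeholders for illustration only, not real data.

```python
from typing import Mapping, Optional

# Placeholder carbon intensities in g CO2e per kWh.
# These numbers are made up for illustration and are NOT real data.
EXAMPLE_COUNTRY_INTENSITY: Mapping[str, float] = {"AAA": 400.0, "BBB": 50.0}
EXAMPLE_GLOBAL_INTENSITY: float = 500.0


def carbon_intensity(
    country_code: Optional[str],
    country_data: Mapping[str, float] = EXAMPLE_COUNTRY_INTENSITY,
    global_default: float = EXAMPLE_GLOBAL_INTENSITY,
) -> float:
    """Return the country-specific intensity if available,
    otherwise fall back to the global average (assumed behavior)."""
    if country_code is not None and country_code in country_data:
        return country_data[country_code]
    return global_default


def emissions_g(energy_kwh: float, country_code: Optional[str] = None) -> float:
    """Estimate emissions in grams of CO2e: energy times carbon intensity."""
    return energy_kwh * carbon_intensity(country_code)
```

With the placeholder table, 2 kWh consumed in the hypothetical country "BBB" would be estimated at 100 g CO2e, while an unknown location would use the global placeholder value instead.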

The cloud emissions data will be used to estimate the emissions of workloads running on cloud services such as Google Cloud Platform, Amazon Web Services, and Microsoft Azure.

The power usage effectiveness (PUE), which describes the energy efficiency of a data center as the ratio of its total energy use to the energy used by its IT equipment, will also be used in the estimation process.
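Because PUE is the ratio of a data center's total energy use to the energy used by its IT equipment, one common way to apply it is to multiply the energy measured at the hardware by the PUE. The post does not state how the carbon monitor applies PUE, so the following is a sketch under that assumption, with a hypothetical function name.

```python
def facility_energy_kwh(it_energy_kwh: float, pue: float) -> float:
    """Scale measured IT energy by the data center's PUE to approximate
    the total facility energy, including overhead such as cooling.

    Assumption: PUE is applied as a simple multiplier.
    """
    if pue < 1.0:
        # By definition, total energy cannot be less than IT energy.
        raise ValueError("PUE must be at least 1.0")
    return it_energy_kwh * pue
```

For example, 10 kWh measured at the hardware in a data center with a PUE of 1.5 would correspond to roughly 15 kWh at the facility level. A PUE of exactly 1.0 would mean no overhead at all.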

Conclusion

Krallmann’s carbon monitor for machine learning is an important part of the company's cleanAI initiative. Measuring and tracking the carbon footprint of machine learning makes it possible to compare AI models by their carbon footprint and to identify optimization opportunities. The carbon monitor will be Python-based and will offer features such as tracking energy consumption over time. Energy mix, country, and cloud emissions data, together with the PUE, will be used to estimate the carbon emissions from the measured energy consumption.