Evaluating the carbon footprint of a software platform hosted in the cloud

Benjamin DAVY
Dec 15, 2020 · 20 min read
Photo by Taylor Vick on Unsplash

This is a practical attempt to estimate CO2 emissions for our platform. In short, we discovered that it’s a fascinating but complex topic. The industry is at a very early stage and environmental impact isn’t consistently reported by providers today.

Read also: Estimating AWS EC2 Instances Power Consumption

In this article, we first describe our context (Chapter 1) and share simple steps that helped us to get a better grasp of the physical reality of our virtualized infrastructure (Chapter 2).

Then, we dig deeper into:

  • How to estimate the environmental impact of industrial activities and how existing methodologies apply to information technologies (Chapter 3)
  • Where research on cloud data centers’ energy consumption stands, and whether there are tools and solutions we can use to measure it (Chapter 4)
  • What it would take to build a proper estimation for our AWS infrastructure, based on available data (Chapter 5)

This is a long journey, so let’s start.

1 — Our Stack

The illustration below lists our main technology providers. We focus on the main components of our advertising service and have grouped our providers into three categories:

We are excluding internal IT tools and services as well as direct value chain partners: demand-side platforms used by Brands and Agencies to program their campaigns and Publisher infrastructure where ads are displayed.

There is a lack of carbon emission reports for customers. This is starting to change with Microsoft announcing a Sustainability Calculator, a dashboard monitoring the emissions for its Azure customers.

We did receive reports from some of our providers but couldn’t use them as-is. The reported activities were not equivalent, so we were comparing apples and oranges. Some major emission sources were also not reported at all, namely the emissions from manufacturing the hardware involved in the service (we will discuss this in Chapter 3).

2 — First analysis with costs and resources

One of our focuses has always been to improve the overall efficiency of our platform to support our growth.

To make sure we are on track, we monitor the cost of our infrastructure according to meaningful business metrics. Specifically, we keep an eye on the infrastructure cost per ad impression (in blue):

Evolution of the infrastructure cost per impression (in blue) compared to the growing number of impressions (in grey) for the past three years 👏 to the Teads Engineering team

While optimizing computing resources certainly makes sense from a financial point of view, we wanted to know more about the environmental impact of our activities.

We are convinced that to be able to take action, we have to better understand our emissions and where they come from.

Knowing that AWS is our main provider, we tried to translate available billing reports into data that reflect the physical reality.

We analyzed our EC2 usage over time (vCPUs and RAM quantity per day). We used the EC2 Running Hours metric from the AWS Cost Explorer and instance data.

Teads EC2 resources usage over time (vCPUs and RAM quantity per day) — April 2020 to September 2020

This first step helped us realize the actual size of our AWS infrastructure.
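As a rough illustration of this translation step, the sketch below converts daily running hours per instance type into average vCPU and RAM quantities. The specification table is a small illustrative subset, and the function name and input shape are assumptions, not the format of an actual Cost Explorer export.

```python
# Illustrative subset of EC2 instance specifications: (vCPUs, memory in GiB).
INSTANCE_SPECS = {
    "m5.xlarge": (4, 16),
    "c5.2xlarge": (8, 16),
    "r5.large": (2, 16),
}


def daily_resources(running_hours):
    """Turn {instance_type: running hours over one day} into the average
    number of vCPUs and GiB of RAM in use during that day."""
    vcpu_hours = 0.0
    gib_hours = 0.0
    for instance_type, hours in running_hours.items():
        vcpus, memory_gib = INSTANCE_SPECS[instance_type]
        vcpu_hours += vcpus * hours
        gib_hours += memory_gib * hours
    # Dividing resource-hours by 24 gives an average quantity over the day.
    return vcpu_hours / 24, gib_hours / 24
```

For example, two m5.xlarge instances running all day (48 running hours) plus one r5.large running all day (24 running hours) average out to 10 vCPUs and 48 GiB of RAM.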

👉 We have put together a file with AWS instance specifications if you want to try.

Using this we can now analyze how these resources scale compared to our activity:

vCPU footprint per impression (in blue) compared to the total number of impressions (in grey) — April 2020 to September 2020

We can see that the vCPU usage per impression is decreasing, but not as much as the cost per impression over the same period. This can be explained by a growing usage of spot instances.

This demonstrates that we cannot rely on economic proxies because pricing models may distort the emission reality. Next, let’s review the existing methodologies to perform an environmental impact analysis.

3 — How to estimate the environmental impact of information systems

The frameworks below describe how to manage, report and verify greenhouse gas emissions and removals for an organization. Basically, they help to inventory all required activities and estimate their impact. We list three of them here (the list is not exhaustive):

  • ISO 14064 is the international standard and a reference when it comes to greenhouse gas (GHG) emission reporting. It was created to standardize practices and certify the results. It can build upon the accounting methodologies below.
  • Bilan Carbone is the French accounting methodology, compliant with ISO 14064. It was created by the French Agency for Ecological Transition (ADEME).
  • Greenhouse Gas Protocol was created by the World Resources Institute (WRI) and the World Business Council for Sustainable Development. The GHG Protocol works with NGOs and governments to build a credible and efficient GHG accounting methodology.

Even if they seem to use the same approach overall, there are subtle differences in the way they report emissions. For example, some emissions can be reported on a voluntary basis in the GHG Protocol whereas they are mandatory in other methodologies.

A quick comparison is available in the Bilan Carbone methodological guide ¹ (see p.30, in 🇫🇷). These differences explain why we have issues figuring out how to compare or aggregate impact assessments from different organizations.

Regardless of methodology, the first practical step is to define the scope of our study.

The organizational boundary of the assessment is in our case a software platform. Then, we need to define the operational boundary, meaning the emission sources associated with our platform.

The previously listed methodologies define three emission categories, also called Scopes, that group emissions across the value chain:

Overview of GHG Protocol scopes — Source GHG Protocol

Here are the definitions of Scope 1, 2, and 3 according to the GHG Protocol:

  • Scope 1 emissions are direct emissions from owned or controlled sources.
  • Scope 2 emissions are indirect emissions from the generation of purchased energy.
  • Scope 3 emissions are all indirect emissions (not included in scope 2) that occur in the value chain of the reporting company, including both upstream and downstream emissions.

The “Bilan Carbone” methodology states that our operational boundary must be exhaustive and reflect the physical reality. As a result, all emissions needed to run our service must be included in the assessment.

At Teads, all of our processing and networking infrastructure is externalized, hence falling under Scope 3, with Scope 1 & 2 being mostly Teads offices & in-house IT equipment. Limiting our estimation to Scope 1 & 2 would make no sense.

There is a really great paper from David Mytton ² that explains it clearly:

“If outsourced to the cloud, IT emissions previously accounted for under the Greenhouse Gas Protocol Scope 1 — emissions that are directly linked to the activities of an organization from sources that it owns and control — and Scope 2 — emissions from the generation of purchased energy — move to Scope 3, referring to all other emissions.”

Even if that might seem like common sense, Scope 3 isn’t always reported, depending on the applicable carbon accounting legislation.

We must be careful with “Carbon-free” or “Carbon transparent” statements that are only considering parts of the emission spectrum. For example, it’s worth remembering that renewable energy power plants are only “carbon-free” during a part of their lifecycle.

On this topic, the Net Zero Initiative ³ reminds us about what global carbon neutrality is:

“Science defines global carbon neutrality as a balance between anthropogenic CO2 emissions and anthropogenic CO2 removals. Removing as much CO2 annually as the emissions that are produced is the only way to stop the build-up of CO2 in the atmosphere.”

As a result, it can be tempting to play with Scopes to achieve some kind of neutrality on a smaller perimeter. We have to keep that in mind when looking at third party reports.

Once we have defined our perimeters we need to collect data describing the activities we want to assess. This is where emission factors come into play.

An Emission Factor is a coefficient that converts activity data into greenhouse gas emissions. The result is aggregated as CO2 equivalent, or CO2eq. Long story short: without emission factors, we cannot build a robust assessment.

Global Warming Potential (GWP) — Converting 1 kg of GHGs into X kg of CO2 equivalent — Source Clim’Foot European Project

Emission factors databases exist and contain information that can be used for calculations. Each emission factor is usually provided with an uncertainty parameter that helps gauge the quality of the data and if we can safely use it.

We can find free and commercial databases on the GHG Protocol website. Industrial activities are well covered with many domain-specific databases (energy, transport, packaging, etc.). The same cannot be said for IT, and even less so for cloud computing.

If we have a look at the Base Carbone from the French ADEME agency, the database only contains one emission factor from 2014 regarding the embodied emissions from manufacturing a server: 600 kg CO2eq. We have no details of the machine’s specifications and the uncertainty value is quite high at 80%. So that’s not really helping and we would still have to calculate emissions for running this machine.
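To get a feel for what an 80% uncertainty means in practice, here is a minimal sketch. It assumes the uncertainty can be read as a symmetric relative interval around the emission factor, which is a simplification made for illustration and not necessarily how the Base Carbone defines it.

```python
def emission_range(factor_kg_co2eq, relative_uncertainty):
    """Return the (low, high) bounds of an emission factor, assuming the
    uncertainty is a symmetric relative interval (an assumption)."""
    low = factor_kg_co2eq * (1 - relative_uncertainty)
    high = factor_kg_co2eq * (1 + relative_uncertainty)
    return low, high


# Server manufacturing factor quoted above: 600 kg CO2eq with 80% uncertainty.
low, high = emission_range(600, 0.80)
```

Under that reading, the real value could sit anywhere between roughly 120 and 1,080 kg CO2eq, almost an order of magnitude apart.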

Some manufacturers like Dell communicate their product life cycle assessments. This can help but still has limitations that we will discuss later.

For a given emission source, the emission factor can vary depending on several parameters, including where the activity takes place. For electricity-intensive activities like running computing resources, the emission factor is tied to the energy mix used to generate the electricity.

This information is reported as grams of CO₂eq per kWh, 1 kWh being the energy consumed by a 1000 watt device during one hour. For example, on the 6th of November, the energy mix carbon intensity was ~70 gCO₂eq/kWh for the French grid.

We can find this information on the ElectricityMap, an open-source initiative that shows live carbon intensity emissions of electricity by country.
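The conversion from power to energy to emissions can be sketched as follows. The function names are hypothetical; the 70 gCO₂eq/kWh value is the French grid figure quoted above.

```python
def energy_kwh(power_watts, hours):
    """Energy consumed by a device drawing a constant power for a duration."""
    return power_watts * hours / 1000


def emissions_g_co2eq(energy_in_kwh, intensity_g_per_kwh):
    """Emissions for a given energy use and grid carbon intensity."""
    return energy_in_kwh * intensity_g_per_kwh


# A 1000 W device running for one hour on a ~70 gCO2eq/kWh grid.
one_hour_kwh = energy_kwh(1000, 1)
one_hour_emissions = emissions_g_co2eq(one_hour_kwh, 70)
```

The same device running on a grid ten times more carbon-intensive would emit ten times more for the same work, which is why the location of the activity matters.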

Overall, we have found limited off-the-shelf data for the Tech industry. In fact, it seems we are still at a very early stage in this field. Emission factors for such activities are mainly discussed in research papers. A few very specific use cases have been studied (sending an email, etc.), although these figures are regularly debated.

As previously stated, our infrastructure is distributed across several IaaS, PaaS, and SaaS providers. We are mostly paying for virtualized resources and capacity. The direct consequence is that we have no idea of the physical reality of these activities.

A recent study from The Shift Project ⁵ about digital sobriety (in 🇫🇷) gives an overview of how to assess an Information System's environmental impact. The below table summarizes the main emission sources and how they should be accounted for:

Information System Annual Environmental Report methodology — Source: Déployer la sobriété numérique, The Shift Project — October 2020

This gives us an actionable piece of methodology to move forward:

  • They recommend pragmatism when looking at unknown emission factors. It’s not ideal, but when we don’t have precise data we should at least try to estimate the correct order of magnitude of an emission.
  • As for the calculation itself, we should use local electricity emission factors for cloud infrastructure and add compensation measures separately when they exist.
  • We need to consider the whole lifecycle for the equipment we are using with at least embodied emissions (production phase) in addition to actual energy consumption (run phase).

What’s interesting in the above chart is that Scopes are mentioned, confirming the importance of Scope 3 emissions in an information system (in light blue).

4 — What is the state of research on cloud data centers energy consumption & measurement

Green IT, the movement investigating ways to make information technology sustainable, is relatively young. However, it is extremely fertile and there are multiple publications covering how to design efficient software systems to minimize their footprints such as these best practices ⁴ for websites and web app development.

Yet actually measuring software impact still seems to be an unsolved challenge:

  • On Theodo’s blog, Cyrielle Willerval recently explored how to monitor a server’s energy consumption to optimize the impact of a web application. While it’s an interesting method to use when developing or refactoring services, it does not give us the global footprint of a service.
  • We can also list Carbonalyser, a browser extension that estimates the global footprint associated with internet browsing. In that case, the computed value is based on really high-level estimations.
  • While writing this article, we came across Argos, a new initiative that intends to bridge the gaps and estimate the energy footprint of software at the system level (client, server, network, and database). The estimate is limited to watt-hours for now.

Deploying measurement tools on a large scale infrastructure can be a problem. On the other hand, if we want to apply carbon analysis methodologies we face a lack of shared and accepted emission factors.

Several initiatives are on-going to mitigate that, among them:

  • NegaOctet, which is a French research program aiming to define a dedicated methodology and emission factors to evaluate Digital Services impacts. The results of this initiative are highly awaited.
  • Clim’Foot, which is a European research project that searches for the harmonization of carbon accounting practices at the European level. For now, it’s lacking emission factors for the Tech industry.

Next, let’s dig deeper, from high-level footprint calculation methodologies and studies, down to energy consumption measuring tools.

Focusing on cloud providers, we need to know more about where we should look to estimate their impact. This is perfectly described in another paper from David Mytton ⁶ where he explains why calculating emissions from public cloud computing workloads is so difficult today.

The below chart breaks down all the components involved in a cloud service:

Cloud services components — Source: Assessing the suitability of the Greenhouse Gas Protocol for calculation of emissions from public cloud computing workload, David Mytton, August 2020

This chart gives us a detailed map of emission sources and the required data to perform an assessment. This is a great starting point. Unfortunately, the only data that is communicated by vendors is the average PUE for their data centers.

PUE, or Power Usage Effectiveness, is the industry-preferred metric for measuring the infrastructure energy efficiency of data centers. It’s the total annual energy entering the data center building divided by the annual energy consumed to operate the devices of the IT room:

PUE = Total annual facility energy / Annual IT equipment energy
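As a small sketch of this ratio, the code below computes a PUE from two annual energy figures and derives the share of facility energy that does not reach IT equipment. The function names and the sample figures are illustrative assumptions.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power Usage Effectiveness: facility energy over IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value):
    """Share of facility energy spent on cooling, power distribution and
    other non-IT uses, for a given PUE."""
    return 1 - 1 / pue_value


# Hypothetical data center: 1200 kWh entering the building, 1000 kWh used by IT.
example_pue = pue(1200, 1000)
```

A PUE of 2 means that half of the energy entering the building never reaches the IT room, whereas a PUE of 1.2 brings that overhead down to about 17%.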

Apart from the PUE, we can only “guesstimate” the footprint of our infrastructure. Here is a non-exhaustive list of facts that makes this a complex task:

  • The physical location of the infrastructure isn’t precise. For example, AWS has a single region in Virginia but has 55 physical data centers in that geography. This forces us to use energy mix intensity aggregates on a regional level for our calculations.
  • Cloud providers often develop their own custom hardware for which we don’t have any specifications or lifecycle information.
  • We run virtual machines and do not precisely know the corresponding physical server specifications. It gets even harder for serverless services (marginal usage in our case, but still).
  • Each instance family and generation has a specific footprint.
  • VM allocation ⁷ has a significant impact on actual energy consumption.
  • Actual server energy consumption doesn’t scale linearly according to CPU load and requires modeling ⁸.

Precise calculation is at this point impossible to achieve without more data from our providers. The next question is, can we still work on a satisfying estimation? Let’s review some interesting research papers to find out.

According to research from David Guyon ⁹, below is the distribution of the energy consumed in a given data center at the building level, the IT room level, and the Physical Machines level. The data is aggregated from academic papers published respectively in 2012 and 2017.

Energy Footprint of Cloud Computing Systems — Source: Supporting energy-awareness for cloud users, David Guyon, January 2019

The first graph isn’t really relevant in our case as it doesn’t correspond to the hyperscale type of data centers we are using. With a PUE = 2, we are far from the 1.1 to 1.2 figures communicated by our providers.

However, the second graph gives us interesting data regarding the consumption distribution across IT devices, with network devices representing 30% of the total energy consumption in the IT room.

The networking impact (data transfer) outside of the data center has been estimated in several studies. Joshua Aslan et al. ¹⁰ compared existing studies that modeled: “representative estimates for the average electricity intensity of fixed-line Internet transmission networks (from 2000 to 2015)”.

The most recent estimate for the year 2015 is from Jens Malmodin et al. ¹¹ at 0.023 kWh/GB for the IP core network only, excluding data centers and user devices.
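A minimal sketch of how this figure can be applied to a data transfer volume is shown below. The constant comes from the estimate quoted above; the function name and the 1,000 GB example are assumptions.

```python
# Electricity intensity of the IP core network in 2015, as quoted above.
KWH_PER_GB = 0.023


def transfer_energy_kwh(gigabytes, kwh_per_gb=KWH_PER_GB):
    """Estimated network energy for a volume of transferred data."""
    return gigabytes * kwh_per_gb


# Hypothetical volume: 1,000 GB transferred, roughly 23 kWh.
energy_for_1tb = transfer_energy_kwh(1000)
```

Since the figure excludes data centers and user devices, it only covers one segment of the path between a server and an end user.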

Looking at the available publications, we have identified two approaches to estimate energy consumption for servers.

The first option relies on publicly available lifecycle analysis from hardware vendors. This is what the EcoDiag calculator does by giving CO2 equivalent emissions from using IT hardware. EcoDiag includes data for common physical hardware and can be used for on-premise infrastructure analysis.

Following the same approach, we can try to find a physical server that is comparable to an existing instance and use it as a benchmark to estimate impacts for both production and run phases.

According to its life cycle analysis, a standard 1U rack server like the Dell PowerEdge R640 consumes 1760.3 kWh/year and has a manufacturing climate impact of 320 kg CO2eq/year (assuming a four-year life span). Its exact configuration isn’t precisely detailed in the lifecycle sheet, especially the processor type, which has a significant impact on the overall consumption.

We can only assume the analysis was done on the lower-end configuration, which includes two Intel Xeon Silver 4208 processors. This would give us an equivalent of 2*8 physical cores (16 threads per processor, 32 threads in total) running at 2.10 GHz.

On AWS we are using virtualized instances. The EC2 instance specifications detail the amount of RAM and storage; for CPU, AWS uses the vCPU notation. As stated in the documentation: “Each vCPU is a thread of a CPU core, except for T2 instances and instances powered by AWS Graviton2 processors.”

In our example, that would make our Dell R640 a 32 vCPUs equivalent instance although it’s hard to identify a comparable machine in AWS’ offering.
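The arithmetic behind this equivalence can be sketched as follows. The socket and core counts reflect the assumed Xeon Silver 4208 configuration discussed above, and two threads per core assumes Hyper-Threading is enabled; both are assumptions rather than documented facts about the analyzed server.

```python
def vcpu_equivalent(sockets, cores_per_socket, threads_per_core):
    """Number of hardware threads of a physical server, each thread being
    counted as one vCPU (valid for most x86 EC2 instance types)."""
    return sockets * cores_per_socket * threads_per_core


# Assumed Dell R640 configuration: 2 sockets, 8 cores each, 2 threads per core.
r640_vcpus = vcpu_equivalent(sockets=2, cores_per_socket=8, threads_per_core=2)
```

With these inputs the server counts as 32 vCPUs. The rule does not apply to T2 or Graviton2 instances, as the AWS documentation points out.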

While using vendor lifecycle data can help for our own equipment, it’s not robust enough to apply when it comes to cloud infrastructure. If we consider all potential error sources on server specifications, our approximation is likely to be off by a large factor.

Given the variety of VMs at our disposal, another option is to base our estimation on the few pieces of information we have. Apart from memory and storage sizes, AWS communicates on the physical CPU models used in their instances. It’s also possible to identify it when running an instance.

This is important as some new instance generations rely on Arm-based processors that AWS is developing internally (Graviton2) and actual energy consumption differs greatly compared to x86 architectures.

CPU vendors provide thermal design power (TDP) for their products. Recent research from Henderson et al. ¹² commented on the use of GPU’s TDP to roughly estimate the carbon footprint of machine learning activities. It’s a starting point but it’s not precise enough, because the TDP definition isn’t standardized and varies across vendors.

What’s more interesting in Henderson’s paper is the study of Intel RAPL readings. RAPL, or Running Average Power Limit, is a software power meter available on Intel Sandy Bridge architecture or later. Research from Fahad et al. ¹³ has demonstrated a good correlation between RAPL readings and system power meters. This feature has even been tested successfully on EC2 instances by Nizam Khan et al. ¹⁴.

RAPL has evolved over the years and CPU generations. The most recent processors also feature metering for other “power domains” like DRAM and Psys. The latter includes a broader set of components like integrated graphics chips:

RAPL Power Domains — Source: RAPL in Action: Experiences in Using RAPL for Power Measurements, Nizam Khan et al.

Limitations:

  • This technique only covers the run phase and we still have to estimate emissions from manufacturing.
  • RAPL readings might not be reliable to profile a VM as it reads the consumption at the processor level and not at the core or thread-level (vCPU). The same physical resources are shared between different users in a virtualized cloud environment and there is an expected impact of other co-running user instances on the overall power consumption and load of the system. Having co-running users could be seen as an issue for determining a precise software footprint but in our case, it’s simply a direct impact from running in the cloud. We can accept this limitation as part of the physical reality of our infrastructure.
  • This approach is CPU-centric and needs to be extrapolated to the overall system to be used in a carbon footprint analysis.

If we have no other option, we can still try to use RAPL readings on bare metal instances to build energy consumption profiles for EC2 resources. Emerging projects like Scaphandre & CodeCarbon rely on RAPL and can help to do this.
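For readers who want to see what a raw RAPL reading looks like, here is a minimal sketch based on the Linux powercap interface, which exposes a cumulative energy counter in microjoules. The domain path is an assumption about a typical Linux host (package 0), reading it usually requires elevated privileges, and this is not how Scaphandre or CodeCarbon are implemented.

```python
import time
from pathlib import Path

# Assumed location of the package-0 RAPL domain on a Linux host.
RAPL_DOMAIN = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(domain, name):
    """Read an integer value (microjoules) from a powercap file."""
    return int((domain / name).read_text().strip())


def energy_delta_uj(start_uj, end_uj, max_range_uj):
    """Energy consumed between two counter readings, allowing for at most
    one wraparound of the cumulative counter."""
    if end_uj >= start_uj:
        return end_uj - start_uj
    return end_uj + max_range_uj - start_uj


def average_power_watts(domain=RAPL_DOMAIN, interval_s=1.0):
    """Sample the energy counter twice and return the average power."""
    max_range = read_uj(domain, "max_energy_range_uj")
    start = read_uj(domain, "energy_uj")
    time.sleep(interval_s)
    end = read_uj(domain, "energy_uj")
    # The counter is in microjoules; 1 W is 1 J/s.
    return energy_delta_uj(start, end, max_range) / 1e6 / interval_s
```

Calling `average_power_watts()` on a bare metal instance under a representative load would give one data point of a consumption profile. On a shared VM the counter is often not exposed at all.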

5 — What it would take to build an estimation for our AWS infrastructure

In this section, we detail a way to estimate the impact for the bulk of our AWS infrastructure, based on the available data and research. Our best guess is to use as much data as possible to have an accurate baseline before applying estimation factors.

To do that we need to cover the following emission sources:

  • Emissions from running compute primitives
  • Embodied emissions from the compute hardware manufacturing phase
  • Emissions from running and manufacturing network primitives
  • Emissions from storage primitives

This is a work in progress approach that can be discussed and hopefully made irrelevant once providers are able to provide us with actionable figures. Limitations are detailed for each assumption.

The proposed formula for emissions from running compute primitives is the following:

EC2 running hours * Instance Ratio * Physical Instance Energy Consumption (kWh per running hour) * Region Emission Factors (kgCO2/kWh) * PUE

Below are the assumptions we are making for each metric:

EC2 running hours: Using the AWS Cost Explorer, we do an export for each region to be able to apply local energy mix values. We select all EC2 Running Hours — Usage Type Groups to include the bulk of our infrastructure.

Limitations:

  • Some of this data is reported as No Instance Type. According to Cost Explorer’s documentation, “This category includes costs (e.g., data transfer in/out) that are not directly attributable to a specific Instance Type”. We can assume this would be covered in our calculation for network primitives.
  • Lambda (AWS serverless compute service) and maybe other marginal services are missing from this report (not significant for us).

PUE: Here we use the average communicated by AWS (PUE 1.2), although it might differ depending on regions.

Limitation: This is an average value and we don’t know the actual PUE range.

Instance Ratio: We are trying to find a way to approximate VM consumption based on data from physical servers. So here we assume that, for an average load, an instance type using a portion of the hardware will consume the same portion of power. The vCPU metric is the easiest way to compare instances so our Instance Ratio would be:

Instance type vCPU number / max instance family vCPU number
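The ratio can be sketched as follows. The vCPU table is an illustrative subset of the m5 family, and the function name is hypothetical.

```python
# Illustrative vCPU counts for part of the m5 instance family.
M5_VCPUS = {
    "m5.large": 2,
    "m5.xlarge": 4,
    "m5.2xlarge": 8,
    "m5.24xlarge": 96,
}


def instance_ratio(instance_type, family_vcpus):
    """Share of the largest instance of the family used by an instance type,
    based on vCPU counts only."""
    return family_vcpus[instance_type] / max(family_vcpus.values())


# An m5.xlarge is treated as 4/96 of the underlying physical machine.
ratio = instance_ratio("m5.xlarge", M5_VCPUS)
```

This treats the largest instance of a family as a stand-in for the whole physical server, which is the simplification described above.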

Limitation: some instance families are optimized for computation or memory and a ratio based on vCPU might not be the most relevant. Instance configurations across a family are not always linearly bound to the vCPU and we could integrate the RAM information into that ratio.

Physical Instance Energy Consumption (kWh per running hour): Without more data from providers, we can use data from Intel processors (RAPL) at an average load. The average load of our service can be approximated using CloudWatch metrics.

We can try to profile the consumption of metal or max instances and use this value as a baseline multiplied by the instance ratio. We can also compare this with CPU manufacturer data (TDP) or the SPECpower database.

Then, to convert Intel RAPL readings to a whole system consumption we can use existing estimates. According to David Guyon’s work, CPU + Memory consumption accounts for 55% of a physical server consumption (43% and 12% respectively).
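This extrapolation can be sketched as a simple division. It assumes that the RAPL reading covers both the CPU package and DRAM domains; if only the package domain is available, the 43% share would be the one to use. The 110 W reading is a made-up example.

```python
# Shares of a physical server consumption, from the figures quoted above.
CPU_SHARE = 0.43
MEMORY_SHARE = 0.12


def whole_server_power_watts(rapl_watts, covered_share=CPU_SHARE + MEMORY_SHARE):
    """Extrapolate a RAPL reading to the whole server, assuming the reading
    represents a fixed share of the total consumption."""
    return rapl_watts / covered_share


# Hypothetical reading: 110 W for CPU + DRAM, about 200 W for the server.
server_watts = whole_server_power_watts(110)
```

A fixed share is a strong simplification, since the split between components is likely to change with load.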

Update: We have since studied the RAPL approach and published our results:

Region Emission Factors: We can use average values from official data producers, with the only limitation that, in some cases, we have a 1-year delay. Here are some figures for the countries where our infrastructure and main engineering offices are located:

  • Virginia US eGRID: ~0.335 kgCO2/kWh in 2018 (reported as 739.35 lbCO2/MWh), using the “Virginia” state data on the service
  • Ireland SEAI: 0.375 kgCO2/kWh in 2018
  • Tokyo Bureau of Government: 0.470 kgCO2/kWh in 2017
  • France eco2mix: 0.035 kgCO2/kWh in 2019. It’s interesting to see that we can greatly optimize our infrastructure impact by locating it next to low carbon grids.

The ElectricityMap service can be an interesting alternative for live data.
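Since eGRID reports its figures in pounds per MWh, a unit conversion is needed before comparing regions. A minimal sketch, with a hypothetical function name:

```python
# Exact definition of the international avoirdupois pound in kilograms.
KG_PER_LB = 0.45359237


def lb_per_mwh_to_kg_per_kwh(value_lb_per_mwh):
    """Convert an emission factor from lb/MWh to kg/kWh."""
    return value_lb_per_mwh * KG_PER_LB / 1000


# Virginia figure from eGRID: 739.35 lbCO2/MWh, about 0.335 kgCO2/kWh.
virginia_factor = lb_per_mwh_to_kg_per_kwh(739.35)
```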

Emission compensation: It’s recommended to calculate emissions using local energy mix intensity values and then integrate compensation measures (offsets and renewable energy certificates). We would need more data from our providers to know this precisely.
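Putting the pieces together, the proposed formula for running compute primitives can be sketched as a single function. The energy term is read here as the average energy drawn by the physical machine per running hour, which is an interpretation. The 0.2 kWh per hour value and the monthly usage are hypothetical inputs, and compensation measures are left out.

```python
def compute_run_emissions_kg(
    running_hours,
    instance_ratio,
    physical_kwh_per_hour,
    region_factor_kg_per_kwh,
    pue=1.2,
):
    """Emissions from running an instance type in one region, following the
    formula: hours * ratio * energy per hour * emission factor * PUE."""
    energy_kwh = running_hours * instance_ratio * physical_kwh_per_hour
    return energy_kwh * region_factor_kg_per_kwh * pue


# Hypothetical month: one m5.xlarge (4/96 of a server) running 720 hours in
# Virginia (~0.335 kgCO2/kWh), on a server assumed to draw 0.2 kWh per hour.
monthly_kg = compute_run_emissions_kg(720, 4 / 96, 0.2, 0.335)
```

With these inputs the result is about 2.4 kg of CO2 for the month. The function would be called once per instance type and region, then summed.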

For embodied emissions from the compute hardware manufacturing phase, we can use standard values for both the embodied emissions and the life span: 320 kg CO2eq/year for a standard physical 1U server, assuming a four-year life span. This number needs to be multiplied by both the instance ratio and actual usage (running hours/8760).
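The allocation of manufacturing emissions to an instance can be sketched as follows. The constants are the figures quoted above; the function name and the example inputs are assumptions.

```python
HOURS_PER_YEAR = 8760
# Annualized manufacturing impact of a standard 1U server (four-year life span).
SERVER_EMBODIED_KG_PER_YEAR = 320


def embodied_emissions_kg(
    running_hours,
    instance_ratio,
    server_kg_per_year=SERVER_EMBODIED_KG_PER_YEAR,
):
    """Share of a server's annual manufacturing emissions attributed to an
    instance, based on its size and the time it was running."""
    return server_kg_per_year * instance_ratio * running_hours / HOURS_PER_YEAR


# An instance using half of a server and running all year: 160 kg CO2eq.
half_server_full_year = embodied_emissions_kg(8760, 0.5)
```

Note that this only allocates emissions for the hours we actually use. The manufacturing impact of idle capacity on the provider side is not captured.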

For network primitives, our assumption is that we can use the value from David Guyon’s work, with network equipment accounting for ~30% of a data center IT room footprint. We can also use the 0.023 kWh/GB figure to cover data transfer.

For storage primitives, we still have to look for usable values in kWh/GB of storage.

Takeaways

  • We didn’t think we would have to go this far to get meaningful numbers. But the lack of readily usable and accepted emission factors makes it quite complex to estimate the carbon footprint of software platforms.
  • Things are improving and we are starting to receive some data from our providers, but methodologies are not fully disclosed, which makes it difficult to compare and aggregate. We are lacking true customer reporting standards.
  • We cannot use costs as a proxy since we are billed according to the usage of virtual resources, without considering load and the energy consumption impact. As a result, an idle resource costs the same as an instance running at 100% CPU even if their respective impacts may differ greatly. Pricing models may also distort the emission reality (on-demand resources versus spot resources, EMR markup, etc.).
  • Finally, there is a need for more transparency on life cycle analyses that are produced so that the community can benefit from these efforts. Having a set of consumption profiles in kWh with good confidence intervals for infrastructure primitives would be a game-changer. It would help in performing life cycle analysis and making informed decisions when it comes to software architecture and infrastructure location.

The next step for us will be to apply the calculation proposed in Chapter 5. It will also be interesting to experiment with the new measurement initiatives we identified to build the emission factors we are lacking today.

🙏 Thank you for reading! If you’ve made it this far, please do not hesitate to give us some feedback.

Special thanks to Eric Pantera for his help and support in writing this article.

Bibliography

  1. Bilan Carbone methodological guide v8, Association Bilan Carbone — 2017
  2. Hiding greenhouse gas emissions in the cloud, David Mytton, Nature — July 2020
  3. Net Zero Initiative — A framework for collective carbon neutrality, Carbon 4 — April 2020
  4. Ecoconception Web : les 115 bonnes pratiques, Frédéric Bordage, GreenIT.fr — April 2019
  5. Déployer la sobriété numérique, The Shift Project — October 2020
  6. Assessing the suitability of the Greenhouse Gas Protocol for calculation of emissions from public cloud computing workloads, David Mytton, personal blog — August 9, 2020 — A state of the art of the available data for Cloud customers to calculate their emissions.
  7. An experiment-driven energy consumption model for virtual machine management systems, Mar Callau-Zori et al., INRIA — 2018 — A study illustrating how VM allocation on physical hosts can impact energy consumption.
  8. Energy Measurement and Modeling in High-Performance Computing with Intel’s RAPL, Nizam Khan et al. — 2018
  9. Supporting energy-awareness for cloud users, David Guyon, INRIA Myriads — 2018 — PhD Thesis covering the energy footprint of Cloud Computing systems.
  10. Electricity Intensity of Internet Data Transmission, Joshua Aslan et al., Center for Environmental Strategy, University of Surrey — 2018
  11. The energy and carbon footprint of the ICT and E&M sector in Sweden 1990–2015 and beyond, Jens Malmodin et al., Ericsson Research — 2016
  12. Towards The Systematic Reporting Of The Energy And Carbon Footprints Of Machine Learning, Henderson et al., Stanford — 2020
  13. A Comparative Study of Methods for Measurement of Energy of Computing, Fahad et al., School of Computer Science, UCD — June 2019
  14. RAPL in Action: Experiences in Using RAPL for Power Measurements, Nizam Khan et al. — 2018
