Photo: Nashville, United States, by @tannerboriack

The Lack of Data on Data Centres

Alex Moltzau
DataSeries
9 min read · Mar 9, 2020

Why is it so hard to find reliable data on data storage and emissions?

When looking for numbers on the data centre industry, it is hard to know where to start if you are unfamiliar with it. Data centres have increasingly been labelled a ‘dirty secret’ by large magazines, as Naomi Xu Elegant did in Fortune in 2019 in her article The Internet Cloud Has a Dirty Secret. Yet should we be surprised? Is it not strange that we expect our data to disappear into the ‘cloud’? We seem to conceive of the cloud as something immaterial, almost metaphysical, when of course it is not.

“Data centers contribute 0.3% to global carbon emissions, according to Nature; the ICT sector as a whole contributes over 2%, and those numbers could increase. The U.S. is home to 3 million data centers, or roughly one for every 100 Americans.” (Elegant, 2019)

Less elegant was the way data centres were portrayed by film and TV writer Beth Webb in her recent BBC documentary, Dirty Streaming: The Internet’s Big Secret, which launched on Thursday 5 March 2020. It became such a big event that an online industry magazine saw fit to respond, contesting the statements made in the documentary. Webb described a vast network of energy-guzzling data centres and undersea cables (BBC, 2020).

The question is: what did the industry magazine contest?

Below I attempt to list the claims made by the data centre industry in response:

  • “…the assertion that data centres depend on fossil fuels. In fact, the ICT sector is world-leading in its adoption of renewables.”
  • “Greenpeace has been monitoring this closely in its clickclean campaign. In the UK 76.5% of the electricity purchased by our commercial data centre operators is 100% certified renewable, and a further 10% is purchased according to customer requirement, which increasingly means renewable, taking that total up.”
  • Fryer added that Google is the world’s largest purchaser of renewable power, meeting its requirements through power purchase agreements.
  • “At a UK-level, our energy data from the climate change agreement shows an incremental increase in energy consumption from 2.57 TWh to 2.89 TWh between 2016 and 2018.”
  • “Infrastructure efficiency has improved by 16% since 2014. At the same time improvements in hardware, in software and in utilisation have massively increased productivity. The energy needed to process a given amount of data has reduced by around seven orders of magnitude over the last three decades.”
  • “The suggestion that data centres are secretive about power use is also inaccurate. Consolidating IT activity into purpose-built facilities improves both transparency and efficiency,” added Fryer.
  • “The UK commercial sector monitors and reports its energy consumption at the sector level. Energy consumption is measured, audited and publicly reported at regular intervals in our CCA reports.”
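The ‘incremental increase’ quoted above can be put into a percentage with a little arithmetic. Below is a minimal sketch in Python that uses only the two TWh figures from the quote; the function name `percent_change` and the variable names are my own and purely illustrative.

```python
def percent_change(start: float, end: float) -> float:
    """Relative change from start to end, expressed as a percentage."""
    return (end - start) / start * 100.0


# UK figures quoted from the climate change agreement data, in TWh.
uk_2016_twh = 2.57
uk_2018_twh = 2.89

increase = percent_change(uk_2016_twh, uk_2018_twh)
print(f"Increase from 2016 to 2018: {increase:.1f}%")
```

That works out to roughly 12.5% over two years. Whether that counts as incremental is a judgement the quote leaves to the reader.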

Apparently the industry is transparent. Notice, however, that the response is geared towards local consumption in the United Kingdom. As I have written previously, consumption varies greatly by region (Lacoste et al., 2019). So yes, there is a question of local responsibility, but consumption patterns are seldom local. The claim that regional actors are taking responsibility is likely to help little if the emissions from the usage are exported or not tracked at a national level.

(Lacoste et al., 2019)

We could, of course, ask a few questions about this.

  1. Steven Fryer speaks at the UK level. Are there any numbers for the data flowing out of the United Kingdom?
  2. What percentage is this of the total amount of data flowing within the United Kingdom?
  3. If a percentage is exported or processed outside the United Kingdom, how could we know?
  4. Even if we knew how much energy use was exported, the numbers are calculated only at the sector level, so how would we know the energy mix used in each separate case?

These are preliminary questions, yet they give us an indication of the difficulty posed when claims are made. That is not to say that every data centre behaves the same way; any blanket statement about responsibility would be highly untrue. How could it be otherwise? There must at least be some variation between different data centres. Then again, data centres may have to realise that they are operating in an extractive space, relying heavily on mining and transportation. Of course, this should not be placed squarely on the shoulders of the data centre industry (or sector); it must lie with overall consumption patterns, as mentioned in the counter-statement made by Fryer.

Then again, are some companies subversive in their operations? If yes, then who?

One case would be the Amazon Atlas release on WikiLeaks, which showed that Amazon’s data centres around the world often operate covertly. It listed one hundred data centres spread across fifteen cities in nine countries. To accompany this document, WikiLeaks also created a map showing where Amazon’s data centres are located.

Amazon, which is the largest cloud provider, is notoriously secretive about the precise locations of its data centres. It seems that Amazon either operates out of data centres owned by other companies, with little indication that Amazon itself is based there, or runs its own data centres under less-identifiable subsidiaries with names such as VaData, Inc. In some cases, Amazon uses pseudonyms to obscure its presence (WikiLeaks, 2018).

For this reason one can at least say that some data centres operate in a secretive manner. With such compelling evidence, it is rather hard to argue that the industry is transparent when its largest cloud provider is known for this modus operandi. This way of working is mired in secrecy and subversive methods ingrained in international networks. When it takes investigative journalism or a natural disaster to discover who owns a data centre, well, it is very secretive indeed.

“While Amazon’s cloud is comprised of physical locations, indications of the existence of these places are primarily buried in government records or made visible only when cloud infrastructure fails due to natural disasters or other problems in the physical world.”

One could argue that this secrecy was due to Amazon’s bids for contracts within the defence industry, yet much can be argued in the name of security. An example from a state actor would be the National Security Agency’s data centre in the middle of the Utah desert, as revealed by Edward Snowden a while back.

“…frequent problems with power and water usage came out, the choice seemed even weirder and more arbitrary. Why put the infinite archive of state surveillance in a place so vulnerable to drought?” (Burrington, 2015)

If that is the case, then it might be a dirty secret, or at the very least just a secret. Admitting that there are secrets seems appropriate, yet striving for transparency is still admirable, and the technology industry is capable of it.

The large actors do publish reports on data centres, including environmental reports. When the question of data centre electricity use and emissions was asked, an answer came from George Kamiya of the Strategic Initiatives Office, International Energy Agency, France.

He provided the following list of estimates for 2018.

“Here are 2018 estimates for electricity and/or GHG:

Globally, our (IEA + Northwestern University) estimate is ~200 TWh for all data centres in 2018. We did not estimate GHG.

The most rigorous analysis in peer-reviewed literature is from Malmodin & Lunden (2018), with a global estimate of 245 TWh (which includes enterprise networks, which the IEA estimate above excludes), emitting 160 Mt CO2e.

Note that none of the above values include cryptocurrency mining, which would probably add another 50 TWh in 2018 (~10–20 Mt CO2e).”
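To get a feel for these figures, one can divide the quoted emissions by the quoted electricity use to get an implied carbon intensity. The sketch below is my own back-of-the-envelope calculation, not something Kamiya provided. The function name `implied_intensity_g_per_kwh` is hypothetical, and it assumes the Mt CO2e and TWh figures cover the same scope. That may not hold, since the 160 Mt estimate could include more than operational electricity.

```python
def implied_intensity_g_per_kwh(mt_co2e: float, twh: float) -> float:
    """Grams of CO2e per kWh implied by a total in Mt CO2e and a total in TWh."""
    grams = mt_co2e * 1e12  # 1 Mt = 1e6 tonnes = 1e12 grams
    kwh = twh * 1e9         # 1 TWh = 1e9 kWh
    return grams / kwh


# Figures as quoted: Malmodin & Lunden (2018) and the cryptocurrency range.
data_centres = implied_intensity_g_per_kwh(160, 245)
crypto_low = implied_intensity_g_per_kwh(10, 50)
crypto_high = implied_intensity_g_per_kwh(20, 50)

print(f"Data centres (implied): {data_centres:.0f} g CO2e/kWh")
print(f"Cryptocurrency mining (implied): {crypto_low:.0f} to {crypto_high:.0f} g CO2e/kWh")
```

The data centre figure comes out at roughly 650 g CO2e per kWh and the cryptocurrency range at roughly 200 to 400 g CO2e per kWh. The gap between the two suggests that the estimates rest on different assumptions about scope or energy mix, which is part of why comparable data is hard to find.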

His big question was whether these account for LCA and production. Life cycle assessment (LCA) is a technique for assessing environmental loads of a product or a system.

LCA is most often mentioned in connection with manufactured products. For data centres it would stretch across the wide array of products used to build them, covering production and construction as well as operations. If one were to consider all the servers, the cooling elements, aerosols and a series of other aspects, the result would certainly differ from the status quo in some respects (if I understand Kamiya correctly). One cannot, of course, hold Kamiya to account, as this is a forum post from a few months back. Still, it raises the question of what could be improved if an actor were to do an LCA from cradle to grave:

  1. Environmental impacts are assessed from raw material extraction and processing (cradle),
  2. through the product’s manufacture, distribution and use,
  3. to the recycling or final disposal of the materials composing it (grave)

As he mentioned, this would not account for production either. Then again, the large technology actors are attempting to increase their control over production, yet in the process they are finding that it is not a straightforward area to operate in. In December 2019 an article appeared in The Guardian titled: Apple and Google named in US lawsuit over Congolese child cobalt mining deaths.

“Apple, Google, Dell, Microsoft and Tesla have been named as defendants in a lawsuit filed in Washington DC by human rights firm International Rights Advocates on behalf of 14 parents and children from the Democratic Republic of the Congo (DRC).” (Kelly, 2019)

Taking responsibility matters because, once we move beyond our screens, a great deal of responsibility follows from the technological infrastructure, its life cycle and its production. Film and TV writer Beth Webb is correct in her statements to some degree, and so is Fryer. Some actors in the data centre industry are attempting to take responsibility, yet we have to move beyond the national level and into the global trade network if we are to address this properly. Perhaps we must also begin to consider how to reduce data use, from both a private and an industry perspective. That means being as specific as possible and conserving energy, although it may be challenging in practice.

References:

BBC. (2020, March 5). Dirty streaming: The internet’s big secret. Retrieved March 9, 2020, from https://www.bbc.com/news/av/stories-51742336/dirty-streaming-the-internet-s-big-secret

Burrington, I. (2015, November 19). A Visit to the NSA’s Data Center in Utah. Retrieved March 9, 2020, from https://www.theatlantic.com/technology/archive/2015/11/a-visit-to-the-nsas-data-center-in-utah/416691/

Elegant, N. X. (2019, September 19). The Internet Cloud’s Dirty Secret: It Consumes Tons of Energy, Has Large Carbon Footprint. Retrieved March 9, 2020, from https://fortune.com/2019/09/18/internet-cloud-server-data-center-energy-consumption-renewable-coal/

Kelly, A. (2019, December 16). Apple and Google named in US lawsuit over Congolese child cobalt mining deaths. Retrieved March 9, 2020, from https://www.theguardian.com/global-development/2019/dec/16/apple-and-google-named-in-us-lawsuit-over-congolese-child-cobalt-mining-deaths

Lacoste, A., Luccioni, A., Schmidt, V., & Dandres, T. (2019). Quantifying the Carbon Emissions of Machine Learning. arXiv preprint arXiv:1910.09700.

WikiLeaks. (2018, October 11). Amazon Atlas. Retrieved March 9, 2020, from https://wikileaks.org/amazon-atlas/

This is #500daysofAI and you are reading article 279. I am writing one new article about or related to artificial intelligence every day for 500 days. My focus for days 200–300 is national and international strategies for artificial intelligence. I have decided to spend the last 25 days of my AI strategy writing on the climate crisis.

Alex Moltzau
DataSeries

Policy Officer at the European AI Office in the European Commission. This is a personal blog and does not represent the views of the European Commission.