The Real Cost of AI

And we are just at the beginning

Sid
4 min read · Dec 19, 2023

ChatGPT drew over 1.7 billion visits in October 2023, just a year after its release. Its astronomical growth rate, combined with the arrival of competitors such as Google Bard and Amazon Q, will only accelerate AI usage as a whole.

Source: Our World in Data

Furthermore, GPU producers such as Nvidia are expected to sell over half a million GPUs for artificial intelligence usage alone in 2023.

However, the use of AI does not come without its drawbacks.

A PhD candidate at Vrije Universiteit Amsterdam predicted that AI servers could consume around 85 TWh of energy annually, comparable to Finland's entire annual electricity consumption.

And that's just a middle-ground scenario. In his peer-reviewed study, Alex de Vries estimates that global energy consumption by AI servers could reach up to 134 TWh annually, a demand similar to that of Argentina or the Netherlands.

The consumption of energy has only been amplified with the gradual transition from narrow (focused on a single task) to general (can perform a broad range of tasks) AI.

Source: Hugging Face / Carnegie Mellon University

Furthermore, the more complicated a request, the longer servers work to fulfill it, and the more power is consumed.

Source: Carnegie Mellon University

On average, 1,000 queries to a typical image generation model consume as much energy as driving a car 15 km.
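Taking the article's comparison at face value, a rough back-of-envelope sketch gives a per-query figure. The 0.2 kWh/km assumption (roughly a typical electric car's consumption) is mine, not from the article:

```python
# Back-of-envelope: per-query energy of an image generation model,
# using the article's "1,000 queries ~= driving a car 15 km" claim.
# The 0.2 kWh/km value (typical EV consumption) is an assumption.

KWH_PER_KM = 0.2     # assumed car energy use per km
QUERIES = 1_000
DISTANCE_KM = 15

total_kwh = DISTANCE_KM * KWH_PER_KM          # energy for the 15 km drive
wh_per_query = total_kwh * 1_000 / QUERIES    # spread evenly across queries

print(f"~{total_kwh:.1f} kWh per 1,000 queries, ~{wh_per_query:.1f} Wh per query")
```

Under those assumptions, each image works out to a few watt-hours, which is small individually but large at the scale of millions of daily requests.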

In an interview with The Verge, de Vries stated that

“if you’re going to be expending a lot of resources and setting up these really large models and trying them for some time, that’s going to be a potential big waste of power.”

It isn't just running AI models that consumes terawatt-hours of energy: cooling the servers can increase energy demand by as much as 50% in a worst-case scenario.
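To see how that overhead compounds, here is a minimal sketch that applies the 50% worst-case cooling figure to the mid-range 85 TWh server estimate. Combining the two numbers this way is an illustrative assumption on my part, not a claim from either source:

```python
# Sketch: scaling total demand by a cooling overhead, in the spirit of a
# PUE-style calculation. Pairing the article's 85 TWh server estimate with
# its 50% worst-case cooling figure is an illustrative assumption.

it_load_twh = 85.0        # mid-range annual AI-server estimate (article)
cooling_overhead = 0.50   # worst-case cooling share of IT load (article)

total_twh = it_load_twh * (1 + cooling_overhead)
print(f"~{total_twh:.1f} TWh total once cooling is included")
```

Even a modest cooling overhead, applied to a load this size, adds tens of terawatt-hours per year.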

Servers generate a lot of heat that cannot be removed by simply installing heatsinks — they require active cooling methods, such as computer room air conditioning (CRAC).

CRAC units, while adequately performing their function, are also extremely inefficient and have a slow response time, causing uncontrolled energy consumption — and this is a problem shared by a large percentage of current server cooling methods.

Change is also happening at a relatively slow rate compared to the increase in energy demands. Furthermore, firms often put off upgrading their cooling infrastructure, choosing to delay the expensive initial investment at the cost of higher energy consumption and bills.

The rate at which AI is developing is also a cause for concern when it comes to its energy consumption. OpenAI started the AI craze just over a year ago by releasing ChatGPT; today's most recent models make it look like little more than a hobbyist's side project.

Gemini, released just a few weeks ago, is Google's flagship LLM, projected to be integrated into almost all of the company's products.

Google DeepMind CEO Demis Hassabis described Gemini as "substantially ahead" of GPT-4, with the model outperforming its competition in 30 of 32 benchmark tests.

Source: Google

Due to its nature as a general model, Gemini accepts inputs of many forms, including text, images, video, audio, and code.

However, due to its inherent multimodality, Gemini's energy consumption is expected to be a cause for concern, even though Google has stated that it will attempt to optimize the model's energy efficiency.

Initially training the model on its corpus of mixed-format media (which could be as large as 50 petabytes) and continuously retraining it to account for real-time information requires substantial amounts of energy.

If that wasn't enough, each request could use up to 300 joules of energy, a figure that adds up quickly given the expected tens of millions of daily requests.

Note: Corpus size and energy consumption stats are assumptions based on similar models (LaMDA, PaLM, and WuDao 2.0).
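Those per-request joules can be converted into more familiar units. The sketch below uses the article's assumed 300 J/request and takes 10 million requests per day as the lower bound of "tens of millions":

```python
# Rough scale of per-request serving energy: 300 J/request (the article's
# assumed figure) across an assumed 10 million requests per day.

JOULES_PER_REQUEST = 300
REQUESTS_PER_DAY = 10_000_000   # lower bound of "tens of millions" (assumed)
J_PER_KWH = 3.6e6               # joules in one kilowatt-hour

daily_kwh = JOULES_PER_REQUEST * REQUESTS_PER_DAY / J_PER_KWH
annual_mwh = daily_kwh * 365 / 1_000

print(f"~{daily_kwh:.0f} kWh/day, ~{annual_mwh:.0f} MWh/year")
```

Note that this covers serving requests only; the training and retraining costs described above come on top of it.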

While many firms are transitioning to more efficient server architectures and energy sources, there are two main issues.

  1. AI's energy demands are growing far faster than advancements in server architecture can reduce them.
  2. Switching to renewable energy sources won't solve the problem. The power output of wind farms, solar panels, and the like is unreliable or too low, while more dependable options that generate enough power (e.g., geothermal) carry high start-up costs.

However, this trajectory is not guaranteed, with researchers Desislavov, Martínez-Plumed, and Hernández-Orallo stating that

“[Even though] the growth is still clearly exponential for the models with high accuracy, for a sustained increase in performance we see a much softer growth in energy consumption than previously anticipated”.
