Code Green: Addressing the Environmental Impact of Language Models

Gil Semo
DarrowAI
May 7, 2024

As we witness the rapid advancements in artificial intelligence, particularly through the development of large language models (LLMs), it’s crucial to also consider their environmental footprint. Recent studies, including the report “Power Hungry Processing: Watts Driving the Cost of AI Deployment” by researchers at Hugging Face and Carnegie Mellon University, have illuminated the substantial energy consumption of AI workloads. The carbon emissions from training GPT-4, for instance, are estimated to be equivalent to driving a gasoline-powered car for nearly 29 million kilometers, roughly 38 round trips between Earth and the Moon (see the calculation later in this post). Figures like these serve as a wake-up call to the hidden environmental costs of our technological advancements.

The evolution of LLMs underscores a critical, yet often neglected concern: their environmental impact. As these models grow in complexity, they require significant computational resources, leading to substantial energy usage and carbon emissions. This issue gains urgency as sustainability and climate action become global priorities. The decisions we make now in developing, deploying, and utilizing AI will shape its ecological footprint, setting the stage for a more sustainable integration of this powerful technology into our lives.

Core Elements of AI’s Energy Consumption

Understanding the carbon footprint of machine learning is crucial for developing more sustainable AI technologies. Several interrelated factors define this footprint, each contributing to the overall energy consumption of machine learning systems; a rough way of combining them into a single emissions estimate is sketched after the list:

Hardware: The efficiency and type of hardware used, including GPUs, are crucial in determining the energy consumption of ML models. While newer, more efficient hardware can offset some energy demands, the high-end requirements for training large models may negate these gains.

Training data: The scale and complexity of datasets directly influence the energy needed for processing. Larger datasets increase the carbon footprint; thus, optimizing dataset size and complexity is essential for maintaining sustainability.

Model architecture: The design of a neural network strongly influences its energy consumption. Larger and more complex networks inherently require more energy, and the longer a model trains, the more energy it consumes. Streamlining model architecture and optimizing training time can therefore yield substantial reductions in energy use without compromising performance.

Location of data centers: The carbon footprint of data centers depends heavily on the source of electricity. Centers using renewable energy sources have lower carbon footprints; choosing locations with access to renewables is a strategic decision for minimizing environmental impact.
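To make the interplay of these factors concrete, here is a minimal sketch of how they can be folded into a single estimate. The function name, the default PUE, and the grid-intensity figure are illustrative assumptions rather than standard values.

```python
# Back-of-the-envelope estimate of training emissions, combining the factors
# above: hardware power draw, training duration (a proxy for data and model
# scale), data-center overhead (PUE), and the grid's carbon intensity.
# All default values are illustrative assumptions, not measurements.

def training_emissions_kg(
    num_gpus: int,
    gpu_power_kw: float,                 # average draw per GPU, in kW
    training_hours: float,               # wall-clock training time
    pue: float = 1.2,                    # data-center power usage effectiveness (assumed)
    grid_kg_co2_per_kwh: float = 0.4,    # grid carbon intensity; varies widely by region
) -> float:
    """Rough training emissions in kg of CO2 from coarse hardware and site factors."""
    energy_kwh = num_gpus * gpu_power_kw * training_hours * pue
    return energy_kwh * grid_kg_co2_per_kwh
```

Moving the same job to a region whose grid emits, say, 0.05 kg CO2/kWh instead of 0.4 shrinks the estimate by roughly a factor of eight, which is why the location of data centers appears on this list at all.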

Case Studies of Energy Consumption

The environmental cost of training and operating large language models (LLMs) becomes strikingly clear when we examine specific case studies that illustrate their significant energy use and carbon emissions.

LLMs’ Environmental Footprint

The training of GPT-4, one of the most sophisticated language models to date, required significant resources, leading to substantial environmental impacts. The carbon footprint generated during its development is comparable to the emissions from driving a gasoline-powered car nearly 29 million kilometers or powering over 1,300 homes for a year.* These comparisons highlight the scale of energy and environmental considerations that come with advanced technological developments.

*Calculation based on 25,000 Nvidia A100 GPUs running for 100 days and consuming a total of 28,800,000 kWh of electricity once data center inefficiencies are included, resulting in approximately 6,912,000 kg of CO2 emissions.
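As a quick sanity check, the footnote’s totals can be reproduced with a few lines of arithmetic. The per-GPU power of 400 W, the PUE of 1.2, and the grid intensity of 0.24 kg CO2/kWh are assumptions inferred from the quoted totals, not published figures.

```python
# Reproducing the footnote's totals. The per-GPU power, PUE and grid
# intensity below are assumptions chosen to match the quoted figures.
gpus, days = 25_000, 100
gpu_kw, pue, kg_co2_per_kwh = 0.4, 1.2, 0.24

energy_kwh = gpus * gpu_kw * (days * 24) * pue   # 28,800,000 kWh
emissions_kg = energy_kwh * kg_co2_per_kwh       # 6,912,000 kg CO2
print(f"{energy_kwh:,.0f} kWh, {emissions_kg:,.0f} kg CO2")
```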

The Llama 3 model, though trained with less energy than GPT-4, still had a notable impact. The total carbon emissions from training both Llama 3 models (8B and 70B) were equivalent to the emissions from powering 452 homes for a year. Additionally, over 2,600 acres of U.S. forest would be needed to sequester the carbon emitted during this training process.

Energy Consumption During Inference

The inference stage of AI, where a trained model responds to new inputs, can consume more energy over a model’s lifetime than the training phase itself: according to Google, about 60% of the total energy consumed in its AI operations is attributed to inference. Operating GPT-3 for inference has been estimated to produce a carbon footprint of around 8.4 tons of CO2 per year. Given the widespread use of models like ChatGPT, which attracted 100 million users within two months of its release, the energy demand becomes substantial: by various estimates, a single ChatGPT query may use roughly 10 to 100 times the energy of a standard Google search, underscoring the high environmental cost of popular AI applications.
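To see how per-query costs compound at this scale, here is a rough sketch. The query volume, per-query energy, and grid intensity are illustrative assumptions, since real figures for ChatGPT are not public.

```python
# Rough scale-up of per-query inference energy to an annual footprint.
# Query volume, per-query energy and grid intensity are assumed values.
QUERIES_PER_DAY = 10_000_000          # assumed daily query volume
WH_PER_QUERY = 3.0                    # assumed energy per query, in watt-hours
GRID_KG_CO2_PER_KWH = 0.4             # assumed grid carbon intensity

daily_kwh = QUERIES_PER_DAY * WH_PER_QUERY / 1_000
annual_t_co2 = daily_kwh * 365 * GRID_KG_CO2_PER_KWH / 1_000

print(f"{daily_kwh:,.0f} kWh/day -> {annual_t_co2:,.0f} t CO2/year")
```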

What Can We Do?

Choosing the Right Model: Not every AI application requires the power of a large language model. Where possible, opt for smaller, more specialized models that can be fine-tuned to meet your needs. This approach conserves energy and computing resources while still delivering high-quality results. Fine-tuning existing models rather than training new ones from scratch can drastically cut down on both carbon emissions and expenses.
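As one way to put this into practice, the sketch below fine-tunes a small pretrained model with a LoRA adapter from the Hugging Face peft library, so only a small fraction of the parameters is trained. The model name, target modules, and hyperparameters are placeholder choices, not a recommendation.

```python
# Minimal sketch: parameter-efficient fine-tuning of a small pretrained model
# with a LoRA adapter, instead of training a large model from scratch.
# Model name, target modules and hyperparameters are placeholder choices.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=8, lora_alpha=16, lora_dropout=0.1,
    target_modules=["q_lin", "v_lin"],   # DistilBERT attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # typically around 1% of the full model
# ...train with your usual Trainer or training loop on the fine-tuning dataset...
```

Because only the adapter weights are updated, both the compute per training step and the storage for the resulting checkpoint shrink accordingly.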

Modifying and Streamlining Model Architecture: Adjusting the architecture of AI models to eliminate redundant parameters and adopt more efficient designs can significantly reduce both energy and storage requirements. For instance, AI21’s Jamba model exemplifies this strategy by achieving a threefold increase in throughput with reduced energy use, showcasing how structural changes can optimize operational efficiency. Similarly, Microsoft’s BitNet b1.58 leverages quantization to drastically lower bit usage per parameter, substantially cutting down energy consumption and carbon emissions without sacrificing performance. These adaptations demonstrate that thoughtful architectural adjustments are key to advancing sustainability in AI development.
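BitNet b1.58’s 1.58-bit weights are a research technique, but the underlying idea of spending fewer bits per parameter can be tried today with off-the-shelf tooling. The sketch below applies PyTorch’s dynamic int8 quantization to a toy network as a generic stand-in, not Microsoft’s method.

```python
# Post-training dynamic quantization: Linear layers are stored and executed
# in int8 instead of float32, cutting memory and CPU inference cost.
# This is a generic PyTorch example, not the BitNet b1.58 technique itself.
import torch
import torch.nn as nn

model = nn.Sequential(                   # toy stand-in for a real network
    nn.Linear(768, 3072), nn.ReLU(), nn.Linear(3072, 768)
)

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 768)
print(quantized(x).shape)                # same interface, cheaper weights
```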

Selecting the Right Hardware: Opt for the appropriate hardware based on your specific AI needs. Not every task requires the most powerful GPU; choosing hardware that matches the demands of your projects can significantly reduce energy consumption and cost.

Algorithm Selection: Optimize your AI’s performance and environmental impact by choosing the right algorithms for the task. For large-scale systems, actively employ carbon estimation tools like Code Carbon and ML CO2 Impact to measure and manage your carbon footprint. Encouraging developers to incorporate these assessments into major projects can promote alignment of AI development with sustainability objectives.
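For instance, the codecarbon package can be wrapped around an existing training or inference script in a few lines; the workload below is a trivial placeholder standing in for your real job.

```python
# Measuring a workload's estimated footprint with the codecarbon package
# (pip install codecarbon). The workload below is a trivial placeholder.
from codecarbon import EmissionsTracker

def workload():
    # stand-in for your actual training or inference code
    return sum(i * i for i in range(10_000_000))

tracker = EmissionsTracker(project_name="llm-experiment")  # name is illustrative
tracker.start()
try:
    workload()
finally:
    emissions_kg = tracker.stop()        # estimated kg CO2eq for the tracked span
    print(f"Estimated emissions: {emissions_kg:.4f} kg CO2eq")
```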

Conclusion

As we look to the future, the potential of LLMs is limitless, offering unprecedented opportunities for advancement. It’s imperative that we embrace this potential while ensuring that our use of LLMs is environmentally responsible. By harnessing their power in a more sustainable manner, we not only maximize their benefits but also contribute to addressing pressing global issues like climate change. This strategic approach ensures that our progress is not only innovative but also environmentally conscious, paving the way for a brighter and more sustainable future for generations to come.
