Stop doing dirty ML; Sustainable AI/ML Guide
Sustainable ML Episode 1: Identify > Evaluate > Optimize
Are you sure you are doing ML sustainably? In this three-part series, we'll look at the carbon footprint of "dirty" AI/ML workloads and the steps toward sustainable AI.
Episode 1, identify the impact of your ML workload, evaluate alternatives and optimize further.
Episode 2, reduce environmental impact during ML model development, training, and tuning.
Episode 3, further reduce the environmental impact of your ML production workload.
Please consider following so you don't miss Episodes 2 and 3 and similar articles in the coming days.
In Episode 1, you will learn about:
- Sustainability Warnings: Why you need to be sustainable right now.
- Sustainability Pillars: How to save money and the environment at the same time using six pillars.
- Identify business goal, Evaluate ML problem, Optimize data processing
1. Sustainability Warnings ⚠
Deep learning (DL) models like GPT-3 can yield excellent results on many tasks, but they require long training runs on specialized hardware. This energy-intensive workload has grown dramatically, and as a result ML may contribute significantly to climate change.
Studies: The Carbontracker study estimated that training OpenAI's giant GPT-3 text-generating model produced 85,000 kg of CO2 equivalents, the same as a car travelling 700,000 km, or roughly twice the distance between Earth and the Moon. Let's look at the parameters with which we start our evaluation.
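The figures quoted above can be sanity-checked with a little arithmetic. The sketch below derives the per-kilometre emission factor implied by the two numbers and compares the driving distance with the Earth-Moon distance. The average Earth-Moon distance of 384,400 km is an assumption added here, not a figure from the study. It also shows the usual back-of-the-envelope formula (energy used times grid carbon intensity); the function names and the example job are hypothetical.

```python
# Figures quoted in the text (Carbontracker estimate for GPT-3 training).
CO2E_KG = 85_000
CAR_KM = 700_000

# Assumption (not from the study): average Earth-Moon distance in km.
EARTH_MOON_KM = 384_400


def implied_g_per_km(co2e_kg: float, km: float) -> float:
    """Grams of CO2e per km implied by the car comparison."""
    return co2e_kg * 1000 / km


def estimate_co2e_kg(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Emissions = energy used x carbon intensity of the grid."""
    return energy_kwh * intensity_g_per_kwh / 1000


if __name__ == "__main__":
    print(f"{implied_g_per_km(CO2E_KG, CAR_KM):.1f} g CO2e per km")
    print(f"{CAR_KM / EARTH_MOON_KM:.2f} x Earth-Moon distance")
    # Hypothetical job: 1,000 kWh on a grid emitting 400 g CO2e per kWh.
    print(f"{estimate_co2e_kg(1_000, 400):.0f} kg CO2e")
```

Under that assumed distance, 700,000 km works out to about 1.8 times the Earth-Moon distance, so "twice" is a round-up.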
2. Sustainability Pillars ✅
The AWS Well-Architected Framework explains the do's and don'ts for your ML workload decisions. Its six pillars are operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability.
3. 1) Identify business goal, 2) Evaluate ML problem, 3) Optimize data processing
Let's dive deep into the first three steps toward sustainability to evaluate ML workloads in this episode.
- Identify the Business Goal 🎯
You should have a clear idea of the problem and of the business value gained by solving it. Measure business value against specific objectives and success criteria.
- ML problem framing ✔️
An important step is to determine whether ML is the best option to solve this problem. When a simpler, more sustainable strategy may succeed just as well, there’s no need to deploy computationally intensive AI.
a) Always consider pre-trained models and fine-tune them further. Sites like Model Zoo, TensorFlow Models & Datasets, PyTorch Hub, Papers with Code, AWS, and Hugging Face 🤗 save you time and cost and spare the environment.
b) Choose data centers that run on wind and solar energy. AWS offers many data center options that use clean energy. Similar options are available from Azure, GCP, and some other cloud vendors.
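One way to make the data center choice concrete is to compare regions by grid carbon intensity before launching a job. This is a minimal sketch, not a provider API: the region names and intensity values below are hypothetical placeholders, and real figures would have to come from your cloud vendor or a grid-data source.

```python
# Hypothetical carbon intensities in g CO2e per kWh. These numbers are
# placeholders for illustration, not measurements for any real region.
REGION_INTENSITY = {
    "region-a": 450.0,
    "region-b": 120.0,
    "region-c": 30.0,
}


def greenest_region(intensity, allowed=None):
    """Return the allowed region with the lowest carbon intensity."""
    candidates = {
        name: value
        for name, value in intensity.items()
        if allowed is None or name in allowed
    }
    if not candidates:
        raise ValueError("no candidate regions to choose from")
    return min(candidates, key=candidates.get)


def job_emissions_kg(energy_kwh, region, intensity):
    """Estimated emissions of a job that uses energy_kwh in a region."""
    return energy_kwh * intensity[region] / 1000


if __name__ == "__main__":
    best = greenest_region(REGION_INTENSITY)
    print(best, job_emissions_kg(500, best, REGION_INTENSITY), "kg CO2e")
    # Data-residency rules may restrict the choice to a subset of regions.
    print(greenest_region(REGION_INTENSITY, allowed={"region-a", "region-b"}))
```

The `allowed` parameter is there because latency or data-residency constraints often rule out the cleanest region, in which case you pick the best of what remains.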
- Data processing (includes data collection, data preprocessing, feature engineering) ⚙️
For data collection, find pre-cleaned data so you don't waste resources cleaning raw data. Try finding clean open data on sites like Open Data, Kaggle, Google Dataset Search, and the 65+ sites listed here.
For pre-processing, go serverless. You might be running cloud instances that are utilized only during specific operations; try to find serverless alternatives. With the major cloud providers, you can orchestrate a complete data preparation pipeline using serverless services. Here are a few blogs to refer to: AWS, GCP, Azure.
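To give a feel for what a serverless pre-processing step can look like, here is a minimal sketch written in the style of an AWS Lambda Python handler, which receives an `event` and a `context`. The `records` key in the event and the `clean_record` helper are hypothetical; a real pipeline would more likely read its input from object storage.

```python
import json


def clean_record(record):
    """Trim strings, normalize keys, and drop empty fields from one row."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
        if value in ("", None):
            continue
        cleaned[key.strip().lower()] = value
    return cleaned


def handler(event, context):
    """Entry point in the style of an AWS Lambda Python handler.

    Assumption: the event carries the raw rows under a hypothetical
    "records" key.
    """
    rows = [clean_record(r) for r in event.get("records", [])]
    rows = [r for r in rows if r]  # skip rows that ended up empty
    return {"statusCode": 200, "body": json.dumps({"rows": rows})}


if __name__ == "__main__":
    sample = {"records": [{" Name ": "  Ada ", "age": 36, "note": ""}]}
    print(handler(sample, None))
```

Because the function only runs when an event arrives, no instance sits idle between pre-processing jobs.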
Implement data lifecycle policies. Classify data to understand its impact on your workflow and business. This information can help you cycle data to energy-efficient storage or remove it safely. These are options provided by AWS, Azure, and GCP.
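As an illustration of a lifecycle policy, the sketch below builds a rule that moves objects to colder storage and later deletes them, using the dictionary shape that Amazon S3 lifecycle configurations take. The prefix, the day counts, the bucket name, and the `lifecycle_rules` helper are hypothetical; choose values that match how you classified your data.

```python
import json


def lifecycle_rules(prefix, archive_after_days, delete_after_days):
    """Build a lifecycle configuration for objects under `prefix`."""
    if delete_after_days <= archive_after_days:
        raise ValueError("deletion must come after archiving")
    return {
        "Rules": [
            {
                "ID": f"archive-then-expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to archival storage after N days.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                # Delete objects once they are no longer needed.
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


if __name__ == "__main__":
    config = lifecycle_rules("raw/", archive_after_days=30, delete_after_days=365)
    print(json.dumps(config, indent=2))
    # With boto3 installed and credentials configured, the policy could be
    # applied to a (hypothetical) bucket like this:
    # boto3.client("s3").put_bucket_lifecycle_configuration(
    #     Bucket="my-example-bucket", LifecycleConfiguration=config)
```

Azure and GCP expose comparable lifecycle rules with their own formats, so treat this as one example rather than a template for every provider.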
To conclude, these three steps of the ML pipeline are a good place to start toward a sustainability goal. You learned how to narrow down business objectives, values, and outcomes, and why it matters to use pre-trained models and choose clean energy. Finally, you saw where to find clean data, how to apply data lifecycle policies, and how to use serverless orchestration.
I am in the process of writing the next two episodes, stay tuned. Cheers.
Please support me below to keep bringing such free content and add value to the community.
Donate to Sameer Goel