Machine Learning Intern Journal — A New Year, A New Challenge

As the title indicates, this is the journal of a Machine Learning (ML) intern at the impactIA Foundation. I’ll be attempting to keep a weekly journal of my activities in the Foundation to keep track of my progress and leave a roadmap for the interns who come after me.

Léo de Riedmatten
impactIA
6 min read · Jan 13, 2021

Happy New Year! That might not be the right thing to say in these challenging times. Over the winter holidays, I was lucky to escape up to the snowy mountains to spend some time with my family and close friends. The end of a gruelling year, the start of a new one, the arrival of various promising vaccines: all of this gave me a sense of hope, a light at the end of the tunnel. But that hope may have been false, and the light just a temporary reflection. There is still a long way to go before we reach a sense of normalcy, and these trying times demand that we stay strong together. I wish you all the best for this new year.

There is another big challenge for humanity that, although less tangible right now, is knocking at our door: Climate Change. Justin Rowlatt wrote a great article about why 2021 could be a turning point for tackling climate change. You might be wondering why I’m bringing up climate change in a journal about my machine learning internship. Well, let’s talk about the ‘Cloud’ and the Internet, and their (often forgotten) environmental impact.

‘Cloud’ Storage and The Internet

It’s easy to be swept up by words like ‘cloud storage’ or ‘wireless transfer’ and forget that every cat video, Facebook post and Google search, literally everything on the internet (and yes, unfortunately that includes Trump’s excess tweets) and in your ‘cloud’ storage, isn’t just floating around in the air. It’s stored physically on a hard drive somewhere on Earth. Although carbon footprints are inherently difficult to calculate accurately, several sources suggest the Internet has a footprint similar to that of the aviation industry. Wait, what?

The main culprits are Data Centres. These range from small rooms to vast data farms: rows upon rows, columns upon columns of hard drives that store everything you find on the internet. All of this hardware requires a lot of power to run cloud computing and storage, on-demand films, music and entertainment, as well as your emails and old Facebook posts. Furthermore, you know that feeling when you’re using your computer and you start hearing noises and feeling warm, and then notice your computer is suffocating and burning your legs? Imagine hundreds, thousands of those all stacked onto each other. Data centres produce enormous amounts of heat, and cooling them down requires an equally enormous amount of energy. That’s why big companies like Facebook have their data centres in cool climates such as northern Sweden.

But it’s not enough. With the ever-increasing hunger for the services the Internet offers, there is a serious energy issue that needs to be solved. A lot of companies are committing to powering their data centres solely with renewable energy, but we need to find smart ways to reduce energy consumption altogether, regardless of where the energy is coming from. Let’s take a look at some advances being made in that regard.

DeepMind

The cooling of data centres typically uses large industrial equipment such as pumps, chillers and cooling towers. However, dynamic environments like data centres make it difficult to operate optimally for several reasons, as DeepMind points out in a blog post:

  1. The equipment, how we operate that equipment, and the environment interact with each other in complex, nonlinear ways. Traditional formula-based engineering and human intuition often do not capture these interactions.
  2. The system cannot adapt quickly to internal or external changes (like the weather). This is because we cannot come up with rules and heuristics for every operating scenario.
  3. Each data centre has a unique architecture and environment. A custom-tuned model for one system may not be applicable to another. Therefore, a general intelligence framework is needed to understand the data centre’s interactions.

DeepMind therefore trained an ensemble of neural networks on historical data collected by thousands of sensors (temperatures, power, pump speeds, etc.) around a Google data centre. This ensemble was given access to the controls of the data centre for a test and the results are mind-blowing: a 40% reduction in the amount of energy used for cooling. DeepMind say “We are planning to roll out this system more broadly and will share how we did it in an upcoming publication, so that other data centre and industrial system operators — and ultimately the environment — can benefit from this major step forward.” If you want to learn more, check out this blog post.

Typical day of testing showing PUE (Power Usage Effectiveness) — Taken from: https://deepmind.com/blog/article/deepmind-ai-reduces-google-data-centre-cooling-bill-40
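
For readers who haven’t met PUE before: it is the ratio of the total energy a facility draws to the energy that actually reaches the IT equipment, so a value of 1.0 would mean no overhead at all. Here is a minimal Python sketch of that arithmetic. The energy figures and the function name are hypothetical, chosen only to make the numbers easy to follow; they are not DeepMind’s or Google’s data.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical daily energy use (kWh), for illustration only.
it_kwh = 1000.0      # servers, storage, networking
cooling_kwh = 150.0  # pumps, chillers, cooling towers
other_kwh = 50.0     # lighting, power conversion losses, etc.

# PUE before any optimisation.
pue_before = pue(it_kwh + cooling_kwh + other_kwh, it_kwh)

# Apply a 40% reduction to the cooling energy only.
cooling_after_kwh = cooling_kwh * (1 - 0.40)
pue_after = pue(it_kwh + cooling_after_kwh + other_kwh, it_kwh)

print(f"PUE before: {pue_before:.2f}, after: {pue_after:.2f}")
```

With these made-up numbers, a 40% cut in cooling energy moves the PUE from 1.20 to 1.14. The real effect depends on how much of a given facility’s overhead is cooling in the first place.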

Microsoft’s Underwater Data Centres

A radical move away from land-based data centres might be on the horizon, with Microsoft currently testing underwater data centres. Now, water and electronics aren’t known for playing nice, so why the heck is Microsoft working on putting data centres underwater? There are actually many advantages to this approach:

  1. Around 40% of the world population lives within 100 km of the coast, so placing data centres in the water just off the coast would drastically reduce the infrastructure necessary to get the data to the people, as well as the latency (the time it takes for data to travel from source to destination).
  2. Microsoft estimates they could ship a data centre and get it up and running in 90 days, as opposed to 18 months to 2 years for land-based data centres.
  3. We talked about how much energy is used for cooling. Well, underwater, the seawater can be reliably cool enough all year round, with much less temperature variation than on land.
  4. These could be powered 100% by renewable energy, namely tidal or wave power — produced and used right then and there.
  5. Since no humans will have access to these underwater data centres, oxygen and water vapour (big contributors to failure of equipment due to corrosion) can be removed and replaced by nitrogen to produce a safe haven for the electronics.
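
To put the latency argument from point 1 into numbers, here is a rough Python sketch of round-trip propagation delay over optical fibre. It only accounts for the time light spends in the cable and ignores routing, switching and server processing, so real latencies will be higher. The refractive index of 1.47 is an assumed typical value for silica fibre, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBRE_REFRACTIVE_INDEX = 1.47      # assumption: typical value for silica fibre


def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds over a fibre of this length."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    speed_in_fibre_km_s = SPEED_OF_LIGHT_KM_S / FIBRE_REFRACTIVE_INDEX
    return 2 * distance_km / speed_in_fibre_km_s * 1000


# Compare a data centre just off the coast with one far inland or abroad.
for distance_km in (100, 2000):
    print(f"{distance_km} km: {round_trip_ms(distance_km):.2f} ms round trip")
```

Under these assumptions, a data centre 100 km away costs roughly 1 ms of round-trip propagation delay, against roughly 20 ms for one 2,000 km away.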

This may sound too good to be true, but Microsoft recently pulled a data centre container out of the water (it had been operational on the sea bed for two years) and concluded that underwater data centres are reliable, practical and use energy sustainably. Microsoft are now fully recycling the unit to demonstrate how they will sustainably terminate data centres once they reach the end of their lifecycle. A very promising approach for the future of data centres. But let’s go further and dream big.

The Future

If you thought underwater data centres were a crazy idea, what do you think about data centres in space? That’s right. Imagine a data centre orbiting the Earth, using the Sun as its power source and the icy vacuum around it to cool it down. Of course, even ignoring the huge costs of getting a data centre up into orbit, there are still some crucial challenges that will need to be overcome, ranging from connectivity issues to maintenance, to protection from debris and radiation. We’re not there yet, but with SpaceX slashing launch costs, the idea is sounding a little less crazy today than it did a few years ago.

Conclusion

In this blog post we explored the ‘Cloud’ and how its name isn’t very accurate in terms of its actual composition. Data centres account for a gigantic share of energy consumption, and while a lot of companies are working towards sourcing that energy from renewables, the ever-increasing demand for data storage means we also need progress in reducing the energy used. We saw that AI can be used to drastically reduce consumption by taking over the cooling controls, and that Microsoft is making underwater data centres a reality. One day, we might have data centres in space.

In the meantime, you can reduce your own share of the energy consumption of data centres by regularly deleting your mail, as well as your old Facebook posts, tweets, and old photos or videos you don’t want anymore. Remember that every single piece of information you access on the Internet isn’t just floating around; it’s being stored on a physical hard drive somewhere in the world.

--

Léo de Riedmatten
impactIA

BSc in Computer Science & Artificial Intelligence with Neuroscience from Sussex University, currently a Machine Learning Intern at impactIA in Geneva (CH).