Open Catalyst Project: Using Machine Learning to Accelerate the Search for Low-Cost Catalysts for Renewable Energy Storage. Part 1

Shuvam Das
Published in deepkapha notes
Feb 23, 2023

Machine Learning for Identifying Catalysts and Materials for Renewable Energy Storage

Converting wind and solar energy into fuels like hydrogen is an important approach to storing and transporting renewable energy. This can be achieved through electrolysis, which uses electricity from renewable sources to split water molecules into hydrogen and oxygen gases[1]. The approach can contribute to the transition towards a more sustainable, low-carbon energy future. In this section, we discuss the data sources available for machine learning applications in two areas: identifying low-cost catalysts for renewable energy storage, and discovering high-capacity materials for hydrogen storage.[2]

Open Catalyst Project: Using Machine Learning to Accelerate the Search for Low-Cost Catalysts for Renewable Energy Storage

As the world’s energy needs continue to grow, renewable energy storage solutions are essential to mitigate the impact of climate change. One promising approach involves converting renewable resources, such as wind and solar energy, into fuels like hydrogen, but this transformation requires catalysts that can drive the necessary chemical reactions at high rates and low costs[3]. Finding suitable catalysts has traditionally been a time- and resource-intensive process, with conventional methods unable to evaluate more than tens of thousands of chemical structures per year, while there are billions of possible combinations of elements to test. To address this challenge, the Open Catalyst Project was launched as a collaboration between Meta AI and Carnegie Mellon University’s Department of Chemical Engineering.

The project aims to develop machine-learning models that simulate chemical reactions and accelerate the discovery of low-cost catalysts. One roadblock for researchers developing these models has been a lack of sufficient training data. As part of the project, OC20, the world’s largest training dataset of materials for renewable energy storage, was open-sourced. The dataset contains hundreds of millions of DFT data points, and its release has enabled considerable progress in the field. The challenge remains, however, to find better catalysts for the chemical reactions that produce green hydrogen fuel from wind and solar energy, a critical step in storing renewable energy.[4]

This simulation, magnified 10^8 times, models a typical relaxation between an adsorbate (in color) and a catalyst’s surface. While it looks simple, current DFT methods take hours or even days to calculate all the quantum mechanical forces interacting at this level.
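To make the idea of a relaxation concrete, here is a minimal sketch, not the Open Catalyst workflow itself. It assumes a single adsorbate atom whose height above the surface is governed by a one-dimensional Lennard-Jones potential; the potential, the parameter values, and the function names are all illustrative assumptions. The loop repeatedly asks for the force and moves the atom along it until the force vanishes. In practice, the force call is the step that DFT makes expensive and that a machine-learning model aims to make cheap.

```python
# Toy relaxation: an adsorbate's height z above a surface is adjusted until
# the force on it vanishes. A 1-D Lennard-Jones potential stands in for the
# DFT (or machine-learned) force calculation; all parameters are illustrative.

EPSILON = 1.0  # well depth, arbitrary energy units (assumed)
SIGMA = 1.0    # length scale, arbitrary length units (assumed)


def energy(z: float) -> float:
    """Lennard-Jones energy of the adsorbate at height z."""
    s6 = (SIGMA / z) ** 6
    return 4.0 * EPSILON * (s6 * s6 - s6)


def force(z: float) -> float:
    """Force along z, i.e. -dE/dz. In a real workflow this is the costly call."""
    s6 = (SIGMA / z) ** 6
    return 24.0 * EPSILON * (2.0 * s6 * s6 - s6) / z


def relax(z0: float, step: float = 0.01, fmax: float = 1e-6, max_steps: int = 10_000):
    """Steepest descent: move along the force until |force| < fmax."""
    z = z0
    for n in range(max_steps):
        f = force(z)
        if abs(f) < fmax:
            return z, n
        z += step * f
    raise RuntimeError("relaxation did not converge")


z_relaxed, n_steps = relax(z0=1.5)
print(f"relaxed height: {z_relaxed:.4f} after {n_steps} steps")
```

For this toy potential the minimum is known analytically, at 2^(1/6) times SIGMA, so the result is easy to check. Real relaxations move many atoms in three dimensions and usually use more sophisticated optimizers, but the structure of the loop is the same.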

New OER Dataset for Low-Cost Catalysts: Advancing Renewable Energy Technologies through Machine Learning and Computational Chemistry

To overcome this challenge, a new dataset focusing on oxide catalysts for the Oxygen Evolution Reaction (OER) has been announced. The OER dataset contains approximately eight million data points from 40,000 unique simulations, spanning a variety of oxide materials across 52 elements. The dataset explores interactions between the surfaces of oxide materials and important molecules involved in OER, as well as surface interactions with other molecules. The interactions on the surface are also analyzed when crystal defects and multiple molecules are present. The dataset and baseline models will be open-sourced to help the global scientific community advance renewable energy technologies.[5]

Researchers typically use quantum mechanical simulation tools such as density functional theory (DFT) to predict the adsorption energies of small molecules on potential catalysts. However, this process can take hundreds of hours to complete on a multicore machine. Machine learning can accelerate it, replacing DFT simulations that currently take hours or days with predictions that take a few seconds. To do so, the machine-learning models must be trained on a dataset of DFT-computed configurations and energies, so that their predictions reproduce the DFT results.[6]
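The train-then-predict pattern described above can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: the "DFT" labels are synthetic, the single surface descriptor and its linear trend are invented, and a least-squares line stands in for the far larger neural-network models the project actually trains. The helper names (`adsorption_energy`, `fit_line`, `predict`) are hypothetical.

```python
import random

# Hypothetical "DFT" training set: each sample pairs a surface descriptor with
# an adsorption energy E_ads = E(slab+adsorbate) - E(slab) - E(adsorbate).
# The linear trend and noise level below are invented for illustration only.


def adsorption_energy(e_total: float, e_slab: float, e_adsorbate: float) -> float:
    """Adsorption energy from three total energies (eV)."""
    return e_total - e_slab - e_adsorbate


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(0)
descriptors = [rng.uniform(0.0, 2.0) for _ in range(50)]
# Stand-in for expensive DFT labels: a hidden trend plus small noise.
energies = [-0.8 * d + 0.3 + rng.gauss(0.0, 0.02) for d in descriptors]

slope, intercept = fit_line(descriptors, energies)


def predict(descriptor: float) -> float:
    """Surrogate prediction: a cheap formula instead of a new DFT run."""
    return slope * descriptor + intercept


# Illustrative total energies (eV) for one slab/adsorbate pair.
example_e_ads = adsorption_energy(-350.2, -340.0, -9.5)
print(f"example E_ads from total energies: {example_e_ads:.2f} eV")
print(f"predicted E_ads at descriptor 1.0: {predict(1.0):.2f} eV")
```

The expensive step, generating the labels, happens once. Afterwards each new prediction costs a multiplication and an addition, which mirrors the trade between DFT and a trained model on a much smaller scale.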

Advancements in Computational Chemistry and Machine Learning for Hydrogen Storage Material Discovery

Generating the new OER dataset required tens of millions of compute hours. Meta committed to offsetting 100 percent of the carbon emissions from the compute resources used, as part of its Net Zero program. The Open Catalyst Project aims to facilitate scientific progress in computational chemistry by providing an open-source dataset that allows researchers to overcome the computational limitations of previous methods. The hope is that the project will help the community discover promising new materials at scale, specifically low-cost catalysts for the energy industry. Such catalysts are critical for meeting global energy needs while reducing the impact of climate change.

Automobile engine development has shifted from large gasoline engines to smaller turbocharged ones, and from conventional gasoline-powered vehicles to hybrid and electric ones. Hydrogen fuel cells have also emerged as an alternative, because they use a clean fuel and produce power through electrochemical reactions rather than combustion. High-pressure hydrogen storage tanks, however, bring high production costs and safety hazards such as leakage and explosion. To address these problems, adsorptive hydrogen storage using nanoporous materials (NPMs) such as zeolites, carbon-based materials, and metal-organic frameworks (MOFs) has been explored. High-throughput molecular simulations have been used to screen the adsorption properties of large numbers of NPMs, and machine learning algorithms have been used to predict adsorption properties from the structural information of NPMs, reducing the total computational cost of the screening.

A generic fuel cell electric vehicle (FCEV) powertrain topology includes a fuel cell system (FCS), high-voltage (HV) battery, electric drive, supercapacitor (SC), transmission, low-voltage (LV) auxiliary loads, onboard charger, and related power electronics converters.[7]

Most simulation and machine learning approaches to nanoporous material (NPM) discovery predict an adsorption property at the same thermodynamic state for all materials. This is insufficient to rank NPMs accurately for hydrogen storage, because storage performance varies non-monotonically with operating conditions. To overcome this limitation, adsorption must be predicted at multiple temperatures and pressures, which produces a large “meta-dataset” made up of many small adsorption datasets. An adsorption isotherm function (AIF) with temperature dependence is fitted individually to each dataset, so that the adsorption property can be predicted at any thermodynamic state. This leads to the discovery of NPMs with higher hydrogen storage working capacities.[8]

A team of researchers from the Technical University of Denmark and other institutions has developed a machine learning model that uses an encoder-decoder architecture to generate a latent representation of the data for each sorbate-sorbent system, so that only a fingerprint of five coefficients needs to be stored for each material. The hydrogen loading data in the meta-dataset are obtained from Monte Carlo simulations and cover all-silica zeolites, hypothetical all-silica zeolites with high predicted synthesizability, hypothetical metal-organic frameworks, and configurations drawn from nine different hard-templating carbons.
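A small sketch can show why a single thermodynamic state is not enough to rank materials. It assumes a Langmuir isotherm with a van 't Hoff temperature dependence as a stand-in for the adsorption isotherm function; this three-coefficient form is not the five-coefficient fingerprint from the study. The two materials, their coefficients, and the 100 bar to 5 bar pressure swing are invented for illustration.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def loading(p_bar: float, t_kelvin: float, q_max: float, b0: float, heat: float) -> float:
    """Langmuir loading with a van 't Hoff temperature dependence.

    q_max, b0 and heat are hypothetical fitted coefficients: saturation
    loading, affinity prefactor (1/bar) and heat of adsorption (J/mol).
    """
    b = b0 * math.exp(heat / (R * t_kelvin))
    return q_max * b * p_bar / (1.0 + b * p_bar)


def working_capacity(t_kelvin: float, coeffs, p_full: float = 100.0, p_empty: float = 5.0) -> float:
    """Deliverable hydrogen for a pressure swing between p_full and p_empty."""
    return loading(p_full, t_kelvin, *coeffs) - loading(p_empty, t_kelvin, *coeffs)


# Two invented materials that differ only in binding strength.
material_a = (40.0, 1e-4, 6000.0)  # strong binding
material_b = (40.0, 1e-4, 4000.0)  # weak binding

# Ranking at one state (77 K, 100 bar) versus ranking by working capacity.
uptake_a = loading(100.0, 77.0, *material_a)
uptake_b = loading(100.0, 77.0, *material_b)
wc_a = working_capacity(77.0, material_a)
wc_b = working_capacity(77.0, material_b)

# Scan temperature to find where material A delivers the most hydrogen.
temperatures = range(60, 201)
best_t_a = max(temperatures, key=lambda t: working_capacity(t, material_a))

print(f"uptake at 77 K, 100 bar: A={uptake_a:.1f}, B={uptake_b:.1f}")
print(f"working capacity at 77 K: A={wc_a:.1f}, B={wc_b:.1f}")
print(f"best temperature for A: {best_t_a} K")
```

With these invented coefficients, material A adsorbs more hydrogen at the single storage state, yet material B delivers more over the pressure swing because A retains most of its hydrogen at the low pressure. A's working capacity also peaks at an intermediate temperature rather than at either end of the scan. Once an isotherm function has been fitted per material, such comparisons can be made at any temperature and pressure without running new simulations.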

REFERENCES

[1] https://www.energy.gov/eere/fuelcells/hydrogen-production-electrolysis

[2] https://www.sciencedirect.com/science/article/pii/S2211467X19300082

[3] https://www.un.org/en/climatechange/raising-ambition/renewable-energy

[4] https://ai.facebook.com/blog/accelerating-renewable-energy-with-a-new-data-set-for-green-hydrogen-fuel/

[5] https://www.cheme.engineering.cmu.edu/news/2022/04/19-new-data-set-accelerates-search-renewable-energy-sources.html

[6] https://pubmed.ncbi.nlm.nih.gov/36304919/

[7] https://www.mdpi.com/1996-1073/15/24/9557

[8] https://www.researchgate.net/publication/338635870_Data_Mining_for_Binary_Separation_Materials_in_Published_Adsorption_Isotherms
