Nanomaterials Discovery using Machine Learning

Elanu Karakus
22 min read · Feb 27, 2023


Nanomaterials have been a part of our lives since ancient civilizations. With the discovery of fire, early humans encountered nanoparticles (without knowing it) in the carbon-rich smoke it produced as a by-product. Our ancestors used these carbon nanomaterials for cave paintings.

Egyptians produced “Egyptian Blue” -the oldest known artificial pigment- and even used it for hair dye. Egyptian blue pigments were nanoparticles only 5 nanometers in size.

Egyptian Blue

On 29 December 1959, Richard Feynman opened up the field that became known as nanotechnology by giving a lecture entitled “There’s Plenty of Room at the Bottom”.

“What I want to talk about is the problem of manipulating and controlling things on a small scale.” -Richard Feynman

Since then, nanotechnology studies keep accelerating and the discovery of nanomaterials has been revolutionizing our lives.

The discovery of fullerenes was awarded the Nobel Prize in Chemistry in 1996, and the isolation of graphene in 2004 was awarded the Nobel Prize in Physics in 2010.

In this article, I’ll talk about nanomaterials and how machine learning is being used to accelerate the discovery of nanomaterials.

What to expect?

Nanomaterials

Nanomaterials can be defined as materials possessing at least one external dimension measuring 1–100 nanometers.

Nanomaterials can occur naturally or be produced through engineering to perform a specialized function. And they have become ubiquitous in daily life.

From agriculture to printing, nanomaterials are a part of different industries.

The biggest share of nanoproducts belongs to the electronics sector, followed by medicine and cosmetics.

Different industry branches with their share of nanoproducts (Talebian et al, 2021).

Nanomaterials are categorized by their dimensions and sizes.

Dimensions (Kebede and Imae, 2019)

Zero-dimensional (0D): All dimensions are under 100 nm.

One-dimensional (1D): Two dimensions are under 100 nm while the other one is over 100 nm.

Two-dimensional (2D): One dimension is under 100 nm while the other two are over 100 nm.

Three-dimensional (3D): All dimensions are over 100 nanometers. This means they are not technically nanomaterials but bulk materials.

Let’s look at some examples of nanomaterials classified by their dimensions.

Zero-Dimensional Nanomaterials

Nanospheres and clusters such as quantum dots, fullerenes, and metallic nanoparticles are examples of 0D materials.

🚨Material Highlight: Quantum Dots🚨

SEM and TEM images of ZIF-8 (A, D), ZIF-8/GQD (B, E), and DOX-ZIF-8/GQD (C, F) nanoparticles.

Quantum dots are zero-dimensional, semiconducting nanomaterials. They are often called “artificial atoms” because their discrete energy levels resemble those of single atoms, so they can be used to study how atoms behave. They have a typical diameter of 2–10 nm. Quantum dots are man-made nanoscale crystals that can transport electrons.

When UV light hits these semiconducting nanoparticles, they can emit light of various colors. This makes them “glow” under the UV light.

Quantum dots have “electronic energy levels” between which electrons can move.

When a quantum dot is at rest, its electrons are at the lowest energy band they can reach. This is called the ground state.

When a photon hits the quantum dot’s electron, the electron absorbs the energy of the photon and jumps to a higher energy band. Here, the electron is in the excited state.

Quantum dots luminesce: when photons excite their electrons, the electrons eventually fall back down and emit light to release the energy they absorbed.

QDs exhibit different colors of emission with changes in size.

For instance, smaller dots have a larger energy difference between energy bands. To get all the way back down to the ground state, electrons must emit a photon with higher energy, creating colors with short wavelengths, like blue or violet.
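
To make the size-color relationship concrete, here is a minimal Python sketch of the Brus equation, a textbook approximation for how a quantum dot’s band gap widens as its radius shrinks. The material constants are typical literature values for CdSe and are my assumptions for illustration, not from this article.

```python
import math

H = 6.626e-34         # Planck constant (J*s)
E_CHARGE = 1.602e-19  # elementary charge (C)
M0 = 9.109e-31        # electron rest mass (kg)
EPS0 = 8.854e-12      # vacuum permittivity (F/m)

def brus_gap_ev(radius_nm, bulk_gap_ev=1.74, m_e=0.13, m_h=0.45, eps_r=10.6):
    """Effective band gap (eV) of a spherical quantum dot of a given radius."""
    r = radius_nm * 1e-9
    confinement = (H**2 / (8 * r**2)) * (1 / (m_e * M0) + 1 / (m_h * M0))
    coulomb = 1.8 * E_CHARGE**2 / (4 * math.pi * eps_r * EPS0 * r)
    return bulk_gap_ev + (confinement - coulomb) / E_CHARGE

for d_nm in (4, 6, 8, 10):          # dot diameters in nm
    gap = brus_gap_ev(d_nm / 2)
    wavelength = 1239.84 / gap      # convert eV to emission wavelength in nm
    print(f"{d_nm} nm dot: gap ~ {gap:.2f} eV, emits near {wavelength:.0f} nm")
```

Running it reproduces the trend above: the smallest dots emit the shortest (bluest) wavelengths.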

Unique properties of quantum dots are used in different topics such as solar cells, bioimaging, cancer treatment…

One-Dimensional Nanomaterials

Nanotubes, wires, and rods, such as carbon nanotubes, metallic nanorods, and silver nanowires, are examples of 1D materials.

🚨Material Highlight: Carbon Nanotubes🚨

SEM (scanning electron microscope) images of (a) multi-walled carbon nanotube (MWCNT) bundles and (b) agglomerations.

A carbon nanotube is a tube made of carbon with a diameter typically measured in nanometers. Carbon nanotubes (CNTs) are extremely long, thin cylinders that can be made from sheets of carbon atoms bound together in a hexagonal lattice structure.

CNTs possess a unique combination of high stiffness, high strength, low density, small size, and a broad range of electronic properties from metallic to semiconducting.

CNTs come in two forms. The first is multi-walled, with a structure of nested tubes. The second, the basic form of a single rolled-up graphitic sheet, is called a single-walled CNT.

A vector connecting the centers of two equivalent hexagons on the graphene sheet is called the chiral vector, and it determines the structure of a single-walled carbon nanotube.

A carbon nanotube can be specified by a chiral index (n, m), and its chiral vector can be expressed as:

Ch = n·a1 + m·a2

For this vector, n and m are integer chiral indices, a1 and a2 are the unit vectors of the graphene lattice, and |a1|​ = |a2|​ is the lattice constant of graphite. Varying n and m changes the structure of the CNT. The lattice constant is an important physical dimension that determines the geometry of the unit cells in a crystal lattice and is proportional to the distance between atoms in the crystal.

A nanotube with general indices (n, m) is called a chiral CNT. When m = 0, we have a “zigzag” CNT. When n and m are equal (n = m), the shape of our carbon nanotube is called an “armchair”.
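
To make this concrete, here is a minimal Python sketch (my own illustration, not from any study cited here) that computes two things the chiral indices determine: the tube’s diameter and its approximate electronic character. The value 0.246 nm is the standard graphene lattice constant |a1| = |a2|.

```python
import math

A_LATTICE = 0.246  # graphene lattice constant |a1| = |a2| in nm

def cnt_diameter_nm(n, m):
    """Diameter of an (n, m) single-walled CNT from the chiral vector length."""
    return A_LATTICE * math.sqrt(n**2 + n * m + m**2) / math.pi

def cnt_kind(n, m):
    """Classify the shape, plus the standard metallic/semiconducting rule of thumb."""
    shape = "zigzag" if m == 0 else "armchair" if n == m else "chiral"
    character = "metallic" if (n - m) % 3 == 0 else "semiconducting"
    return shape, character

for n, m in [(10, 10), (17, 0), (12, 8)]:
    shape, character = cnt_kind(n, m)
    print(f"({n},{m}): {cnt_diameter_nm(n, m):.2f} nm diameter, {shape}, {character}")
```

Running this shows, for example, that a (10, 10) armchair tube is about 1.36 nm across and metallic, while a (12, 8) chiral tube of similar diameter is semiconducting.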

Two-Dimensional Nanomaterials

Thin films, plates, and sheets such as graphene are examples of 2D materials.

🚨Material Highlight: Graphene🚨

Scanning electron microscopy images of (A) graphene oxide

Graphene is a two-dimensional flat sheet with only one layer of hexagonal-structured carbon atoms. The carbons are arranged in a hexagonal honeycomb pattern only 0.3 nanometers thick, with about 0.1 nanometers between each atom.

The EU devoted €1bn ($1.3bn) to graphene projects between 2013 and 2023 to find out if it can transform a range of sectors such as electronics, energy, health, and construction.

Graphene stands out for being tough, flexible, light, and remarkably strong. It is estimated that this material is 200 times stronger than steel and 5 times lighter than aluminum.

Even when it is more than one atom thick, the material still retains some of the unique properties of the single-layer form, up to a thickness of about 10 layers.

Graphene is extracted from graphite -yes, the same material as in your pencil.

Graphene repels water, meaning it is not soluble or dispersible in water.

Graphene oxide is made by treating graphite with strong oxidizers like sulfuric acid, alongside a catalyst. These methods add oxygen atoms to its surface and make the material hydrophilic, meaning it can be dispersed in water.

Graphene oxide

In addition to graphene’s applications, graphene oxide is also important for a broad range of uses. It is used to make transparent films for flexible electronics, solar cells, and chemical sensors such as drug tests.

Synthesis

To synthesize a nanomaterial, you can follow two methods: top-down and bottom-up approaches.

The synthesis of nanomaterials via top-down and bottom-up approaches.

Top-Down Approaches

By dividing bulk materials, we can produce nanostructured materials. Mechanical milling, laser ablation, etching, sputtering, lithography, and electro-explosion methods are examples of top-down approaches.

Let’s look into an example -nanolithography.

Nanolithography is a powerful and versatile tool to fabricate nanoscale patterns.

Nanolithography techniques make use of the properties of light or electrons to create patterns in a substrate. This patterning can be targeted by placing masks over the photoresist to protect areas from the incoming light. The pattern is then etched onto the uncovered areas and, if necessary, the previously masked areas can be removed. There are different types of nanolithography. Let’s look at the electron beam technique.

How does electron-beam lithography work?

Schematic process of e-beam lithography.

After the sample is prepared, a PMMA resist solution is spin-coated onto it and baked to harden the film and remove any remaining solvent.

Selected areas of the sample are exposed to a beam of high-energy electrons while other areas are protected.

The sample is immersed in developer solution to selectively remove resist from the exposed area.

E-beam lithography machines are costly (around 5 million USD) and can take multiple hours to pattern an entire wafer.

Bottom-Up Approaches

By “gathering” molecular/atomic-level structures, we can produce nanomaterials. Chemical vapor deposition, solvothermal and hydrothermal methods, the sol-gel method, and reverse micelle methods are examples of bottom-up approaches.

Let’s look into an example -solvothermal and hydrothermal methods.

In the hydrothermal method, nanomaterials are produced through a reaction carried out in an aqueous medium at high pressure and temperature in a sealed vessel.

The solvothermal method is like the hydrothermal method except it is carried out in a non-aqueous medium. Hydrothermal and solvothermal methods are generally carried out in closed systems.

Schematic of a solvothermal vessel.

As an example, this study used hydrothermal and solvothermal methods to produce iron oxide nanostructures, with high temperatures and sealed vessels.

Facile synthesis of morphology and size-controlled α-Fe2O3 and Fe3O4 nano-and microstructures by hydrothermal/solvothermal process: The roles of the reaction medium and urea dose. (Minhua et al, 2016)

Industry

Many nanomaterials are being developed to enhance bulk materials with their interesting characteristics such as conductivity, strength, lightness, and so forth (Barhoum et al, 2019).

This has expanded the global market for nanoproducts.

Each year, we see a rise in gross revenue generated from nanoproducts across different industries.

According to Global Industry Analysts, Inc., the nanomaterial market in the United States (US) was 2.1 billion USD in 2021, and in China, it should reach 1.2 billion USD by 2026 (compound annual growth rate of 11.4% for the studied period). Similarly, in Japan, Canada, and Germany, the compound annual growth rate is estimated at 8.1%, 8.7%, and 9.1%, respectively, for the same period.

Millions of dollars are being invested to see where this industry goes.

Will graphene become the lighter but tougher alternative for construction materials?

Can we find a way to build a space elevator with carbon nanotubes?

Are quantum dots the solution for improving solar cells?

These are the questions we want to answer.

How to answer them? Simple.

We need to design, manufacture, or even discover nanomaterials that can do precisely what we want them to do.

Traditional Discovery and Design of Nanomaterials

Discovering, designing, and manufacturing nanomaterials are NOT simple or cost-effective.

We looked at methods to synthesize nanoparticles -E-beam lithography machines cost around 5 million dollars. Not exactly cheap, right?

We furthermore have to keep in mind that discovering and designing nanomaterials is a trial-and-error process by nature.

In traditional nanomaterials science, experimental and computational simulating methodologies are two main routes to explore nanomaterials, both of which require specialized domain knowledge (Wang et al, 2019).

The steps you would follow would be:

Learn the domain knowledge -> Get access to a lab, equipment, and materials -> Design your experiment -> Execute your experiment -> Measure your nanomaterial’s properties -> If you did not accomplish your goal, start again.

For instance, to build a space elevator, you need a carbon nanotube without a single flaw: researchers at Hong Kong Polytechnic University found that even a single atom out of place in the structure of a carbon nanotube can drastically reduce its strength.

Space Elevator

But synthesizing a perfect carbon nanotube is limited by experimental conditions, especially with the increasing chemical complexity.

To synthesize complex nanomaterials (like our perfect carbon nanotube), you need sophisticated equipment, extreme experimental conditions, and practical experience.

This raises a problem: we cannot efficiently predict new nanomaterials and their properties with the two traditional routes (experimentation and computational simulation).

So, we need to develop a new paradigm with both time and performance efficiency.

Machine learning is paving a promising path to accelerating nanomaterials discovery and design.

Nanomaterial characteristics are significantly more difficult to predict than those of conventional materials due to quantum effects at such small sizes, which makes machine-learning approaches especially valuable.

Nanomaterials Discovery using Machine Learning

Machine learning is a subset of artificial intelligence, which involves creating machines that are capable of learning from data and making decisions.

A machine learning model is a program that can find patterns or make decisions from a previously unseen dataset.

The general process of how we use machine learning in nanomaterials science has two steps:

  • feature engineering from database to features,
  • model building from features to models.
Machine Learning Process

Before these two steps, we need to gather our data.

The data we might use depend on what outcome we want.

If we want to work on quantum dots, we can find a data set containing different properties of quantum dots.

But how are these data sets formed?

One approach is using experimental data. From experiments executed in real life, we can build machine learning models to predict the result and actually confirm if our model worked or not.
The second approach is using simulated data. We can use modeling software or simulations to create our data set.

Researchers also perform meta-analyses to collect data. This means they scrape experimental or simulated data from numerous research papers in the literature and collect it into one large data set.

After we acquire and understand our data set, we continue with “data cleaning,” then move on to feature engineering: selecting the features we want to use.

In feature engineering, some attributes are first generated to represent the raw nanomaterials.

The features are encoded from the available parameters, such as structural properties (dimension, crystal structure), mechanical properties (strength, Young’s modulus, hardness), electrical properties (dielectric strength, resistivity, permittivity), magnetic properties, and thermal properties.

Then these features are used as inputs to train a machine-learning model, which is called model building.
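
To make these two steps concrete, here is a minimal scikit-learn sketch. The file name and column names are hypothetical placeholders for whatever nanomaterials data set you assemble; the point is the shape of the workflow: engineer features, then build a model on them.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("nanoparticles.csv")  # hypothetical data set

# Feature engineering: encode structural/physical attributes as model inputs.
features = ["diameter_nm", "crystal_structure", "youngs_modulus_gpa", "resistivity"]
X = pd.get_dummies(df[features], columns=["crystal_structure"])  # one-hot encode
y = df["target_property"]  # e.g., a measured strength or band gap

# Model building: learn the mapping from features to the property of interest.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0)
model.fit(X_train, y_train)
print("held-out R^2:", r2_score(y_test, model.predict(X_test)))
```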

In the general process of machine learning, feature engineering is critical to guarantee the performance of models, because only the relevant features are meaningful for the construction of prediction models.

The selection of relevant features for predicting nanomaterials characteristics requires domain knowledge in both nanomaterials science and machine learning.

This limits the application of ML in nanomaterials discovery and design.

With recent progress in characterization techniques, big data, computational capabilities, and algorithms, some new features of machine learning in nanomaterials science have appeared that can solve this challenge.

In recent years, deep learning algorithms have made automated feature engineering possible.

Deep learning is a subset of machine learning but their capabilities differ.

ML vs DL

Deep learning models don’t need hand-engineered features.

They find the patterns in a data set and create classifications by themselves.

Unlike traditional machine learning algorithms, deep learning differentiates itself by using a hierarchical cascade of nonlinear functions. The nonlinear function at each level learns to transform its input data into a representation that serves as input for the later layers. So, deep learning algorithms can automatically develop their own set of internal features that are relevant to optimally predicting the desired output.
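
Here is a minimal PyTorch sketch of that idea: a stack of nonlinear layers that learns its own internal representations from raw inputs, with no manual feature selection. The layer sizes are arbitrary placeholders.

```python
import torch
import torch.nn as nn

# A hierarchical cascade of nonlinear functions: each layer re-represents
# its input as learned features for the next layer.
model = nn.Sequential(
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),  # final layer predicts the target property
)

x = torch.randn(16, 128)  # a batch of 16 raw, unengineered descriptors
loss = nn.functional.mse_loss(model(x), torch.randn(16, 1))
loss.backward()  # training gradients shape the internal features automatically
```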

Now that we know how machine learning models work for nanomaterials discovery, let’s look at some examples.

🚨Research Highlight: MIT’s way of finding nanoparticles with high drug-loading capacity🚨

We report the integration of machine learning with high-throughput experimentation to enable the rapid and large-scale identification of such nanoformulations. We anticipate that our platform can accelerate the development of safer and more efficacious nanoformulations with high drug-loading capacities for a wide range of therapeutics. -Reker et al, 2021

A team at MIT is working on building a platform that can propose personalized sets of excipients.

Researchers have employed machine learning to identify pairs of small-molecule drugs and inactive ingredients that will self-assemble into nanoparticles with high therapeutic payloads.

Why?

Nanoparticles can overcome the limitations of small-molecule drugs. Small-molecule drugs can enter cells easily because they have a low molecular weight. Once inside the cells, they can affect other molecules, such as proteins, and may cause cancer cells to die. This is different from drugs with a large molecular weight, which keeps them from getting inside cells easily. Think of nanoparticles as vehicles to get the drugs into the cell.

However, formulating nanoparticles is a complex process.

Endeavors to create nanoparticle formulations have been held back by the inability to predict which mixes of ingredients will have high loading capacities.

That’s why the team at MIT started using machine learning to identify effective nanoformulations.

The project evaluated 788 therapeutic small molecules and 2686 approved excipients. Those building blocks resulted in 2.1 million pairings. From those pairings, the researchers identified 100 self-assembling drug nanoparticles.

How does their model work?
They extracted chemical structures for all drugs in the simplified molecular-input line-entry system (SMILES) format from DrugBank. Data for excipients was extracted from the FDA.

They selected drugs that self-aggregate into colloidal macrostructures as candidates since self-aggregating molecules are more prone to co-aggregation.

Compounds were described and their physicochemical properties were calculated. In addition, short molecular dynamics (MD) simulations were run and automatically analyzed to assess the enthalpic, non-covalent interaction potential between drugs and excipients.

After these, they had 4515-dimensional numerical characterizations of a formulation that served as input for the random forest classification machine learning model.

The random forest is a classification algorithm consisting of many decision trees. It produces multiple decision trees which are blended together for a more precise prediction.

The reasoning behind the model is that multiple uncorrelated individual decision trees perform much better as a group. When using Random Forest for classification, each tree gives a classification or a “vote.” The forest chooses the classification with the majority of the “votes.”
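
Here is a tiny sketch of that voting idea on toy data (my illustration, not the MIT team’s actual pipeline or data set):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy binary classification data standing in for drug-excipient descriptors.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

sample = X[:1]
votes = [int(tree.predict(sample)[0]) for tree in forest.estimators_]
print("votes for class 1:", sum(votes), "out of", len(votes))
print("majority decision:", forest.predict(sample)[0])
```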

The researchers’ code can be found here.

Their model can:

  • evaluate and run a random forest model to predict the self-aggregation propensity of approved drug compounds as candidate compounds for the co-aggregation platform,
  • predict new co-aggregation pairs from screening data,
  • reduce dimensions to showcase the diverse sampling of drugs and excipients for experimental testing.

🚨Research Highlight: The acceleration of carbon-based materials discovery🚨

Scientists at the U.S. Department of Energy’s (DOE) Argonne National Laboratory have recently documented an automated process for identifying and exploring promising new materials by combining machine learning and high-performance computing.

Using the single element carbon, the algorithm predicted the forms in which atoms order themselves under a wide range of temperatures and pressures to make up different substances. From there, it assembled a series of phase diagrams.

A phase diagram is a graphical representation of the physical states of a substance under different conditions of temperature and pressure.

If a material’s atomic structure changes, its electronic, thermal, and mechanical properties change as well. So, a good way to change the atomic structure of a material is to vary the surrounding pressure and temperature.

The study’s algorithm constructed phase diagrams that mapped hundreds of metastable states of carbon.

To understand what a metastable state is, let’s look at an example.

Graphite -a form of carbon- slowly crystallizes into diamond when we apply extreme pressure and heat. After we remove it from those extreme conditions, the diamond endures: it is not exactly in equilibrium, but it is somewhat stable. This is called a metastable state.

By mapping known AND unknown metastable states of carbon, we can find new materials.

Different phase diagrams from the study.

How does it work?

The researchers trained their machine learning model with data they created with simulations. It was produced using Carbon, a high-performance computing cluster at Argonne’s Center for Nanoscale Materials.

The data was used to train deep neural networks (DNNs) to predict the Gibbs free energy.

Gibbs energy, or free enthalpy, is a quantity that measures the maximum amount of work a thermodynamic system can perform when temperature and pressure are kept constant.

A DNN was used to learn the Gibbs free energy of the different phases of carbon. It consists of 8 dense hidden layers, with the ReLU activation function used in all layers except the input and output layers.
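
The exact architecture is in the code linked below; as a rough PyTorch sketch based only on that description (layer widths and inputs are placeholder assumptions):

```python
import torch.nn as nn

# 8 dense hidden layers with ReLU activations; the output layer has no
# activation, matching the description above.
layers = [nn.Linear(3, 64), nn.ReLU()]   # inputs: e.g., temperature, pressure, ...
for _ in range(7):
    layers += [nn.Linear(64, 64), nn.ReLU()]  # 7 more hidden layers (8 total)
layers.append(nn.Linear(64, 1))          # output: predicted Gibbs free energy
gibbs_net = nn.Sequential(*layers)
print(gibbs_net)
```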

Take a look at their code here.

Using the algorithm’s predictions as a guide, the team verified its effectiveness by synthesizing samples and characterizing them using a transmission electron microscope.

The model successfully indicated well-known phase diagrams for carbon, and the generated phase diagrams affirmed some experimental observations.

“Materials synthesis, especially of those with exotic properties, can often take several experimental trials and years of effort. Our machine learning algorithms allow us to identify the synthesis conditions of exotic materials, potentially reducing the time for their experimental realization.” -Subramanian Sankaranarayanan, a lead author of the study

🚨Research Highlight: Predicting Atomic Coordinates of CNTs🚨

Since the discovery of carbon nanotubes (CNTs) in 1991, scientists have been rapidly researching their impressive features.

CNTs’ atomic structures are important, as they influence properties like semiconducting behavior, stiffness, and more.

Carbon nanotubes have different properties when their structure changes. If you want to build a space elevator from a very cheap, stiff, but lightweight material, a CNT seems like the solution. But not every carbon nanotube has the same stiffness, and they are not always easy to synthesize. To find the best match for your needs, using molecular modeling programs is the best option.

Simulation programs like CASTEP or VESTA are used to build CNT models through mathematical calculations. However, they require iterations that can make simulating different CNTs take longer than it needs to.

Researchers now use ANNs (Artificial Neural Networks) to predict the atomic coordinates of carbon nanotubes, so that these models can be used within modeling programs to build up new CNTs in a short time.

This study uses ANNs to reduce the iterations needed in the CASTEP simulation environment for modeling CNTs.

They formed their data set of atomic coordinates for different CNTs, organized by chiral structure, using the CASTEP simulation package.

Then, they used the data set as input for a machine-learning model and predicted the atomic coordinates of CNTs.

By using the predicted atomic coordinates, they reduced the time spent designing carbon nanotubes with simulation software by reducing the number of iterations in the calculation process.
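
A heavily simplified sketch of that setup might look like the following. The network shape, atom count, and input encoding are hypothetical placeholders, not the study’s actual configuration.

```python
import torch
import torch.nn as nn

N_ATOMS = 40  # assumed number of atoms in the modeled CNT segment

# Map chiral indices (n, m) to a flattened list of atomic coordinates.
net = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, N_ATOMS * 3),  # x, y, z for each atom
)

nm = torch.tensor([[10.0, 5.0]])          # a (10, 5) chiral nanotube
coords = net(nm).reshape(-1, N_ATOMS, 3)  # one (atoms x 3) coordinate block
print(coords.shape)                       # torch.Size([1, 40, 3])
```

Once trained on CASTEP-generated coordinates, such a network returns a geometry guess in one forward pass, letting the simulation start closer to the answer and converge in fewer iterations.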

To learn more about this study, you can take a look at this article I wrote replicating the work of Avcı and Acı:

🚨Research Highlight: Toyota and Northwestern’s Data Factory🚨
Toyota Research Institute (TRI) and Northwestern University are collaborating to help accelerate new materials discovery, design, and development with the world’s first nanomaterial data factory.

The challenge they are determined to solve, says Brian Storey, the director of TRI, is meeting the growing global demand for mobility without relying on fossil fuels across the whole transportation sector.

Toyota, an automotive manufacturer, depends on materials for all kinds of things. From fuel cells to converters, being able to find the best materials is the key.

Chad Mirkin, Director of the International Institute for Nanotechnology and Professor of Chemistry at Northwestern, says what they are doing with the collaboration is:

We begin by looking at the vast parameter set possible in terms of materials discovery, collect data from that, and then empower machine learning and AI algorithms to allow us to objectively search the materials genome and find the best materials for the given need.

Let’s go into more detail.

Ideally, the team wants to make nanoparticles that have a lot of available surface for catalyzing reactions while using less of the valuable materials, such as platinum, that go into fuel cells.

A fuel cell is like a battery, but it does not recharge or run down. It produces electricity as long as fuel is supplied.

There are different types of fuel cells but polymer electrolyte membrane fuel cells (PEMFC) are commonly mentioned for transportation purposes.

Fuel cells consist of two electrodes sandwiched around an electrolyte. Fuel (hydrogen) is fed to the negative electrode (anode) and the air is fed to the positive electrode (cathode). In a PEMFC, a catalyst separates hydrogen atoms into protons and electrons, which take different paths to the cathode. The electrons go through an external circuit, creating a flow of electricity. The protons migrate through the electrolyte to the cathode, where they reunite with oxygen and electrons to produce water and heat.

The more active a catalyst is, the more power it will produce for less fuel, which basically means the fuel cell will be more efficient.

Nanoparticles make for good catalysts because they have a high surface area-to-volume ratio.

The smaller the particles, the higher the surface-area-to-volume ratio.
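
For a spherical particle, this is simple arithmetic: surface area over volume is (4πr²) / (4/3 πr³) = 3/r, so the ratio grows as the radius shrinks. A quick check:

```python
# SA/V for a sphere is 3/r: shrinking the radius exposes far more
# surface per unit volume, which is what a catalyst wants.
for r_nm in (50, 10, 2):
    print(f"radius {r_nm:2d} nm -> SA/V = {3 / r_nm:.2f} per nm")
```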

To synthesize the “best” catalysts for the given need, the team uses what they call “nano printers” and “mega libraries.”

Their pen array

They have a nano printer with over 100,000 pyramidal tips. They spray-coat these printers with chemicals to create pen arrays, where each pen has a different but deliberately chosen quantity and composition of chemicals. The mega libraries are collections of varied structures encoded at specific sites on the array.

They use these nano printers to catalyze carbon nanotube synthesis. With their mega libraries, they see which compositions work and which don’t, to decide on the best option. And they accelerate the process by using the data they get from the mega libraries to predict what might work better.

🚨Research Highlight: Meta-Analysis to Predict QDs Toxicity🚨

The toxicity of nanoparticles is a growing conversation as nanoproducts keep spreading into our lives. Oh et al carried out a meta-analysis of the cellular toxicity of cadmium-containing quantum dots.

Meta-analysis means they scraped experimental and simulated data from numerous research papers in the literature and collected it into one large data set.

From 307 publications, they obtained 1,741 cell viability-related data samples, each with 24 qualitative and quantitative attributes describing the material properties and experimental conditions.

They used a random forest regression model to analyze the data and showed that toxicity is closely associated with quantum dot surface properties (including shell, ligand, and surface modifications), diameter, assay type, and exposure time.

Workflow of the study (Oh et al, 2016)

After mining the QD toxicity data with their parameters, the compiled data set was preprocessed to normalize the attributes. Regression models were then developed using the random forest technique for both cell viability and IC50 with the most suitable attributes selected.

Cell viability is defined as the number of healthy cells in a sample. Together with cell proliferation, it is a vital indicator for understanding the mechanisms of action of certain genes, proteins, and pathways involved in cell survival or death after exposure to toxic agents.

The half-maximal inhibitory concentration (IC50) is a measure of the potency of a substance in inhibiting a specific biological function.

Because of the heterogeneity of the data (including both text and numeric values), the study chose random forest regression.

We talked about what a random forest is earlier, but in the context of classification.

This study uses random forest for regression.
For classification tasks, the output of the random forest is the class voted by most trees.

For regression tasks, the mean or average prediction of the individual trees is returned.
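
A tiny demonstration of that averaging on toy data (my illustration, not the study’s QD toxicity data set):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=8, random_state=0)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

sample = X[:1]
tree_preds = [tree.predict(sample)[0] for tree in forest.estimators_]
print("mean of individual trees:", np.mean(tree_preds))
print("forest prediction:       ", forest.predict(sample)[0])  # identical
```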

Results of the study (Oh et al, 2016)

Their random forest regression model found that QD diameter and surface ligand properties affect toxicity the most.

Key Takeaways

Scientific and technical developments are accelerating.

Every day, you can find news of another breakthrough.

But there are problems we cannot solve yet.

From environmental science to quantum computing, many of these problems persist because we don’t have the “right” or the “best” material.

We don’t have the “best” solar panels that can eliminate fossil fuels. We don’t have the “right” quantum computer that can run the algorithms for building protein structures.

This is because of how our process is structured.
Learn -> Try -> Fail does not cut it anymore.

That’s why I introduced you to a revolutionary way to discover and design nanomaterials.

We used to have materials and then apply them to solve problems.
Now, we have problems to solve and we can make materials just how we want them to function.

So, let’s look at some takeaways I want you to get before leaving this article:

  • Nanomaterials are ubiquitous in our lives. From clothing to energy, we are exposed to nanoproducts everywhere.
  • The discovery of new nanomaterials has always been very important. The discovery of both graphene and fullerenes (carbon-based nanomaterials) received Nobel Prizes.
  • We classify nanomaterials by their dimensions. Zero-dimensional nanomaterials have all dimensions under 100 nm, such as quantum dots. One-dimensional nanomaterials have two dimensions under 100 nm, such as carbon nanotubes. Two-dimensional nanomaterials have one dimension under 100 nm, such as graphene.
  • To synthesize a nanomaterial, you can follow two methods: top-down and bottom-up approaches. By dividing bulk materials, we can produce nanostructured materials. This is the top-down method. If you “gather” molecular/atomic-level structures, this is the bottom-up approach.
  • Traditional ways of discovering and designing nanomaterials are a trial-and-error process. This makes them costly, as you need a continuous budget and access to equipment. Traditional ways also require specialized domain knowledge, which makes the process uncollaborative and “slow”.
  • Nanomaterial characteristics are significantly more difficult to predict than those of conventional materials due to quantum effects at such small sizes, which makes machine-learning approaches valuable. ML makes the discovery and design process faster, more affordable, and “easier”. It reduces trial-and-error by giving us predictions of what to synthesize and how.
  • For ML models, you need the data first. According to what you want to study, we can look at different data sets ranging from quantum dots to carbon nanotubes. These data sets are formed by experimental or simulated data.
  • ML is used by applying feature engineering from database to features and model building from features to models.
  • In feature engineering, some attributes are first generated to represent the raw nanomaterials. The features are encoded in the available parameters and used as inputs to train a machine-learning model, which is called model building.
  • A subset of ML, deep learning is also used in nanomaterials discovery and design. Selecting relevant features for machine learning models that predict nanomaterial characteristics requires domain knowledge. Deep learning does not need manual feature selection, which makes it easier to use and more effective when classification needs to be done.
  • Machine learning models are used to meta-analyze the toxicity of quantum dots, identify small-molecule drug nanoparticles with high payloads, and maximize the efficiency of catalysts in fuel cells; deep learning is used to discover unknown metastable states of carbon-based nanomaterials; and more.

The nanomaterials industry will be able to accelerate more every day with developments in both AI and nanotechnology.

And who knows? The industry might even exceed our expectations in terms of the money invested in it.
But what we know is we have A LOT OF problems to solve to make our lives better.

In my opinion, a new way of discovering and designing materials is promising.

What do you think? How can we use AI to discover new materials? Let me know!
