ML and Data Science: Empowering Materials Science

Cutting-Edge Tech, Accelerated Research

Ashwin Shekhar
Mettle, NIT Trichy
5 min read · Jan 3, 2020

Materials are said to govern the limits of what human engineering can achieve. Therefore, it is imperative that the development of newer materials — materials which perform better, last longer and serve niche applications — is accelerated to keep up with the growing demands of society.

Source: ACS

The ‘traditional’ approach in the materials science community has always been rather straightforward. Through the careful study of a material’s microstructure and its existing set of properties, a correlation is drawn between the two, known as a structure-property correlation.

Over years of work, we ascertain these correlations and are hence able to select appropriate materials for different applications. If we want to discover a new material, we subject substrates to some processes in the laboratory, look at the structure formed, use existing correlations to predict its properties, confirm said properties, and evaluate its performance over time.

However, discovering new materials in this fashion is a very slow process. An alternative is to endow an existing material with a different set of properties by changing the way it is processed: steels, for example, are subjected to the familiar heat treatment processes to give them more desirable properties.

This approach is clearly illustrated and explained using what is called the Materials Science Tetrahedron:

Source: Malvern Panalytical

In essence, by drawing the links between processing-structure-properties-performance, through the appropriate characterisation techniques, one can design novel materials for whatever they want — in theory.

As mentioned, this is a very arduous process, rife with experimentation, taking a lot of effort and resources. As with any laboratory technique, there are no guarantees; most materials synthesised in the lab take years to even reach the general public.

So what could we do to make this crucial process faster, more streamlined?

This is where the often-heard terms — machine learning and data science — enter the picture. They aren’t mere buzzwords being thrown around to add credibility to research — they’re being used to revolutionise the way we synthesise new materials.

Given how often these terms come up, and how relevant they are to materials science, it’s probably a good idea to know what they mean and what these areas of study are dedicated to.

Source: Forbes.com

Machine learning is a subset of artificial intelligence in which computer systems learn to perform tasks from data, relying on patterns and inference rather than on explicit, hand-written instructions.

Data science, in a broad sense, aims to extract insight and knowledge from existing data. Of the two, it is the one that leans on domain knowledge to get the best results.

Why are the above relevant, you ask? Well, structure-property correlations are quite complex on their own. Over time, materials science has indeed adopted computational approaches for investigating them, but these are largely limited to first-principles calculations: calculations based on quantum mechanical or thermodynamic principles, which are computationally demanding and rely on approximations that can limit their accuracy.

What if, through the tools of machine learning and eventually data science, we are able to accurately predict properties of materials without actually performing these cumbersome calculations? What if we need not bother ourselves with this complex web of linkages between the many parameters that one can vary while processing a material?

Think about it. There is so much going on in your typical material synthesis process. There are extrinsic factors, such as the temperature at which it is carried out, the cooling rates employed thereafter and the method by which the sample was shaped. Then there are intrinsic factors, such as the lattice parameters of the final product’s crystals, grain size, morphology, density and so on.

Source: CORE Materials

This hypersensitivity to parameters, if you will, makes materials incredibly difficult to work with: every added parameter multiplies the number of combinations to explore, a problem often dubbed the curse of dimensionality. Armed with the tools of machine learning and enough data, however, you can make this problem far more tractable. The machine does much of the job for you, constructing a ‘model’ to fit the data you provide: it finds those correlations, assigns a weight to each parameter and minimises a loss, striving to become more accurate with each iteration.

At the end of it, you have a ‘model’ that can quite comfortably predict the final desired property (it can be anything you want, ranging from the yield strength to the specific capacitance of a material) with a relatively good degree of accuracy.
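To make the idea of "assigning weights and minimising losses" concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data is synthetic, the two inputs (a scaled processing temperature and a scaled cooling rate) and the linear "ground truth" relating them to strength are made up, and real structure-property models are usually far more elaborate than a linear fit trained by gradient descent.

```python
import random


def make_synthetic_data(n, seed=0):
    """Generate made-up ((temperature, cooling_rate), strength) records."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        temp = rng.uniform(0.0, 1.0)     # scaled processing temperature (hypothetical)
        cooling = rng.uniform(0.0, 1.0)  # scaled cooling rate (hypothetical)
        # Assumed ground truth, chosen only for this demo, plus a little noise.
        strength = 2.0 * temp + 3.0 * cooling + 1.0 + rng.gauss(0.0, 0.01)
        rows.append(((temp, cooling), strength))
    return rows


def predict(weights, bias, features):
    """Linear model: weighted sum of the input parameters plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(weights, bias, data):
    """The loss being minimised: average squared prediction error."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in data) / len(data)


def fit(data, learning_rate=0.1, iterations=2000):
    """Batch gradient descent on the mean squared error."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            error = predict(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += 2.0 * error * x[j] / len(data)
            grad_b += 2.0 * error / len(data)
        # Step each weight against its gradient to reduce the loss.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


data = make_synthetic_data(200)
weights, bias = fit(data)
print("learned weights:", weights, "bias:", bias)
print("final loss:", mean_squared_error(weights, bias, data))
```

The learned weights should land close to the values baked into the synthetic data, which is exactly the "correlation" the model was asked to recover. With real measurements there is no known ground truth to compare against, so accuracy has to be judged on data the model has not seen.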

Once this model has been shown to be effective, all you need to do is simulate your way through, trying as many combinations of input parameters as you’d like until you find the one that maximises the desired output.
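"Simulating your way through" can be as simple as a grid search over candidate processing settings. The sketch below assumes a trained model already exists and stands in for it with a made-up formula (`surrogate_strength`). The parameter names, ranges and units are likewise hypothetical.

```python
import itertools


def surrogate_strength(temperature, cooling_rate):
    """Hypothetical stand-in for a trained model's prediction (made-up formula)."""
    return 500.0 - 0.01 * (temperature - 850.0) ** 2 - 2.0 * (cooling_rate - 20.0) ** 2


# Candidate settings to try (assumed ranges, for illustration only).
temperatures = range(700, 1001, 10)  # degrees C
cooling_rates = range(5, 41, 5)      # degrees C per second

# Evaluate the model on every combination and keep the best one.
best_settings = max(
    itertools.product(temperatures, cooling_rates),
    key=lambda settings: surrogate_strength(*settings),
)
best_prediction = surrogate_strength(*best_settings)

print("best (temperature, cooling rate):", best_settings)
print("predicted strength:", best_prediction)
```

A grid like this grows quickly as parameters are added, so smarter search strategies are usually preferred in practice. Either way, the winning settings are only a prediction until an experiment confirms them.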

All of this without even leaving the comforts of your chair. All of this without spending money on hundreds of possibly failed experiments.

Of course, for this rather ambitious plan to work, you need data that is available, open-source and free for all. Most scientific journals do not make this freely available. There is a discernible gap that prevents this promising idea from becoming a reality.

Machine learning fails when the data provided to it is scarce or riddled with pre-existing bias. One needs to clean the data, as it were, to prevent this from occurring. But when there is a dearth of data to begin with, how can this happen?
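As a rough idea of what "cleaning" can involve, here is a small sketch that drops records with missing values, physically implausible values and exact duplicates. The field names, the sample records and the valid ranges are all invented for illustration. Note that this kind of filtering does nothing about bias in how the data was collected, which is a harder problem.

```python
def clean_records(records, required_fields, valid_ranges):
    """Keep records that are complete, within range and not duplicated.

    valid_ranges maps a field name to an inclusive (low, high) pair; its keys
    are assumed to be a subset of required_fields.
    """
    cleaned = []
    seen = set()
    for record in records:
        # Drop records with any missing required value.
        if any(record.get(field) is None for field in required_fields):
            continue
        # Drop records with values outside the plausible range.
        if any(not (low <= record[field] <= high)
               for field, (low, high) in valid_ranges.items()):
            continue
        # Drop exact duplicates of a record already kept.
        key = tuple(record[field] for field in required_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


# Made-up measurements, including the kinds of defects described above.
records = [
    {"temperature_c": 850, "cooling_rate": 20, "yield_strength_mpa": 410},
    {"temperature_c": 850, "cooling_rate": 20, "yield_strength_mpa": 410},    # duplicate
    {"temperature_c": 900, "cooling_rate": None, "yield_strength_mpa": 395},  # missing value
    {"temperature_c": 880, "cooling_rate": 15, "yield_strength_mpa": -5},     # implausible value
    {"temperature_c": 920, "cooling_rate": 10, "yield_strength_mpa": 380},
]
fields = ["temperature_c", "cooling_rate", "yield_strength_mpa"]
ranges = {"yield_strength_mpa": (0, 3000)}  # assumed plausible range

cleaned = clean_records(records, fields, ranges)
print(len(cleaned), "of", len(records), "records kept")
```

Starting from five records and ending with two shows how quickly a small dataset shrinks once it is cleaned, which is the dilemma raised above.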

Efforts are already being made in this regard, making data more democratic and accessible. The Materials Genome Initiative by the US Government, for example, is one such move to help this cause.

As data becomes widely available, this novel approach can truly take off, and with it, a veritable (and exciting!) materials revolution.

The future of science and engineering is multidisciplinary. Embracing the tools offered across disciplines that empower each other is the way forward. Confining ourselves to boxes and labels, restricting ourselves to one subdomain — these are things that could prove disastrous.

Change is on the horizon. Let’s welcome it.

Learn more by reading this comprehensive review.
