Data mining uncovers hidden interactions

When debugging a complex manufacturing process, machine learning and data mining can uncover hidden causal relationships, dramatically shortening the search for root causes.

Can you determine what is causing a problem in a complex manufacturing process without attempting to understand the process itself?

If you are a process engineer, this question probably sounds nonsensical. How are you supposed to solve a problem if you’re not attempting to understand the process that causes it?

The reality, though, is that some fabrication processes are becoming too complex for mere mortals to grasp both in their entirety and in sufficient detail.

It can take up to three months and hundreds of process steps to manufacture a modern semiconductor chip. These physical and chemical processes have to stay within narrow bands — tolerances are measured in units that start with ‘micro’ or ‘nano’.

Such process complexity forces engineers to specialize, which makes root cause analysis challenging and time-consuming if an issue is caused by non-obvious interactions between different process steps.

With the successes that machine learning and data mining techniques have produced in other fields (for example in correlating diseases to the presence of certain genes), a joint team from ASML and STMicroelectronics set out to explore whether those techniques could also be applied to the manufacturing process of chips (also called “integrated circuits”, or ICs).

For the study, the team picked excursions in overlay — a key process parameter that measures how well one layer of electrical circuitry on a chip is aligned, and thus connected, to another layer. In modern semiconductor manufacturing, overlay tolerances are extraordinarily tight. A few nanometers make the difference between a working chip and scrap.

Illustration of a chip design highlighting two key process parameters, overlay and critical dimension (CD).

ASML’s data science team faced two major challenges.

“IC manufacturing processes are already very stable,” said Manuel Giollo, a data scientist at ASML. “The data points are mostly identical, except for a few outliers. In machine learning, this is called ‘data imbalance’. Or, to put it more simply: This is a ‘needle in the haystack’ problem.”

The second challenge is missing information.

In semiconductor manufacturing there is a wealth of process parameters that can, in principle, be measured. However, taking those measurements is not free. In most cases, it requires separate equipment and adds time to the production process. IC manufacturers are therefore looking to measure enough to be able to safeguard the quality of their products, but not so much that it hurts the bottom line or the output of the factory.

Imbalanced data sets, missing values

As a result, dense sampling is usually confined to the early stages of setting up a production process or to calibrating new equipment. Once a process is qualified for manufacturing, process engineers assume that the equipment shows only limited variation, and they reduce the number of measurements.

In the sample of 900 wafers that ASML and STMicroelectronics used for their study, only about 25% of the 135,000 possible measurements were actually taken due to those constraints.

“Dealing with imbalanced data sets with a large number of missing values is a tough challenge, which requires robust statistical techniques,” Giollo said.
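One simple way to cope with that sparsity before any modelling is imputation. The NumPy sketch below uses synthetic data whose ~25% coverage mirrors the figure quoted above; per-parameter mean imputation stands in for the more robust statistical techniques the team would actually need, and all values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the study's data: 900 wafers, 150 process parameters.
X = rng.standard_normal((900, 150))

# Mimic the study's sparsity: keep roughly 25% of measurements, drop the rest.
missing = rng.random(X.shape) > 0.25
X[missing] = np.nan

# Simple per-parameter mean imputation over the measurements that do exist.
col_means = np.nanmean(X, axis=0)
X_filled = np.where(np.isnan(X), col_means[None, :], X)
```

In practice, mean imputation can wash out exactly the rare outliers that matter in an imbalanced data set, which is why more robust schemes would be preferred; the sketch only shows the mechanics.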

Even with the deck stacked against them, the machine learning and data mining techniques were able to produce insights.

A relatively simple (regularized) linear model was able to identify 10 process parameters out of 150 that were the most likely candidates to explain the overlay variation that was observed. When plotting the values of the two most important measures against each other, a clear pattern emerged.
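The article does not say which regularized model was used, so as an illustration the sketch below fits closed-form ridge regression (one common regularized linear model) to synthetic data and ranks the 150 parameters by absolute coefficient. The two “true” driver parameters (indices 3 and 47), their weights, and all other values are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_wafers, n_params = 900, 150

# Synthetic stand-in: 900 wafers, 150 candidate process parameters.
X = rng.standard_normal((n_wafers, n_params))
# Hypothetical ground truth: overlay is driven by parameters 3 and 47.
y = 2.0 * X[:, 3] - 1.5 * X[:, 47] + 0.3 * rng.standard_normal(n_wafers)

# Ridge regression (a regularized linear model) in closed form:
# w = (X'X + lam*I)^-1 X'y
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(n_params), X.T @ y)

# Rank parameters by absolute coefficient; keep the top 10 candidates.
top10 = np.argsort(-np.abs(w))[:10]
```

On data like this, the two planted drivers end up at the top of the ranking, which is the shape of result the study reports: a shortlist of likely culprits rather than a full prediction.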

“Without any knowledge of what’s happening in the fab, just by looking at the data, you can identify two high-risk regions. Certain combinations of two parameters appear to push overlay out of spec,” said Giollo.

By plotting the values of the two most important measures, two high-risk regions can be identified

“Now, what’s crucial here is that it isn’t obvious that variations of those two parameters would cause an overlay issue. The model has uncovered a hidden causal relationship. This is exactly what you want — for the model to point you in a new direction, not tell you something you already know.”
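Once the two key parameters are known, checking for the high-risk regions can be as simple as a pair of rectangular bounds in their plane. The thresholds below are purely hypothetical, since the study’s actual boundaries are not public:

```python
# Hypothetical high-risk regions in the plane of the two top-ranked
# parameters: (lo, hi) bounds per parameter. Illustrative values only.
HIGH_RISK_REGIONS = [
    {"p1": (0.8, float("inf")), "p2": (0.8, float("inf"))},      # both high
    {"p1": (float("-inf"), -0.8), "p2": (float("-inf"), -0.8)},  # both low
]

def is_high_risk(p1, p2):
    """Return True if the (p1, p2) combination falls in a flagged region."""
    for region in HIGH_RISK_REGIONS:
        lo1, hi1 = region["p1"]
        lo2, hi2 = region["p2"]
        if lo1 <= p1 <= hi1 and lo2 <= p2 <= hi2:
            return True
    return False
```

The point of the check is exactly what the quote describes: neither parameter is out of bounds on its own, only certain combinations of the two are.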

For process engineers, machine learning techniques can thus be another tool in the toolbox to speed up diagnostics and identify the process steps that cause excursions.

“Of course, this is not a complete root cause analysis. But knowing where to look is already a big step towards solving an issue. In this case, physical simulations confirmed that variations of the two selected parameters, in combination with a third, caused the overlay excursion,” said Richard van Haren, ASML’s on-product overlay architect and project leader for the cooperation with STMicroelectronics.

As a next step, process engineers could work out how to optimize the settings of their manufacturing equipment to avoid those high-risk areas. They could also decide to monitor those parameters more closely and automatically flag wafers for rework if the model predicts an overlay issue.
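Such an automatic rework flag can be a one-line rule on top of the model’s prediction. The spec limit below is a made-up number for illustration only:

```python
OVERLAY_SPEC_NM = 5.0  # hypothetical overlay spec limit, in nanometers

def flag_for_rework(predicted_overlay_nm):
    """Flag a wafer for rework when the predicted overlay exceeds spec."""
    return abs(predicted_overlay_nm) > OVERLAY_SPEC_NM

# Predicted overlay for four wafers (invented values).
flags = [flag_for_rework(v) for v in (1.2, -6.3, 4.9, 7.1)]
```

Flagging on a prediction rather than a measurement is what saves time here: the wafer can be pulled for rework before the expensive metrology step confirms the excursion.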

Nevertheless, the approach taken in this study showed limitations.

“Our simple linear model was able to effectively rank root causes, but is not sufficient for accurate prediction — for that, you would have to use non-linear functions. This could be an area to look into further in the future,” Giollo said.

“More challenging, and perhaps more interesting, is the question of whether you can generalize. If you are changing the process, to make a new product for example, does the model still hold? Can the model built from the data of one layer be used for another layer? That would of course be even more useful for process engineers, since they wouldn’t first have to collect fresh data for every change that they make.”


ASML’s applied data science team consists of seven data scientists with backgrounds in computer science, applied statistics and machine learning. The team uses analysis and modelling to develop methods and software to optimize the performance of ASML’s lithography systems and closely match the performance of different systems to each other. They also assist ASML’s customers with root cause analysis.

The team is planning to expand, so if you’re interested, keep an eye on the job postings on ASML.com.

Further information on the study is available in a paper published at the EMLC conference. Other publications of the team include lithography data factor analysis and visualization.