Additive Manufacturing Melt Pool Physics Prediction Using Physical Simulation Data

Shiyu Liu
Mar 11, 2020


Shiyu Liu, Cangcheng Tang, Yue Zhuang

Data Science Initiative, Brown University, Providence, RI

Abstract

Additive Manufacturing (AM), widely known as 3D printing, is normally studied through physical simulations based on thermal mathematical models solved with numerical PDE methods. Microstructure simulations, however, can be difficult to scale to the macro level for part-level prediction. In this project, we used machine learning algorithms, specifically boosting algorithms and deep neural networks, to predict temperature values as well as melt pool characteristics. Our results show that, after training on simulation data, such algorithms produce models that predict the expected physics values effectively and efficiently. In addition, we found that feature engineering based on thermal theories and bootstrapping can partly improve the predictions, in particular for less frequent values.

1. Introduction

1.1 General Introduction

Additive Manufacturing (AM) is a relatively new manufacturing process that offers many favorable characteristics not possible with subtractive methods. Part quality, however, cannot be well controlled without full-scale physics simulation. To reduce the complexity of the physics simulation, we aim to provide fast predictions of the melt pool physics, based on simulation data, to enable process planning and control.

This project addresses two problems. In the first, we predict the temperature at a specific point (given by its x, y, z coordinates) from the laser power and laser speed. We also predict the melt pool length, width and depth from the laser power and laser speed.

In the second problem, we make the same predictions but consider three extra variables: laser angle, laser direction and edge distance. Specifically, there are five equidistant laser angles ranging from 10 to 90 degrees, two directions, and edge distances ranging from 0.06 to 1.6. These extra variables make the predictions less stable because they introduce more uncertainty.

1.2 Exploratory Data Analysis

1.2.1 Predicting Temperature without Boundary Effect

For task one, we first examined how temperature correlates with the covariates. As shown in Figure 1, the target variable seems to be negatively correlated with all our features.

Figure 1. Correlation matrix for coordinates, temperature, speed and power. The target “temperature” is negatively correlated with all features.

But this conflicts with our intuition, as temperature should be positively correlated with power. So we held speed fixed and examined the relationship between temperature and power, and the result was just as we expected: the two are positively correlated. Likewise, with power held fixed, temperature and speed are negatively correlated. These relationships are visualized in Figure 2, where each line represents a fixed speed or a fixed power.

Figure 2. Line plot for average temperature VS speed and power. Average temperature is negatively correlated with speed, while positively correlated with power.

This aligns with our intuition: the more power used for heating, the higher the temperature should be, and the faster the laser moves, the less time it spends heating the Ti64 and the lower the temperature should be.
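The sign flip between the pooled and the conditional correlation can be reproduced on a toy example. The sketch below uses invented (power, speed, temperature) rows, not our simulation data, in which higher power was only combined with higher speed; the `pearson` helper and all numbers are illustrative assumptions. The pooled correlation between power and temperature comes out negative, while the correlation within each fixed speed is positive.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy rows (power, speed, temperature); invented values for illustration only.
rows = [
    (100, 1.0, 900), (150, 1.0, 1000), (200, 1.0, 1100),
    (200, 3.0, 500), (250, 3.0, 600), (300, 3.0, 700),
]

# Correlation over all rows, ignoring speed.
pooled = pearson([r[0] for r in rows], [r[2] for r in rows])

# Correlation between power and temperature within each fixed speed.
by_speed = {}
for power, speed, temp in rows:
    by_speed.setdefault(speed, []).append((power, temp))
within = {
    speed: pearson([p for p, _ in group], [t for _, t in group])
    for speed, group in by_speed.items()
}

print("pooled:", round(pooled, 3))
print("within each speed:", {s: round(c, 3) for s, c in within.items()})
```

Grouping by the other process parameter before correlating is the numerical counterpart of reading one line at a time in Figure 2.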

To analyze the relationship between temperature and the x, y, z coordinates, we used a 3D scatter plot. Figure 3 shows that in the top right corner, where X and Y are large and Z is small, the temperature is relatively high, whereas where X and Z are large and Y is small, the temperature is relatively low.

Figure 3. Temperature distribution on the XYZ coordinates. Higher temperatures are clustered at the top right corner.

1.2.2 Predicting Melt Pool Dimensions without Boundary Effect

Similarly, we calculated the correlation matrix for the features in the melt pool dimensions data, shown in Figure 4. Pool Length and Laser Power have similar correlations with the other features, as do Pool Width and Pool Depth.

Figure 4. Correlation matrix for melt pool dimensions and laser speed, laser power. The dimensions are positively correlated with speed and power.

1.2.3 Predicting Temperature with Boundary Effect

The analysis from task 1 yields similar results in task 2, so for this more complicated problem we focus on the additional features. First, we drew box plots of temperature against laser angle and laser direction, as shown in Figure 5. The temperatures in these plots are log-transformed to mitigate the effect of outliers. The plots show that when the laser is moving away from the edge, the Ti64 temperature is generally higher. As for angles, the 10-degree and 50-degree ones have slightly lower temperatures than the others.

Figure 5. Box plot for temperature VS laser direction and angle. Temperatures are higher when lasers are moving away, and angles 30 and 70 have higher temperatures.

Another important feature is edge distance. We drew a scatter plot of the average log temperature conditional on edge distance. Figure 6 shows that the farther the laser is from the edge, the higher the temperature.

Figure 6. Dot plot for average temperature VS edge distance. Lasers farther away from the edge generally have higher temperature.

3. Methods and Model Results

3.1 Method

In traditional approaches to analyzing a dynamic heat transfer process, we usually start from thermal theories to build a mathematical model, then apply numerical PDE methods to set up a simulation, and finally implement the simulation in a programming language.

Additive Manufacturing (AM) process simulations have attracted strong interest in recent years because additively manufactured parts still suffer from tolerance problems, manufacturing defects, and subpar strength and fatigue life, and therefore cannot be used as production parts. Physical simulations provide reliable and cost-effective predictions of quantities such as part distortion, residual stresses and strains, microstructure contents and grain morphology, and can be used effectively to guide product design and the manufacturing process for improved part quality.

However, due to the multi-scale nature of the problem, certain microstructure simulations can be difficult to scale to the macro level for part-level prediction. In this project, we combine data analytics methods with physical simulations and rules to speed up some of the computationally intensive additive manufacturing microstructure simulations, namely the metallurgical phase transformation simulations, by building an ML model on simulation data to predict the melt pool size and temperature field. The ML model can then be combined with physical rules from experimental diagrams to predict the microstructure properties.

Advantages of ML/DL:

  • Large Dataset: We have a large training dataset, from which an ML/DL model can benefit.
  • Reusability: Once a model is built and pretrained, it can be reused in alternative scenarios.
  • Basic Approach: We can use an ML/DL model to gain a general view of how the printer would behave, then improve the printing accuracy with further simulations. This saves time and effort.

Data Leakage Problem

  • The provided training and validation data tend to share the same feature values, while the testing data does not. This can produce a misleadingly high prediction score on the validation set. We therefore carefully created our own training and validation data from the original dataset.
  • To avoid the impact of possible data leakage, we prefer customized parameter tuning over GridSearchCV for this problem.
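One way to build a leakage-free split, sketched here under our own assumptions, is to hold out whole process settings so that no combination of feature values appears in both training and validation. The function name `split_by_setting`, the row layout and the 20% validation fraction are hypothetical; this is not necessarily the exact procedure used in this project.

```python
import random

def split_by_setting(rows, key, val_fraction=0.2, seed=0):
    """Hold out whole process settings so none appears in both splits."""
    settings = sorted({key(r) for r in rows})
    rng = random.Random(seed)
    rng.shuffle(settings)
    n_val = max(1, round(len(settings) * val_fraction))
    val_settings = set(settings[:n_val])
    train = [r for r in rows if key(r) not in val_settings]
    val = [r for r in rows if key(r) in val_settings]
    return train, val

# Hypothetical rows: (power, speed, x, temperature); placeholder values only.
rows = [
    (power, speed, x, 0.0)
    for power in (100, 150, 200)
    for speed in (1.0, 2.0)
    for x in range(5)
]

# Group by the (power, speed) pair so a setting is never split across sets.
train, val = split_by_setting(rows, key=lambda r: (r[0], r[1]))

train_settings = {(r[0], r[1]) for r in train}
val_settings = {(r[0], r[1]) for r in val}
print(len(train), len(val), train_settings & val_settings)
```

A random row-level split, by contrast, would place points from the same simulation run on both sides, which is the situation described in the first bullet.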

3.2 Models

In Case 1, no edges are involved in the printing process, so the heating is quite predictable given the fixed path the laser has traveled. We expect a model with high accuracy.

3.2.1 Model for Temperature Prediction without Boundary Effect

In model 1, we aim to predict the temperature of a point given the Laser’s power and path as well as the coordinates of the point.

Baseline model: Linear Regression gets an R2 score of 0.384.
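All scores reported in this section are R2 values (coefficient of determination). As a reminder of what the number means, here is a minimal pure-Python sketch of the standard definition; the helper name and the toy values are illustrative only. A perfect prediction scores 1, and always predicting the mean of the targets scores 0.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy targets, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2_score(y_true, y_true)        # predictions equal the targets
mean_only = r2_score(y_true, [2.5] * 4)   # always predict the mean

print(perfect, mean_only)
```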

Neural Network: We tuned and trained a DNN to predict temperature. The architecture consists of multiple dense layers with 2, 4, 8 and 16 neurons, with ReLU activation and he_normal initialization. The R2 reached .9936 for temperature prediction. However, tuning the hyperparameters and structure took too long, so we decided to move on to tree-based models.

Tree Model: We applied both a basic random forest model and a gradient boosting (CatBoost) model to the data set. The basic random forest gets an R2 score of 0.993.

For the gradient boosting algorithm, we used all the data for training and parameter tuning, and the R2 reached .9990 on the validation dataset. Gradient boosting performs best among the three machine learning algorithms. Figure 7 compares the predicted values with the real values.

Figure 7. Predicted temperature versus real temperature for model 1 task 1. Real values and predicted values are very close.

3.2.2 Model for Melt Pool Dimensions without Boundary Effect

In model 2, we predict the size of the melt pool. Because of limitations in the coordinates of the data used in model 1, we cannot simply apply a melting-temperature threshold to the predicted temperatures, so we have to use other methods.

Baseline Model: Linear Regression gets R2 scores for (length, width, depth) = (0.999, 0.965, 0.964).

Neural Network: We tuned and trained a DNN to predict length, width and depth at the same time. The architecture consists of multiple dense layers with 8, 9, 10, 16, 16, 32, 64 and 128 neurons, with ReLU activation and he_normal initialization. The R2 reached .9995, .9987 and .9978 for melt length, width and depth, respectively.

Tree Model: We applied both a basic random forest model and a gradient boosting (CatBoost) model to the data set. Random forest gets R2 scores for (length, width, depth) = (0.999, 0.994, 0.992).

For the gradient boosting algorithm, we used all the data for training and parameter tuning, and the R2 reached .9993, .9999 and .9998 for melt length, width and depth, respectively, on the validation dataset. As shown in Figure 8, gradient boosting performs best overall among the three machine learning algorithms.

Figure 8. Melt length, width and depth prediction versus real values. Real values and predicted values are very close for length, width and depth.

In Case 2, when the laser is near the edge of the material, heat conduction depends not only on the features from Case 1 but also on edge information. We expect that a simple model may not work well here, and that feature engineering and parameter tuning will be needed.

3.2.3 Model for Temperature Prediction with Boundary Effect

Similar to model 1 in Case 1, we predict the temperature of a point. Based on what we learned in Case 1, we focus on tree models here.

For the gradient boosting algorithm, we used part of the data for training and parameter tuning. Specifically, we randomly chose 200 csv files for training and for testing, because training one CatBoost model on the full data takes a large amount of time. We tried a few hyperparameter settings and selected the one with the highest validation R2. On the validation dataset, the R2 reached .9949. The gradient boosting algorithm performs best among the three machine learning algorithms. The predicted values are plotted against the real values in Figure 9.

Figure 9. Predicted temperature versus real temperature for model 1 task 2. Prediction is accurate when real values are small, but lower when real values are larger than 0.5.

Because predictions were slightly poor at larger temperature values, we used a bootstrapping method to obtain more samples at high temperatures. This improved the predictions for observations with high real temperatures but resulted in a slightly lower R2 (.9992). There is therefore a tradeoff between high overall prediction accuracy and better prediction for high-temperature observations: the head is a bit off in Figure 10, while the tail is off in Figure 9.
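The resampling step can be sketched as follows. This is a minimal illustration under assumptions: the helper `oversample_high`, the 0.5 threshold (taken from where the predictions in Figure 9 start to deviate) and the number of extra samples are hypothetical, and the rows are placeholders rather than simulation data.

```python
import random

def oversample_high(rows, target, threshold, n_extra, seed=0):
    """Return rows plus n_extra rows drawn with replacement
    from the rows whose target value exceeds threshold."""
    rng = random.Random(seed)
    high = [r for r in rows if target(r) > threshold]
    return rows + rng.choices(high, k=n_extra)

# Placeholder rows: (feature, normalized temperature).
rows = [(i, i / 100) for i in range(100)]

boot = oversample_high(rows, target=lambda r: r[1], threshold=0.5, n_extra=50)

n_high_before = sum(1 for r in rows if r[1] > 0.5)
n_high_after = sum(1 for r in boot if r[1] > 0.5)
print(len(rows), len(boot), n_high_before, n_high_after)
```

Because the extra rows are duplicates of existing high-temperature rows, the model sees them more often during training, which shifts accuracy toward the tail at some cost to the more common low values.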

Figure 10. Predicted temperature versus real temperature with bootstrapped training data. The prediction is more accurate at the high end, but the deviation is larger for smaller values.

3.2.4 Model for Melt Pool Dimensions with Boundary Effect

Similar to model 2 in Case 1, we predict the size of the melt pool. Based on what we learned in Case 1, we again focus on tree models. For the gradient boosting algorithm, we used all the data for training and parameter tuning, and the R2 reached .9928, .9845 and .9986 for melt length, width and depth, respectively, on the validation dataset. Gradient boosting continues to perform well: as shown in Figure 11, the dots are very close to the diagonal line.

Figure 11. Melt length, width and depth prediction versus real values for model 2 task 2. Real values and predicted values are very close for length, width and depth.

3.3 Model Enhancement: Applying PDE Finite Element Analysis and Thermal Theories to do Feature Engineering.

3.3.1 PDE and Thermal Theories

Basic formulas for PDE and numerical PDE:

Where F(t, x, y, z) denotes the heat source in this dynamic system (which in our model is a moving point).
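For reference, the heat equation with a source term is commonly written as below. The symbols u (temperature field) and \alpha (thermal diffusivity) are notation assumed for this sketch; only F is named in the text.

```latex
\frac{\partial u}{\partial t}
  = \alpha \left(
      \frac{\partial^2 u}{\partial x^2}
    + \frac{\partial^2 u}{\partial y^2}
    + \frac{\partial^2 u}{\partial z^2}
    \right)
  + F(t, x, y, z)
```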

Using Finite Element Analysis and Finite Difference Methods, we arrive at an equation for the most general one-dimensional case without a heat source,

Where u(j, n) denotes the temperature of point j at time step n.
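For reference, one common explicit finite-difference scheme (forward in time, centered in space) for the one-dimensional heat equation without a source reads as below. The ratio r and the symbols \alpha, \Delta t and \Delta x are assumptions of this sketch; the exact scheme used in the simulations is not specified here.

```latex
u(j, n+1) = u(j, n) + r \left[ u(j+1, n) - 2\,u(j, n) + u(j-1, n) \right],
\qquad r = \frac{\alpha \, \Delta t}{(\Delta x)^2}
```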

The insights we gained from these equations are that (i) the temperature of a point is related to the heat it gains from the source, and (ii) it is related to the temperature of its neighbors.
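Insight (ii) can be seen in a single explicit finite-difference update of the one-dimensional heat equation without a source. The sketch below is an illustration under assumptions, not the simulation code behind our data: `step` is a hypothetical helper, r stands for diffusivity times the time step divided by the squared grid spacing, and the two end points are simply held fixed.

```python
def step(u, r):
    """One explicit update of the 1D heat equation without a source.

    u is a list of temperatures on a uniform grid and r is
    alpha * dt / dx**2 (the scheme is stable for r <= 0.5).
    The two end points are held fixed.
    """
    new = u[:]
    for j in range(1, len(u) - 1):
        # Each point moves toward the average of its two neighbors.
        new[j] = u[j] + r * (u[j + 1] - 2 * u[j] + u[j - 1])
    return new

# A single hot point in the middle of a cold bar.
u0 = [0.0, 0.0, 1.0, 0.0, 0.0]
u1 = step(u0, r=0.25)
print(u1)
```

After one step the hot point has cooled and its two neighbors have warmed, which is why distance-based features are natural candidates for the models.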

Insights for Case 2 with edges (what affects the temperature):

  • At the borders, heat transfer does not follow the general case, since air is not a good material for heat transfer. Intuitively, when the heat reaches the border, it cannot spread into the air as easily as it spreads inside the printing material, which makes the border temperature increase.

  • The direction of the laser matters in this case: when the laser comes back, the material is heated for a second time, and the heat already stored inside helps increase the temperature.

3.3.2 Feature Engineering

Based on the insights from PDE and thermal theories, we expect the following features to help improve the models' predictions.

  • Inverse of the distance to the laser point
  • Inverse of the distance to the edge
  • A measure of heat that cannot easily spread out (accumulated by volume). Roughly: accumulated volume = edge_dis³/sin(alpha)
  • The area of the border (accumulated by surface area). Roughly: accumulated surface = edge_dis²/sin(alpha)
  • The inverse of sin(alpha) and the distance
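The features listed above can be computed with a few lines of code. The sketch below is a minimal illustration under assumptions: the function and key names are hypothetical, `laser_dis` stands for the distance from the point to the laser, the angle is taken in degrees (matching the 10 to 90 degree range of the data), and a small `eps` is added to the distances to avoid division by zero.

```python
import math

def engineered_features(edge_dis, alpha_deg, laser_dis, eps=1e-9):
    """Candidate physics-motivated features for one observation.

    edge_dis  : distance to the edge
    alpha_deg : laser angle in degrees (assumed between 10 and 90)
    laser_dis : distance to the laser point
    """
    sin_a = math.sin(math.radians(alpha_deg))
    return {
        "inv_laser_dis": 1.0 / (laser_dis + eps),
        "inv_edge_dis": 1.0 / (edge_dis + eps),
        "acc_volume": edge_dis ** 3 / sin_a,   # rough volume estimate
        "acc_surface": edge_dis ** 2 / sin_a,  # rough surface estimate
        "inv_sin": 1.0 / sin_a,
    }

# Example with placeholder values.
feats = engineered_features(edge_dis=0.5, alpha_deg=90, laser_dis=0.2)
for name, value in feats.items():
    print(name, round(value, 4))
```

The resulting dictionary can be appended as extra columns to the original features before training a tree model.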

Here we used a subset of the training data and a simple random forest to test the usefulness of the generated features.

We found that Volume, Surface, Distance and Inverse sin (even roughly estimated) are four important features for this model that bring a 6.9% increase in the R2 score over the original model, a great improvement given that the original model already had an R2 of 0.868. We also believe that generating more informative features based on an understanding of the physics and mathematics would be quite helpful. This should also work for model 2 in Case 2.

3.3.3 Enhanced Model

When predicting the test data for Case 2 model 2, we found that data leakage made our validation score much higher than it should be. To avoid this, we generated our own validation data with a complete split (see the data leakage discussion in Section 3.1). The predictions on this validation data turned out to be unsatisfactory, with R2 scores of (.900, .878, .958) for length, width and depth. The predictions of the second model, shown in Figure 13, are much more accurate and stable than those of the first model, shown in Figure 12.

Figure 12. Melt length, width and depth prediction versus real values for model 2 task 2, after accounting for data leakage and before feature engineering. When it encounters different angles, the model is biased and the prediction error is very large.

By introducing the generated features into our dataset, we greatly improved our prediction of the melt pool size, reaching R2 scores of (.968, .932, .968).

Figure 13. Melt length, width and depth prediction versus real values for model 2 task 2, after accounting for data leakage and after feature engineering. The bias problem is alleviated to a great extent, and the prediction error is much smaller.
