Simulating Complex Physics with Graph Networks: Step by Step
By Haochen Shi, Peng Chen, Shiyu Li as part of the Stanford CS224W course project
Overview
Many important tasks require high-resolution simulations of complex physics. For instance, researchers working on model-based planning would like to know how objects with complex physics, such as deformable objects, will deform after a change in the environment (e.g., being pinched by a robot arm’s gripper, as shown in this work [3]). Also, the movie industry is eager to generate realistic special effects, such as massive explosions, to produce more eye-catching scenes.
A promising way to generate simulations of complex physics is to train a GNN (Graph Neural Network) that learns how to simulate. In this tutorial, we will explain why this method is well suited to the task and give a step-by-step guide on building a GNN that can simulate complex physics. This tutorial is based on the paper Learning to Simulate Complex Physics with Graph Networks by DeepMind [1].
Why not a traditional physics simulator?
Although traditional physics simulators are powerful, they have two important drawbacks: 1) obtaining high-quality results for large-scale simulations is expensive and time-consuming; 2) setting up a physics simulator requires full knowledge of the physical parameters of the object and the environment, which are extremely hard to obtain in some cases.
To address these problems, many researchers have turned to machine learning methods for simulating complex physics. These models can learn the complex physics of objects directly from visual input, without additional information from humans, and make accurate and efficient predictions of the objects’ future states.
Why is GNN a good choice for this task?
Input to a GNN
Graph Neural Networks (GNNs) are a family of neural networks that take less structured data, such as graphs, as input, whereas other neural networks, such as Convolutional Neural Networks (CNNs) and Transformers, can only accept more structured data (e.g., grids and sequences). By “less structured”, we mean that the input can have arbitrary shapes and sizes and can encode complex topological relations.
In particle-based physics simulation, the input is the unstructured position information of all the particles, which inspires the idea of using a GNN.
Permutation equivariance
One key characteristic of GNNs that distinguishes them from other neural networks is permutation equivariance. That is to say, the nodes in a graph have no canonical order, so how we “order” the nodes in a graph does not affect the results produced by a GNN.
Since the particles of an object are “identical” in a particle-based simulation, the physical laws acting on them do not depend on any particular ordering of the particles. Therefore, a permutation-equivariant model such as a GNN is well suited to simulating the interactions between particles.
The Structure of GNN
The structure of a GNN layer also contributes to our decision to model the physics with a GNN. The key steps in each layer of a GNN are: 1) computing a message for each edge from the features of the edge and its two endpoint nodes; 2) aggregating the messages arriving at each node from its neighbors; 3) updating each node’s features from its old features and the aggregated messages.
Intuitively, a particle-based physics model maps to the above structure of GNN perfectly: 1) nodes represent physical states of particles; 2) message-passing through edges resembles pairwise interactions between particles; 3) neighbor aggregation simulates the total effects of all the interactions on a particle from its neighbors; 4) the update step updates each particle’s position after calculating all the interactions.
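As a rough sketch in generic notation (the functions φ_msg and φ_upd stand for learnable MLPs; this is our shorthand, not the paper’s exact formulation), one GNN layer computes:

$$ m_{ij} = \phi_{\text{msg}}\big(h_i, h_j, e_{ij}\big), \qquad \bar{m}_i = \sum_{j \in \mathcal{N}(i)} m_{ij}, \qquad h_i' = \phi_{\text{upd}}\big(h_i, \bar{m}_i\big) $$

where $h_i$ is the feature of node $i$, $e_{ij}$ the feature of edge $(i, j)$, and $\mathcal{N}(i)$ the neighbors of node $i$.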
Dataset
The dataset can be found here. It contains trajectories of particles made of materials with different physical properties (sand, goop, and water) in a variety of scenarios (e.g., shaking and dropping). Our task is to predict the trajectories of particles given their initial conditions and the global properties of the system.
We select the WaterDrop dataset for the purpose of this tutorial.
Task Settings
The input vector for each particle consists of the particle’s current position, its velocities over the previous 5 steps, features that represent its physical parameters, and, if applicable, global properties of the system (e.g., an external force). The target of the GNN is to predict the acceleration of each particle. With the accelerations predicted by the GNN, an Euler integrator updates the velocities and positions of the particles.
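As a rough sketch of how the per-particle input could be assembled (the function name and feature layout are our own illustration; the current position is used later for building edges rather than appearing directly in the node features):

```python
import torch

def build_node_features(velocity_history, particle_type_emb, global_feat=None):
    """Assemble the per-particle input vector (illustrative layout, not the paper's code).

    velocity_history:  [N, 5, 2]  velocities over the previous 5 time steps
    particle_type_emb: [N, D]     learned embedding of the particle's material type
    global_feat:       [G]        optional global properties (e.g., external force)
    """
    num_particles = velocity_history.size(0)
    feats = [velocity_history.reshape(num_particles, -1), particle_type_emb]
    if global_feat is not None:
        # Broadcast the global properties to every particle.
        feats.append(global_feat.unsqueeze(0).expand(num_particles, -1))
    return torch.cat(feats, dim=-1)
```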
Loss and Metric
Both the loss function and the final metric are defined by Mean Squared Error (MSE). The GNN outputs accelerations of particles, so the loss function is the MSE of accelerations, i.e.,
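Written out in our own notation, with $\hat{a}_i$ the predicted acceleration and $a_i$ the ground-truth acceleration of particle $i$:

$$ \mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert \hat{a}_i - a_i \right\rVert^2 $$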
However, the learned simulator, which is composed of the GNN and the Euler integrator, outputs the positions of particles instead of accelerations. Thus, the metric of the learned simulator is defined as MSE of positions as follows:
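In the same notation, with $\hat{p}_i^{\,t}$ the simulated position and $p_i^{\,t}$ the ground-truth position of particle $i$ at step $t$ over a rollout of $T$ steps:

$$ \mathrm{MSE}_{\text{rollout}} = \frac{1}{NT} \sum_{t=1}^{T} \sum_{i=1}^{N} \left\lVert \hat{p}_i^{\,t} - p_i^{\,t} \right\rVert^2 $$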
The Structure of GNS
The model in this tutorial is the Graph Network-based Simulator (GNS) proposed by DeepMind [1]. In GNS, nodes are particles and edges correspond to interactions between particles. GNS consists of three parts: an encoder, a processor, and a decoder. To illustrate the structure of GNS more clearly, we refactored DeepMind’s open-source TensorFlow implementation into a more concise PyG version.
The Encoder
The encoder preprocesses the data for the GNN. It adds an edge between any two particles whose distance is below a threshold R. The edge feature is the relative displacement between the two particles. Before feeding these features into the GNN, the encoder passes the node features and edge features through separate MLPs.
The node encoder masks out the absolute positions of particles and encodes each particle’s velocities and physical parameters into node features. The edge encoder encodes the relative displacement between a particle and each of its neighbors into edge features.
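A minimal PyG-style sketch of the encoder, assuming a connectivity radius R and the radius_graph helper from PyG (the MLP names and the inclusion of the displacement norm are our assumptions):

```python
import torch
from torch_geometric.nn import radius_graph

def encode(positions, node_inputs, node_mlp, edge_mlp, R, batch=None):
    """Build the interaction graph and encode node/edge features.

    positions:   [N, 2] absolute particle positions (used only to build edges)
    node_inputs: [N, F] velocities + particle-type features (no absolute positions)
    node_mlp / edge_mlp: small MLPs mapping raw features to the hidden size
    R: connectivity radius
    """
    # Connect every pair of particles whose distance is below R.
    edge_index = radius_graph(positions, r=R, batch=batch, loop=False)
    senders, receivers = edge_index

    # Edge feature: relative displacement (and its norm) between the two endpoints.
    rel_disp = positions[senders] - positions[receivers]
    rel_dist = rel_disp.norm(dim=-1, keepdim=True)
    edge_inputs = torch.cat([rel_disp, rel_dist], dim=-1)

    node_emb = node_mlp(node_inputs)   # absolute positions are masked out here
    edge_emb = edge_mlp(edge_inputs)
    return node_emb, edge_emb, edge_index
```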
The Processor
The processor is a GNN with a stack of identical Interaction Network (IN) [2] layers. In each IN layer, both nodes and edges have hidden representations.
GNN layers such as an IN layer can be easily implemented in PyTorch Geometric (PyG). In PyG, a GNN layer is generally implemented as a subclass of the MessagePassing class. We follow this convention and define the InteractionNetwork class, which performs the following steps in each layer (a code sketch follows the list):
1) Construct a message for each edge of the graph. The message is generated by concatenating the features of the edge’s two endpoint nodes with the feature of the edge itself and transforming the concatenated vector with an MLP.
2) Aggregate (sum up) the messages of all the incoming edges for each node.
3) Update node features and edge features. Each edge’s new feature is the sum of its old feature and the message on the edge. Each node’s new feature is determined by its old feature and the aggregation of messages.
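Following these steps, a minimal sketch of the layer might look like the following (the MLP sizes, the residual connections, and the helper names are our choices rather than the original implementation’s):

```python
import torch
from torch import nn
from torch_geometric.nn import MessagePassing

def build_mlp(in_dim, hidden_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                         nn.Linear(hidden_dim, out_dim), nn.LayerNorm(out_dim))

class InteractionNetwork(MessagePassing):
    """One IN layer: messages on edges, sum aggregation, residual updates."""

    def __init__(self, hidden_size):
        super().__init__(aggr='add')  # step 2: sum messages over incoming edges
        self.edge_fn = build_mlp(3 * hidden_size, hidden_size, hidden_size)
        self.node_fn = build_mlp(2 * hidden_size, hidden_size, hidden_size)

    def forward(self, x, edge_index, edge_attr):
        aggr = self.propagate(edge_index, x=x, edge_attr=edge_attr)
        # Step 3: residual updates for both node and edge features.
        new_x = x + self.node_fn(torch.cat([x, aggr], dim=-1))
        new_edge_attr = edge_attr + self._cached_msg
        return new_x, new_edge_attr

    def message(self, x_i, x_j, edge_attr):
        # Step 1: message = MLP([receiver feature, sender feature, edge feature]).
        self._cached_msg = self.edge_fn(torch.cat([x_i, x_j, edge_attr], dim=-1))
        return self._cached_msg
```

Caching the per-edge message inside message() lets the layer return updated edge features alongside the aggregated node update without running the edge MLP twice.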
The Decoder
The decoder uses an MLP to decode the accelerations of particles from the node embeddings produced by the processor. It can also be regarded as a postprocessor of the GNN. With the decoded accelerations, an Euler integrator updates the velocities and positions of the particles.
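A minimal sketch of the integration step, assuming a fixed time step dt (the constant and function name are ours; in practice the step size can be folded into the data normalization):

```python
def euler_update(position, velocity, acceleration, dt=1.0):
    """Semi-implicit Euler step: update the velocity first, then the position."""
    new_velocity = velocity + dt * acceleration
    new_position = position + dt * new_velocity
    return new_position, new_velocity
```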
Putting Everything Together
Let’s put the encoder, the processor, and the decoder together! Before the GNN layers, the input features are transformed by MLPs, which improves the expressiveness of the model without adding more GNN layers. After the GNN layers, the final outputs (accelerations of particles, in our case) are extracted from the features produced by the GNN layers to meet the requirements of the task.
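Putting the three parts into one forward pass might look roughly like this, reusing the build_mlp helper and InteractionNetwork layer sketched above (layer sizes and names are our assumptions):

```python
from torch import nn

class GNS(nn.Module):
    def __init__(self, node_in, edge_in, hidden_size, num_layers, dim=2):
        super().__init__()
        # Encoder: MLPs lift raw node/edge features into the hidden space.
        self.node_encoder = build_mlp(node_in, hidden_size, hidden_size)
        self.edge_encoder = build_mlp(edge_in, hidden_size, hidden_size)
        # Processor: a stack of identical IN layers.
        self.processor = nn.ModuleList(
            [InteractionNetwork(hidden_size) for _ in range(num_layers)])
        # Decoder: reads out per-particle accelerations (no LayerNorm on the output).
        self.decoder = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.ReLU(),
                                     nn.Linear(hidden_size, dim))

    def forward(self, node_inputs, edge_inputs, edge_index):
        x = self.node_encoder(node_inputs)
        edge_attr = self.edge_encoder(edge_inputs)
        for layer in self.processor:
            x, edge_attr = layer(x, edge_index, edge_attr)
        return self.decoder(x)
```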
Operation Modes of GNS
The GNS works in two modes: one-step mode and rollout mode. In one-step mode, the GNS always makes predictions from ground-truth inputs. In rollout mode, the GNS predicts the positions of particles at the next step based on its own predictions from the previous step, so errors accumulate over time.
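A sketch of the rollout loop, where simulator.predict_accelerations stands in for the encode-process-decode pass and euler_update is the integrator sketched earlier (both names are ours; the real model also maintains a short velocity history):

```python
def rollout(simulator, initial_positions, initial_velocities, num_steps, dt=1.0):
    """Autoregressive rollout: each step consumes the previous step's predictions."""
    positions, velocities = initial_positions, initial_velocities
    trajectory = [positions]
    for _ in range(num_steps):
        accelerations = simulator.predict_accelerations(positions, velocities)
        positions, velocities = euler_update(positions, velocities, accelerations, dt)
        trajectory.append(positions)
    return trajectory
```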
Key Features of the Model
Inductive Biases for Spatial Invariance
The physical interactions between particles are invariant to their absolute spatial positions. To be consistent with this invariance, the encoder masks out the absolute positions of particles and instead encodes the relative displacements between particles as features. This gives the model a strong inductive bias for spatial invariance.
Robustness
In simulations of the complex physics of chaotic systems, the accumulation of errors over time can severely degrade the results. When training the model, the researchers corrupt the ground-truth inputs with random-walk noise to mimic the errors produced by previous rollout steps. This makes the model robust to such error accumulation.
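A sketch of this corruption step, assuming Gaussian random-walk noise added to the input velocity history (the default standard deviation is illustrative, not the paper’s value):

```python
import torch

def add_random_walk_noise(velocity_history, noise_std=6.7e-4):
    """Corrupt the velocity history with accumulating (random-walk) noise.

    velocity_history: [N, T, 2] velocities over the previous T steps.
    The noise at each step is the cumulative sum of per-step Gaussian noise,
    so later steps carry more noise -- mimicking errors accumulated in a rollout.
    """
    num_steps = velocity_history.size(1)
    step_noise = torch.randn_like(velocity_history) * (noise_std / num_steps ** 0.5)
    return velocity_history + torch.cumsum(step_noise, dim=1)
```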
Different Materials, One Model
The physical attributes of different materials are diverse, and most simulators are only applicable to one type of material [1]. GNS offers a general approach to simulating the physics of different materials and can produce accurate predictions for fluid, rigid, and deformable materials.
Results
Here is the result after training the model on the entire WaterDrop dataset for 5 epochs, which takes about 14 hours with a GeForce RTX 3080 Ti.
We plot the evolution of the loss and the metrics over the course of training. The loss curve and the one-step MSE curve show that the model converges by the end of training. However, the rollout MSE curve shows that the model is not very stable over long horizons. One explanation is that the simulated system is chaotic in nature, so a small numerical error can lead the system to behave completely differently.
Finally, we measure the metrics on the test set. Our reproduction reaches a one-step MSE of 3.04e-9 and a rollout MSE of 1.42e-2, which are reasonable compared to the original paper, given that we have less computational power.
Google Colab
We made a Colab Notebook with all the code for this tutorial and a cropped version of the WaterDrop dataset. Here is the link. Feel free to make a copy to your own drive and play with it!
Conclusion
In this tutorial, we walked through a step-by-step guide to applying GNNs to a real-world problem: simulating complex physics! If you’re interested in the technical details, please read the original paper by DeepMind. Thanks for spending the time with us!
References
[1] A. Sanchez-Gonzalez, J. Godwin, T. Pfaff, R. Ying, J. Leskovec, and P. W. Battaglia. Learning to simulate complex physics with graph networks. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13–18 July 2020, Virtual Event, volume 119 of Proceedings of Machine Learning Research, pages 8459–8468. PMLR, 2020. URL http://proceedings.mlr.press/v119/sanchez-gonzalez20a.html.
[2] P. W. Battaglia, R. Pascanu, M. Lai, D. J. Rezende, and K. Kavukcuoglu. Interaction networks for learning about objects, relations and physics. In D. D. Lee, M. Sugiyama, U. von Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5–10, 2016, Barcelona, Spain, pages 4502–4510, 2016. URL https://proceedings.neurips.cc/paper/2016/hash/3147da8ab4a0437c15ef51a5cc7f2dc4-Abstract.html.
[3] Y. Li, J. Wu, R. Tedrake, J. B. Tenenbaum, and A. Torralba. Learning particle dynamics for manipulating rigid bodies, deformable objects, and fluids. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019.