Physics-based simulation via backpropagation on energy functions

Junior Rojas
Published in The Startup
May 10, 2020


These days, most neural network frameworks are essentially collections of functions on multidimensional arrays with added support for a particular type of automatic differentiation (backpropagation).

These features are generally useful for any task that requires numerical optimization (deep learning happens to be one of them). In this article, I want to show how they can be particularly useful for physics-based simulation. I believe that being able to implement physics engines in deep learning frameworks is convenient for prototyping new environments for reinforcement learning. That is how the physics engine for the project I presented at NeurIPS 2019 was implemented.

This engine was implemented in PyTorch using the finite element method and implicit numerical integration. The paper does not give many implementation details about the engine, so in this article I want to explain the main ideas behind it and why backpropagation was useful not only to train the neural network controllers, but also to implement the physics engine itself.

The key idea is that we can use differentiable energy functions to define state transitions. This also has some interesting connections to what Yann LeCun refers to as energy-based models in self-supervised learning, an approach that could address some inefficiencies of current supervised learning and model-free reinforcement learning algorithms (there’s a good talk online about it).

Energy-based Approaches to Representation Learning — Yann LeCun

Funnily enough, my physics engine was implemented using energy functions, but all the locomotion controllers were optimized using model-free RL. There might be some opportunities to use some of the energy-based modeling ideas for training in the future, and understanding how energy functions fit into physics-based simulation and deep learning might provide some guidance for future work.

Physics-based energy functions as loss functions

Many physics-based state transitions can be understood as energy minimization procedures. This is similar to the (somewhat simplified) notion that learning can be understood as minimizing a loss function.

Mass-spring systems are one of the simplest examples I can think of to demonstrate how this works. The parameters we optimize are the vertex positions of the system, and our goal is to find a configuration that minimizes the total elastic energy. Every spring has a preferred rest length, and springs naturally tend to recover their rest shapes over time.

Mass-spring system

We can define the energy of a spring using Hooke’s law (this is not the only option, but it’s probably the most common). Considering a spring connecting two vertices with positions a and b, and rest length l0, its elastic energy E is defined by:

Hooke’s law
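For reference, the usual textbook form of this energy can be written as follows. Here k denotes the spring stiffness; the factor 1/2 follows the common convention and is assumed, not confirmed, to match the expression in the figure.

```latex
E = \frac{1}{2}\, k \,\bigl(\lVert a - b \rVert - l_0\bigr)^2
```

The energy is zero when the distance between a and b equals the rest length l0, and grows quadratically as the spring is stretched or compressed.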

Computation graph

We can compute the total elastic energy using tensor operations and then minimize it via backpropagation, using Adam for example (PyTorch implementation available). The resulting computation graph ends up looking like a shallow neural network followed by a quadratic loss. A linear function maps x (vertex positions) to d (vectors connecting pairs of vertices), a non-linear function maps d to l (spring lengths) and the energy E measures the discrepancy between l and l0 (spring rest lengths). For simplicity, k is omitted from the computation graph, but it should also be an input of E.

Spring energy as a computation graph
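The computation graph described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the article's linked implementation: the triangle mesh, the rest lengths, the stiffness value and the learning rate are all assumptions chosen for the example.

```python
import torch

# Hypothetical toy mesh: 3 vertices in 2D connected by 3 springs (a triangle).
x = torch.tensor([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]], requires_grad=True)
springs = [(0, 1), (1, 2), (2, 0)]

# Dense matrix A with +1 / -1 per row, so that d = A @ x gives spring vectors.
A = torch.zeros(len(springs), x.shape[0])
for s, (i, j) in enumerate(springs):
    A[s, i] = 1.0
    A[s, j] = -1.0

l0 = torch.ones(len(springs))  # rest lengths (assumed to be 1 for every spring)
k = 1.0                        # stiffness (assumed shared by all springs)

def elastic_energy(x):
    d = A @ x                  # linear map: vertex positions -> spring vectors
    l = d.norm(dim=1)          # non-linear map: spring vectors -> spring lengths
    return 0.5 * k * ((l - l0) ** 2).sum()  # quadratic "loss" on l vs. l0

optimizer = torch.optim.Adam([x], lr=0.05)
initial_energy = elastic_energy(x).item()
for _ in range(500):
    optimizer.zero_grad()
    energy = elastic_energy(x)
    energy.backward()          # backpropagation computes dE/dx
    optimizer.step()           # Adam moves the vertices downhill
final_energy = elastic_energy(x).item()
```

Each optimizer step plays the role of one simulation step: the vertices move so that the springs approach their rest lengths.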

Note that in my PyTorch implementation I used a dense matrix for the linear layer (x to d) for simplicity, but there are opportunities to optimize this with sparse operations. Although sparse matrices are available in some deep learning frameworks, support for them is limited in general. Something interesting to note is that if the topology of the mesh were a regular grid, the linear layer could be implemented using convolutions, which is a particular type of sparsity that is well supported in most deep learning frameworks.
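As a rough sketch of the alternatives to a dense matrix, the same spring vectors can be computed with a sparse matrix product or with plain index-based gathering. The mesh and the variable names here are hypothetical; which variant is fastest will depend on the framework version and hardware.

```python
import torch

# Hypothetical mesh: 3 vertices, one (i, j) index pair per spring.
x = torch.tensor([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
edges = torch.tensor([[0, 1], [1, 2], [2, 0]])

# Dense version: one row per spring with +1 at vertex i and -1 at vertex j.
A = torch.zeros(edges.shape[0], x.shape[0])
rows = torch.arange(edges.shape[0])
A[rows, edges[:, 0]] = 1.0
A[rows, edges[:, 1]] = -1.0
d_dense = A @ x

# Sparse version: same matrix stored in sparse COO format.
A_sparse = A.to_sparse()
d_sparse = torch.sparse.mm(A_sparse, x)

# Gather version: no matrix at all, just indexing and a subtraction.
d_gather = x[edges[:, 0]] - x[edges[:, 1]]
```

All three variants are differentiable, so any of them can feed the same energy function.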

You might have seen physics update rules expressed in terms of forces instead of energy minimization. That is also an option: the force is the negative gradient of the potential energy with respect to position, so running backpropagation on the energy computes the force automatically.
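To make the relation between energy and force concrete, here is a small sketch for a single spring. The endpoint positions, stiffness and rest length are assumed values, and the energy uses the conventional 1/2 factor. The force obtained from automatic differentiation is compared with the analytic Hooke force.

```python
import torch

# Assumed example values: a spring of rest length 1 stretched to length 2.
a = torch.tensor([0.0, 0.0], requires_grad=True)
b = torch.tensor([2.0, 0.0], requires_grad=True)
l0, k = 1.0, 3.0

energy = 0.5 * k * ((a - b).norm() - l0) ** 2

# Backpropagation gives the gradient of the energy; the force is its negative.
grad_a, grad_b = torch.autograd.grad(energy, (a, b))
force_a, force_b = -grad_a, -grad_b

# Analytic force on a: magnitude k * (l - l0), pointing from a towards b.
with torch.no_grad():
    d = b - a
    l = d.norm()
    expected_force_a = k * (l - l0) * d / l
```

The two endpoints receive equal and opposite forces, as expected for an internal spring force.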

Minimizing the elastic potential energy is not the whole story, however. The animation shows multiple Adam iterations, which produce a reasonable simulation, probably because Adam includes momentum in its implementation. There are more principled ways to introduce momentum in the simulation, also in terms of minimization, but I’ll leave that for another article.

Optimization methods for physics-based simulation

It would be interesting to explore custom optimizers designed specifically for simulation. The mass-spring animations shown here were produced with Adam, but for the full implementation of my engine I wrote a custom optimizer that was more appropriate for simulation than the built-in PyTorch optimizers (for neural network training I did use Adam). However, I consider this custom optimizer still too simple compared to what physics engines normally use: it is usable, but it lacks many features that would be ideal in an optimizer for a physics engine, and I hope this will be improved in the future.

I am most familiar with simulation methods developed for computer graphics and physics-based animation, where second-order optimization methods are often preferred over first-order methods. This is in contrast to common practice in deep learning, which tends to favor first-order methods such as SGD and Adam. In computer animation, Newton and quasi-Newton methods are more common, but in my experience they are harder to implement, especially without automatic differentiation.
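As an illustration of how automatic differentiation lowers the barrier to second-order methods, the sketch below runs plain Newton iterations on a tiny energy. It is not the custom optimizer mentioned above: the setup (one free vertex held between two fixed anchors by pre-stretched springs) and all constants are assumptions, and a practical solver would add a line search and handle indefinite Hessians.

```python
import torch

# Assumed toy setup: two fixed anchors and one free vertex p between them.
anchors = torch.tensor([[-1.0, 0.0], [1.0, 0.0]])
k, l0 = 10.0, 0.5  # rest length shorter than the anchor distance (pre-stretched)

def energy(p):
    lengths = (p - anchors).norm(dim=1)
    return 0.5 * k * ((lengths - l0) ** 2).sum()

p = torch.tensor([0.3, 0.4])
for _ in range(10):
    p_req = p.clone().requires_grad_(True)
    (g,) = torch.autograd.grad(energy(p_req), p_req)    # gradient via backprop
    H = torch.autograd.functional.hessian(energy, p)    # 2x2 Hessian via autodiff
    p = p - torch.linalg.solve(H, g)                    # Newton step: solve H dp = g
```

By symmetry the minimum is at the midpoint between the anchors, and the iterations settle there within a handful of steps.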

Regardless of the optimization method used, I think that expressing the loss or energy we want to minimize as a composition of differentiable operations is a good first step. Having this implemented in a system with automatic differentiation features could be helpful to prototype new optimization methods.

Next steps

There is a lot of great work outside of deep learning on tools, libraries and engines for physics-based simulation. Although we are very unlikely, at the moment, to match the performance of the highly optimized code in other engines, I am interested in seeing how far we can push deep learning frameworks (new features and extensions could be helpful) to implement physics. Considering that physics is an important component of many RL environments, it would be very convenient if such environments could be easily implemented in the same framework, especially when prototyping new ideas.

I presented mass-spring systems as energy-based models in this article to illustrate how one would use backpropagation to implement physics, but this is just a very simple example and we can do much more. I have not talked about how to properly incorporate mass or inertial effects in the simulation: I only mentioned particle positions, never particle velocities. We can also implement collisions, gravity, friction and elasticity models based on the finite element method. For now, I can say that implementing all these additional features boils down to defining new energy functions, but the simulation will still look a lot like an energy minimization loop where backpropagation allows us to automatically compute forces, as illustrated in the mass-spring system example.
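As a small sketch of what "defining new energy functions" can look like, the example below adds a gravitational potential term to the elastic energy of a hanging chain. The chain layout, the pinned top vertex, the per-vertex weight and the optimizer settings are all assumptions made for illustration. Inertia is still ignored, so this finds a static equilibrium; it is not a dynamic simulation.

```python
import torch

# Assumed setup: vertex 0 is pinned, vertices 1 and 2 hang from it in a chain.
pinned = torch.tensor([[0.0, 0.0]])
free = torch.tensor([[0.5, -1.0], [1.0, -2.0]], requires_grad=True)
edges = torch.tensor([[0, 1], [1, 2]])
l0, k = 1.0, 10.0
weight = 1.0  # mass times gravitational acceleration per vertex (assumed)

def elastic_energy(x):
    d = x[edges[:, 0]] - x[edges[:, 1]]
    return 0.5 * k * ((d.norm(dim=1) - l0) ** 2).sum()

def gravity_energy(x):
    return weight * x[:, 1].sum()  # potential energy grows with height y

def total_energy(free):
    x = torch.cat([pinned, free])  # the pinned vertex is a constant, not a parameter
    return elastic_energy(x) + gravity_energy(x)

optimizer = torch.optim.Adam([free], lr=0.02)
initial_energy = total_energy(free).item()
for _ in range(1000):
    optimizer.zero_grad()
    total_energy(free).backward()
    optimizer.step()
final_energy = total_energy(free).item()
```

The minimization loop is unchanged from the pure mass-spring case; only the energy being differentiated has a new term.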