Part 2: SciML — A Mathematical Account of PDE Solvers, Discoverers, and Operator Learning

Freedom Preetham
Published in Autonomous Agents
Jul 13, 2024

The integration of machine learning with scientific modeling, known as Scientific Machine Learning (SciML), has ushered in transformative techniques for solving and discovering differential equations and operators. These techniques can be broadly categorized into three main areas: PDE solvers, PDE discovery, and operator learning. In this article, I aim to provide a comprehensive analysis of these techniques, targeted at an audience with a background in mathematics.

Series

Ref: https://arxiv.org/pdf/2312.14688

1. PDE Solvers

PDE solvers aim to approximate the solutions of known partial differential equations (PDEs) using neural networks. This approach involves embedding the PDEs within the architecture of neural networks to enforce physical laws directly in the learning process.

Physics-Informed Neural Networks (PINNs)
PINNs, as introduced by Raissi et al. (2018), represent a paradigm shift in solving PDEs by incorporating the governing physical laws directly into the loss function of the neural network. The core idea is to minimize the residual of the PDE along with any boundary and initial conditions, thereby ensuring that the network’s output adheres to the underlying physical principles. Mathematically, for a PDE of the form L[u] = f on a domain Ω, where L is a differential operator, PINNs solve:

θ* = argmin_θ L(θ)

where u_θ(x) is the neural network approximation with parameters θ. The loss function L(θ) typically includes terms for the PDE residual, boundary conditions, and initial conditions:

L(θ) = λ_r L_PDE(θ) + λ_b L_BC(θ) + λ_0 L_IC(θ)

with

L_PDE(θ) = (1/N_r) Σ_{i=1}^{N_r} |L[u_θ](x_r^i) − f(x_r^i)|²
L_BC(θ) = (1/N_b) Σ_{i=1}^{N_b} |u_θ(x_b^i) − g(x_b^i)|²
L_IC(θ) = (1/N_0) Σ_{i=1}^{N_0} |u_θ(x_0^i, 0) − u_0(x_0^i)|²

where g and u_0 denote the boundary and initial data, the x’s are sampled collocation, boundary, and initial points, and the λ’s weight the individual terms.

Recent advancements in PINNs include adaptive activation functions (Lu et al., 2021), multi-fidelity approaches (Penwarden et al., 2023), and applications to high-dimensional fluid dynamics problems (2021).
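To make the loss above concrete, here is a minimal PyTorch sketch for a toy one-dimensional Poisson problem, −u″(x) = f(x) on (0, 1) with u(0) = u(1) = 0. The network width, the source term f, the collocation counts, and the equal weighting of the loss terms are illustrative assumptions, not part of the original PINN formulation.

```python
import torch

# Hypothetical 1D Poisson problem: -u''(x) = f(x) on (0, 1), with u(0) = u(1) = 0.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def f(x):
    # Assumed source term; the exact solution is u(x) = sin(pi * x)
    return (torch.pi ** 2) * torch.sin(torch.pi * x)

def pinn_loss(net, n_collocation=128):
    # PDE-residual term, evaluated at random collocation points
    x = torch.rand(n_collocation, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    residual = -d2u - f(x)
    loss_pde = (residual ** 2).mean()
    # Boundary-condition term: u(0) = u(1) = 0
    xb = torch.tensor([[0.0], [1.0]])
    loss_bc = (net(xb) ** 2).mean()
    return loss_pde + loss_bc

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    optimizer.zero_grad()
    loss = pinn_loss(net)
    loss.backward()
    optimizer.step()
```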

Deep Galerkin Method
The Deep Galerkin method (Sirignano and Spiliopoulos, 2018) approximates the solution of PDEs by minimizing the PDE residual over a set of collocation points using neural networks. This method is particularly effective for high-dimensional PDEs and can be formulated as:

min_θ (1/N) Σ_{i=1}^{N} |L[u_θ](x_i) − f(x_i)|²

where {x_i}, i = 1, …, N, are collocation points sampled from the domain.
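In code, the Deep Galerkin objective is just the collocation-residual term. The sketch below reuses the toy Poisson setup (the net and the source term f) from the PINN example above; the uniform random sampling of collocation points is an assumption.

```python
def dgm_loss(net, n_points=1024):
    # Monte Carlo estimate of (1/N) * sum |L[u_theta](x_i) - f(x_i)|^2
    # at randomly sampled collocation points, for the residual of -u'' = f
    x = torch.rand(n_points, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    return ((-d2u - f(x)) ** 2).mean()
```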

Deep Ritz Method
The Deep Ritz method (E and Yu, 2018) reformulates PDEs as variational problems and uses neural networks to minimize the associated energy functional. For an elliptic PDE such as the Poisson problem −Δu = f on Ω with u = 0 on ∂Ω, the method involves solving:

min_θ I[u_θ], where I[u] = ∫_Ω ( ½ |∇u(x)|² − f(x) u(x) ) dx

with the boundary conditions enforced either by construction or through a penalty term. This approach leverages the Ritz method’s strengths in handling complex geometries and boundary conditions.
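A corresponding sketch of the Deep Ritz loss, again for the toy Poisson problem above: the energy functional is estimated by Monte Carlo sampling, and the Dirichlet boundary condition is enforced softly with a penalty whose weight beta is an assumed hyperparameter.

```python
def deep_ritz_loss(net, n_interior=256, beta=100.0):
    # Monte Carlo estimate of I[u] = integral of (0.5 * |grad u|^2 - f * u) over (0, 1)
    x = torch.rand(n_interior, 1, requires_grad=True)
    u = net(x)
    grad_u = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    energy = (0.5 * grad_u.pow(2).sum(dim=1) - f(x).squeeze(-1) * u.squeeze(-1)).mean()
    # Soft penalty for the Dirichlet condition u(0) = u(1) = 0
    xb = torch.tensor([[0.0], [1.0]])
    return energy + beta * net(xb).pow(2).mean()
```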

2. PDE Discovery

PDE discovery focuses on identifying the form and coefficients of a PDE from observed data, enabling the extraction of governing equations from experimental or simulation data.

Sparse Identification of Nonlinear Dynamical Systems (SINDy)
SINDy, introduced by Brunton et al. (2016), utilizes sparse regression to identify the governing equations of dynamical systems. The method represents the dynamics as a sparse linear combination of candidate functions:

Ẋ = Θ(X) Ξ

where Θ(X) is a library of candidate functions evaluated on the state data X and Ξ is a sparse matrix of coefficients. The optimization problem, solved column by column, is:

min_{ξ_k} ‖Ẋ_k − Θ(X) ξ_k‖₂² + λ ‖ξ_k‖₁

where ξ_k is the k-th column of Ξ and λ controls the sparsity of the recovered model.
Later, SINDy was extended to handle noisy and incomplete data, enhancing its robustness for real-world applications.
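Below is a minimal NumPy sketch of the sequentially thresholded least-squares procedure commonly used to solve the SINDy regression. The candidate library (constant, linear, and quadratic terms), the threshold, and the iteration count are assumptions made for illustration; in practice the library is chosen per problem. For the Lorenz system, for example, X would hold trajectory samples of (x, y, z) and X_dot their numerically estimated time derivatives.

```python
import numpy as np

def sindy(X, X_dot, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares on a polynomial library.
    X: (n_samples, n_states) state data; X_dot: time derivatives, same shape."""
    n, d = X.shape
    # Candidate library Theta(X): constant, linear, and quadratic terms (assumed choice)
    columns = [np.ones((n, 1)), X]
    columns += [(X[:, i] * X[:, j]).reshape(-1, 1)
                for i in range(d) for j in range(i, d)]
    Theta = np.hstack(columns)
    # Initial least-squares fit, then iteratively zero out small coefficients and refit
    Xi, *_ = np.linalg.lstsq(Theta, X_dot, rcond=None)
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(d):
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big], X_dot[:, k], rcond=None)
    return Xi  # sparse coefficient matrix: X_dot ~= Theta @ Xi
```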

Symbolic Regression Techniques
Symbolic regression techniques aim to discover mathematical expressions that describe observed data. Notable methods include AI Feynman and genetic algorithms.

  • AI Feynman: Udrescu and Tegmark (2020) combined neural networks and symbolic regression to uncover physics equations. The method iteratively refines candidate expressions by leveraging the simplicity and interpretability of symbolic forms.
  • Genetic Algorithms: Genetic algorithms (Schmidt and Lipson, 2009) evolve mathematical expressions through a process akin to natural selection, optimizing for both accuracy and complexity. This approach is flexible and can adapt to various types of data.

3. Operator Learning

Operator learning focuses on approximating unknown operators that map input functions to output functions, which is crucial for complex systems where the operator is not explicitly known.

Operator learning thrives on multi-fidelity datasets (not to be confused with simply having lots of high-fidelity data). It is a prime candidate for combining abundant low-fidelity data with a smaller amount of high-fidelity data, from which synthetic and distilled data can be generated and added back to the training set.

Mathematical Formulation
Given pairs of data (f, u), where f ∈ U and u ∈ V are functions over a spatial domain Ω ⊂ R^d, and an operator A : U → V such that A(f) = u, the goal is to approximate A with a neural network Â_θ by minimizing the empirical risk:

min_θ (1/N) Σ_{i=1}^{N} ‖Â_θ(f_i) − u_i‖²_V

The objective is to ensure that Â_θ generalizes well to unseen input functions.
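A generic training loop for the empirical-risk objective above might look as follows. Here `model`, `f_train`, and `u_train` are placeholders (discretized input and output functions on a fixed grid), and any architecture, including the DeepONet and FNO sketches in the next subsection, could be substituted.

```python
import torch

def train_operator(model, f_train, u_train, epochs=1000, lr=1e-3):
    # Minimize the empirical risk (1/N) * sum ||A_hat(f_i) - u_i||^2 over training pairs
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = torch.mean((model(f_train) - u_train) ** 2)
        loss.backward()
        opt.step()
    return model
```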

Key Contributions

  • Deep Operator Network (DeepONet): Developed by Lu et al. (2021a), DeepONet learns the mapping from input functions to output functions. It consists of two neural networks: the branch net, which processes the input function sampled at fixed sensor locations, and the trunk net, which processes the spatial coordinates. The output is an inner product of the two networks’ outputs, effectively learning the operator (a minimal sketch follows this list).
  • Fourier Neural Operator (FNO): Introduced by Li et al. (2020), FNO utilizes the Fourier transform to learn operators in high-dimensional spaces. By transforming the input function to the frequency domain, FNO captures global information, making it efficient and scalable (see the sketch after this list). Each layer of the method can be formulated as:

v_{t+1}(x) = σ( W v_t(x) + F⁻¹( R · F(v_t) )(x) )

where F and F⁻¹ are the Fourier and inverse Fourier transforms, respectively, R is a learnable transform applied to the retained Fourier modes, W is a learnable weight matrix acting pointwise, and σ is a nonlinear activation function.
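As referenced above, here is a minimal PyTorch sketch of a DeepONet. The layer widths, the sensor count, and the latent dimension p are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Branch net encodes the input function f sampled at m fixed sensor points;
    trunk net encodes a query coordinate y; the output u(y) is their inner product."""
    def __init__(self, m_sensors=100, coord_dim=1, p=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(m_sensors, 128), nn.Tanh(), nn.Linear(128, p))
        self.trunk = nn.Sequential(nn.Linear(coord_dim, 128), nn.Tanh(), nn.Linear(128, p))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, f_sensors, y):
        # f_sensors: (batch, m_sensors), y: (batch, coord_dim)
        b = self.branch(f_sensors)   # (batch, p)
        t = self.trunk(y)            # (batch, p)
        return (b * t).sum(dim=-1, keepdim=True) + self.bias  # (batch, 1)
```

And a sketch of a single FNO layer implementing the update above in one spatial dimension. The number of retained Fourier modes and the channel width are assumptions, and R is realized as a complex weight tensor acting mode-wise.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralConv1d(nn.Module):
    """Fourier-space part of one FNO layer: FFT, keep the lowest `modes`
    frequencies, apply a learnable complex transform R, then inverse FFT."""
    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (channels * channels)
        self.R = nn.Parameter(scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

    def forward(self, v):
        # v: (batch, channels, n_grid)
        v_hat = torch.fft.rfft(v)                                   # F(v_t)
        out_hat = torch.zeros_like(v_hat)
        out_hat[..., :self.modes] = torch.einsum(
            "bim,iom->bom", v_hat[..., :self.modes], self.R)        # R applied mode-wise
        return torch.fft.irfft(out_hat, n=v.size(-1))               # inverse transform

class FNOBlock(nn.Module):
    """One layer: v_{t+1} = sigma(W v_t + F^{-1}(R . F(v_t)))."""
    def __init__(self, channels=32, modes=16):
        super().__init__()
        self.spectral = SpectralConv1d(channels, modes)
        self.W = nn.Conv1d(channels, channels, kernel_size=1)       # pointwise linear W

    def forward(self, v):
        return F.gelu(self.spectral(v) + self.W(v))
```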

Deep Dive into Mathematical Foundations

The mathematical underpinnings of these techniques are deeply rooted in functional analysis, optimization, and numerical methods.

Functional Analysis
In the context of PDE solvers and operator learning, function spaces such as the Sobolev spaces H^k(Ω) play a critical role. For example, the variational formulation of the deep Ritz method relies on the properties of Sobolev spaces to define the energy functional and ensure the existence and uniqueness of solutions.

Optimization
Optimization techniques are central to training neural networks in SciML. The loss functions for PINNs, deep Galerkin, and deep Ritz methods involve minimizing the residuals of PDEs, which can be framed as constrained optimization problems. Advanced optimization algorithms, such as stochastic gradient descent (SGD) and Adam, are employed to navigate the high-dimensional parameter space of neural networks.

Numerical Methods
Numerical methods, such as finite element methods (FEM) and spectral methods, provide the theoretical foundation for neural PDE solvers. For instance, the deep Galerkin method draws parallels to the collocation methods in FEM, while FNO leverages spectral methods through the Fourier transform.

Future Directions and Open Questions

The integration of SciML techniques with traditional scientific modeling opens numerous avenues for future research:

  1. Scalability and Efficiency: Developing more efficient neural architectures and training algorithms to handle high-dimensional and large-scale problems.
  2. Robustness and Uncertainty Quantification: Incorporating techniques for uncertainty quantification and robustness to noisy and incomplete data.
  3. Interdisciplinary Applications: Expanding the application of SciML techniques to diverse fields such as genomics, climate modeling, and quantum mechanics.
  4. Theoretical Insights: Gaining deeper theoretical insights into the convergence, stability, and generalization properties of SciML methods.

Open Discussion

The advancements in SciML techniques represent a confluence of machine learning and traditional scientific modeling, offering powerful tools for solving and discovering complex systems. As researchers in mathematics, physics, and biology, your insights and contributions are crucial in shaping the future of this interdisciplinary field. What are your thoughts on the current state of SciML, and what challenges and opportunities do you foresee in your respective domains?
