Statistical physics of diffusion models

Yiğithan Gediz
3 min read · May 26, 2024


For my term project in PHYS 506: Quantum Statistical Mechanics, I decided to focus on the interplay between statistical physics and diffusion generative models, specifically score matching and how it can be interpreted in the context of spontaneous symmetry breaking.

I came across a wonderful paper by Gabriel Raya and Luca Ambrogioni that investigates how symmetry affects the generation process, and it got me interested in this topic.

In this article, I’ll just share the simulations I made for my term project.

First, I implemented a simple diffusion process with constant variance for the forward propagation. Initially, the particles sit at x = 2 and x = -2; they then start to diffuse toward a uniform distribution.
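As a rough illustration of this forward process, here is a minimal sketch. It assumes the process is pure Brownian motion, dx = sigma dW, discretized with Euler-Maruyama. The function name `forward_diffusion` and the values of `sigma`, `dt`, and the number of steps are illustrative choices, not the settings used in the term project.

```python
import numpy as np

def forward_diffusion(n_particles=10_000, n_steps=200, dt=0.01, sigma=1.0, seed=0):
    """Euler-Maruyama simulation of dx = sigma dW (assumed form of the forward SDE)."""
    rng = np.random.default_rng(seed)
    # Each particle starts at x = -2 or x = +2 with equal probability.
    x = rng.choice([-2.0, 2.0], size=n_particles)
    trajectory = np.empty((n_steps + 1, n_particles))
    trajectory[0] = x
    for k in range(n_steps):
        # Constant-variance Gaussian increment, no drift.
        x = x + sigma * np.sqrt(dt) * rng.standard_normal(n_particles)
        trajectory[k + 1] = x
    return trajectory

traj = forward_diffusion()
print("variance at t = 0:", traj[0].var())
print("variance at t = T:", traj[-1].var())
```

Under these assumptions the variance of the particle cloud grows linearly in time, from roughly 4 at t = 0 to roughly 4 + sigma² T at the end of the run.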

We can define the reverse process by calculating the drift term that recovers the initial distribution.
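For this toy setting the drift can be written in closed form, so no neural network is needed. The sketch below assumes the forward process is dx = sigma dW with initial points at ±mu (mu = 2), so the marginal at time t is an equal mixture of two Gaussians with variance sigma² t. The reverse drift is then sigma² times the score, which works out to (mu tanh(mu x / (sigma² t)) - x) / t. The names `reverse_drift` and `reverse_diffusion` and all parameter values are hypothetical.

```python
import numpy as np

def reverse_drift(x, t, sigma=1.0, mu=2.0):
    """sigma^2 * d/dx log p_t(x) for an equal mixture of N(+mu, sigma^2 t) and N(-mu, sigma^2 t)."""
    return (mu * np.tanh(mu * x / (sigma**2 * t)) - x) / t

def reverse_diffusion(n_particles=10_000, T=2.0, n_steps=200, sigma=1.0, mu=2.0, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Start from the exact forward marginal at time T.
    centers = rng.choice([-mu, mu], size=n_particles)
    x = centers + sigma * np.sqrt(T) * rng.standard_normal(n_particles)
    for k in range(n_steps):
        t = T - k * dt  # runs from T down to dt, so t never reaches 0
        noise = sigma * np.sqrt(dt) * rng.standard_normal(n_particles)
        x = x + reverse_drift(x, t, sigma, mu) * dt + noise
    return x

x0 = reverse_diffusion()
print("mean |x| after the reverse process:", np.abs(x0).mean())
print("fraction of samples with x > 0:", (x0 > 0).mean())
```

The loop stops at t = dt rather than t = 0 because the drift diverges there; the recovered samples therefore sit near ±2 with a small residual spread of order sigma sqrt(dt).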

Raya and Ambrogioni study this process as gradient descent on a potential by calculating the time-dependent potential energy function of the reverse process. In this simple setting, particles diffuse inside a potential gradient. The Wiener process term in the SDE acts like a nonzero temperature that causes thermal fluctuations.

Observe that thermal fluctuations perturb the particles away from their unstable equilibrium position, and the potential gradient then monotonically directs them into two different stable equilibria. Hence, this process does not exhibit spontaneous symmetry breaking.
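The stability of the origin can be checked numerically. The sketch below assumes the forward process is dx = sigma dW with initial points at ±mu, and defines the potential (up to an x-independent term) as u(x, t) = -sigma² log p_t(x), so that the reverse drift is -∂u/∂x. The function names and the finite-difference step `h` are illustrative choices.

```python
import numpy as np

def potential(x, t, sigma=1.0, mu=2.0):
    """u(x, t) = -sigma^2 log p_t(x), dropping terms that do not depend on x."""
    z = mu * x / (sigma**2 * t)
    # log cosh(z) computed stably for large |z|.
    log_cosh = np.logaddexp(z, -z) - np.log(2.0)
    return (x**2 + mu**2) / (2.0 * t) - sigma**2 * log_cosh

def curvature_at_origin(t, sigma=1.0, mu=2.0, h=1e-3):
    """Central finite-difference estimate of the second derivative of u at x = 0."""
    up = potential(h, t, sigma, mu)
    mid = potential(0.0, t, sigma, mu)
    down = potential(-h, t, sigma, mu)
    return (up - 2.0 * mid + down) / h**2

for t in (0.5, 1.0, 2.0):
    print(f"t = {t}: curvature at origin = {curvature_at_origin(t):.4f}")
```

Analytically the curvature is 1/t - mu²/(sigma² t²) for this assumed potential, which is negative whenever t < mu²/sigma². With mu = 2 and sigma = 1 the origin is therefore an unstable point for every t below 4, which is the regime described above.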

Raya and Ambrogioni showed that, for a different process, we can observe symmetry breaking: particles are first trapped in a stable equilibrium near the origin, and they are then forced to choose a side when the stable equilibrium turns into an unstable point.
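A minimal sketch of such a process, assuming a Variance-Preserving SDE dx = -(beta/2) x dt + sqrt(beta) dW with a constant noise schedule `beta` and data points at x = ±1. With theta_t = exp(-beta t / 2), the forward marginal is an equal mixture of N(±theta_t, 1 - theta_t²), and setting the curvature of the corresponding reverse-time potential at the origin to zero gives theta² = sqrt(2) - 1 for this toy model. The constant schedule, the function names, and the step counts are assumptions made for illustration.

```python
import numpy as np

BETA = 1.0  # assumed constant noise schedule

def theta(t, beta=BETA):
    """Signal scale of the VP-SDE at forward time t."""
    return np.exp(-0.5 * beta * t)

def score(x, t, beta=BETA):
    """Exact d/dx log p_t(x) for data at +1 and -1 under the VP-SDE."""
    th = theta(t, beta)
    v = 1.0 - th**2
    return (th * np.tanh(th * x / v) - x) / v

def critical_time(beta=BETA):
    """Forward time at which the origin changes stability (theta^2 = sqrt(2) - 1)."""
    theta_c = np.sqrt(np.sqrt(2.0) - 1.0)
    return -2.0 * np.log(theta_c) / beta

def reverse_vp_sde(n_particles=10_000, T=10.0, n_steps=2_000, beta=BETA, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = rng.standard_normal(n_particles)  # p_T is close to N(0, 1) for large T
    for k in range(n_steps):
        t = T - k * dt  # forward time, from T down to dt
        drift = 0.5 * beta * x + beta * score(x, t, beta)
        x = x + drift * dt + np.sqrt(beta * dt) * rng.standard_normal(n_particles)
    return x

samples = reverse_vp_sde()
print("critical forward time:", critical_time())
print("mean |x| of generated samples:", np.abs(samples).mean())
```

In this sketch the origin is a stable point of the reverse dynamics while the forward time is above `critical_time()` and becomes unstable below it, so the particles only commit to one of the two sides in the last part of the generative run.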

Before the critical time, we observe an ordinary mean-reverting process. Only after that are samples near x = +1 and x = -1 generated. This phenomenon is the essence of what Raya and Ambrogioni observed in realistic image datasets. They trained the same Variance-Preserving SDE on these datasets and investigated how starting the generative process later than t = 0 affects sample quality: the FID score remains constant for a while and then starts increasing sharply after a certain amount of time. This suggests that the starting point of the process before the symmetry breaking does not really matter for the final result, as most of the entropy is reduced during the magnetized phase.

To read the whole paper I wrote for my term project:

https://drive.google.com/file/d/1Yi1UpIJs3simW09MXYvdIqisTtshIISS/view?usp=sharing
