Some Thoughts about “Deep Unsupervised Learning using Nonequilibrium Thermodynamics”

From Jascha Sohl-Dickstein et al., “Deep Unsupervised Learning using Nonequilibrium Thermodynamics”

I downloaded the paper “Deep Unsupervised Learning using Nonequilibrium Thermodynamics” a while ago but only read it through today. The typical example of a direct connection between neural networks and statistical physics is a magnetic spin system such as the Ising model. This paper is interesting in that it connects deep learning to a different area of statistical physics, namely non-equilibrium diffusion processes.

Carey Nachenberg’s talk on the relation between Ising model and deep learning

One can see the connection between the Ising model and deep learning in this nice introductory video by Carey Nachenberg, “Stanford Seminar — Deep Learning for Dummies”. Each of the little “magnets”, which physicists call “spins”, has two states. Each spin becomes a neuron in a deep learning neural network. In a neural network, the interactions between the little magnets are not given by the underlying quantum mechanics, as in physics; instead, one needs to design a process that learns the interactions between the neurons so that the network can reconstruct the training patterns.
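
To make the spin picture concrete, here is a minimal sketch of a tiny Ising-style model in Python. It enumerates every state of a few two-state spins and assigns each state a Boltzmann probability from its energy. The function names, the 3-spin chain, and the coupling values are my own illustrative assumptions, not something taken from the talk or the paper, and brute-force enumeration only works for a handful of spins.

```python
import itertools
import math

def ising_energy(spins, couplings):
    """E(s) = -sum over pairs of J_ij * s_i * s_j, with each spin s_i in {-1, +1}."""
    return -sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())

def boltzmann_distribution(num_spins, couplings, temperature=1.0):
    """Enumerate all 2^N spin states and return {state: probability}."""
    states = list(itertools.product((-1, 1), repeat=num_spins))
    weights = [math.exp(-ising_energy(s, couplings) / temperature) for s in states]
    partition = sum(weights)  # normalization constant (partition function)
    return {s: w / partition for s, w in zip(states, weights)}

# Hypothetical couplings for a 3-spin chain; positive J favors aligned neighbors.
couplings = {(0, 1): 1.0, (1, 2): 1.0}
probs = boltzmann_distribution(3, couplings)

# Print states from most to least probable.
for state, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(state, round(p, 4))
```

With these couplings the two fully aligned states come out as the most probable ones. In a Boltzmann machine the couplings are not fixed by hand like this but learned, so that the probable states match the training patterns.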

In this paper by Jascha Sohl-Dickstein et al., the authors build the “learning process” on a different physical process, “diffusion”, rather than on magnetic systems. Rather than modeling the joint probabilities by an energy function with a Boltzmann distribution, the authors define a “diffusion process” that gradually converts the data distribution into a simple, tractable final distribution, and train a deep network to learn the reverse of that process. With this “time reversal” process, one can recover the original distribution starting from a random distribution.
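
As a rough illustration of the forward half of this idea, the sketch below repeatedly applies a Gaussian diffusion step of the form x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise to toy one-dimensional data. The constant `beta`, the bimodal toy data, and the function name `forward_diffusion` are assumptions made for illustration. The learned reverse process, which is the actual contribution of the paper, is not shown here.

```python
import math
import random
import statistics

def forward_diffusion(samples, beta, num_steps, rng):
    """Apply num_steps Gaussian diffusion steps to every sample.

    Each step shrinks the sample by sqrt(1 - beta) and adds noise with
    variance beta, so the samples drift toward a standard normal.
    """
    xs = list(samples)
    keep = math.sqrt(1.0 - beta)
    spread = math.sqrt(beta)
    for _ in range(num_steps):
        xs = [keep * x + spread * rng.gauss(0.0, 1.0) for x in xs]
    return xs

rng = random.Random(0)

# Toy bimodal "data": two narrow clusters near -2 and +2 (an assumption).
data = [rng.choice((-2.0, 2.0)) + 0.1 * rng.gauss(0.0, 1.0) for _ in range(5000)]

diffused = forward_diffusion(data, beta=0.02, num_steps=500, rng=rng)

print("data:     mean", statistics.mean(data), "variance", statistics.pvariance(data))
print("diffused: mean", statistics.mean(diffused), "variance", statistics.pvariance(diffused))
```

After enough steps the two clusters are washed out and the samples look like draws from a unit-variance Gaussian, whatever the starting data were. That loss of structure is what the trained reverse process has to undo, one small step at a time.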

In contrast to “equilibrium” statistical mechanics approaches, we are not looking for a distribution at “equilibrium”. (Strictly speaking, a Boltzmann machine does encode “quenched disorder”, which is not in equilibrium with the other parts of the system.) The original distribution is recovered by reversing the diffusion process, without any assumption that the system is in an equilibrium state.

I think the idea is quite fascinating and original. Such a connection between machine learning and non-equilibrium statistical mechanics is certainly worth exploring more. I also believe that many ideas that have come up in the context of non-equilibrium statistical mechanics will probably be useful for new machine learning techniques. For example, I think there is some connection between non-equilibrium driven processes and systems with quenched disorder (e.g. spin glass models). Shameless self-promotion here: I wrote a paper about an anomalous diffusion process that couples a diffusion process to non-equilibrium growth phenomena: “Passive random walkers and riverlike networks on growing surfaces”. One related paper shows such a driven system exhibiting “replica symmetry breaking”: “Replica Symmetry Breaking in Trajectories of a Driven Brownian Particle”. As “replica symmetry breaking” is a signature of spin glass systems for “learning different patterns”, whether such driven diffusion systems will be useful for machine learning might be an interesting question to ask. Hopefully we will see more related work coming out soon.
