3D Generation: DreamFusion Part 3: Score Distillation Sampling

圖學玩家

3 min read · Dec 27, 2023

<圖學玩家 original article No. 032>

SCORE DISTILLATION SAMPLING (SDS)

Differentiable Image Parameterizations (DIP)

DreamFusion's diffusion process does not sample images directly in RGB space; it works through a DIP instead.

As long as an image parameterization is differentiable, we can backpropagate through it.

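Simply put, a DIP represents an image through some other set of differentiable parameters. Let θ denote the parameters of the 3D volume; a sampled image can then be written as x = g(θ), where g is a transformation function that turns the parameters θ into an image x.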
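As a minimal sketch of the idea, the snippet below treats a three-pixel "image" as x = g(θ) and fits θ by backpropagating a pixel loss through g by hand. The element-wise sigmoid used for g, the target values, the learning rate, and all function names are assumptions chosen for illustration; DreamFusion's actual g renders a 3D volume and is not reproduced here.

```python
import math

def g(theta):
    """Hypothetical DIP: an element-wise sigmoid maps parameters to pixels in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-t)) for t in theta]

def loss_and_grad(theta, target):
    """Squared pixel error and its gradient with respect to theta."""
    x = g(theta)
    loss = sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    # Chain rule: dL/dtheta = dL/dx * dx/dtheta, and dx/dtheta = x * (1 - x) for a sigmoid.
    grad = [2.0 * (xi - ti) * xi * (1.0 - xi) for xi, ti in zip(x, target)]
    return loss, grad

def fit(target, steps=2000, lr=1.0):
    """Plain gradient descent on theta; the pixels are never optimized directly."""
    theta = [0.0] * len(target)
    for _ in range(steps):
        _, grad = loss_and_grad(theta, target)
        theta = [t - lr * gi for t, gi in zip(theta, grad)]
    return theta

target = [0.2, 0.8, 0.5]   # illustrative target pixels
theta = fit(target)
image = g(theta)           # the rendered image now matches the target closely
```

The point of the sketch is that the update touches only θ; the image changes solely because g is differentiable.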

We are not interested in sampling pixels; we instead want to create 3D models that look like good images when rendered from random angles.

Score-based Generative Modeling

We mentioned the Score Function in Part 2; here we expand on it a little and introduce the Score-based Generative Model.

Readers can think of a diffusion model as one kind of score-based generative model.

First, a quick introduction to the EBM (Energy-based Model).
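For reference, an EBM is usually written in the form below. This is the standard textbook form, assumed here to be the one the article has in mind; f_θ and Z_θ are the same symbols discussed in the next paragraph.

```latex
p_\theta(x) = \frac{1}{Z_\theta} \, e^{-f_\theta(x)},
\qquad
Z_\theta = \int e^{-f_\theta(x)} \, dx
```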

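The idea is borrowed from physics. f_θ(x) is a flexible, parameterizable energy function, so it can be modeled with a neural network. Z_θ is the normalizing constant that ensures ∫p_θ(x)dx = 1. Applying the following operation to the EBM formula: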
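A sketch of that step, assuming the standard EBM form p_θ(x) = e^{-f_θ(x)} / Z_θ: taking the gradient of the log density with respect to x removes the normalizing constant, because Z_θ does not depend on x. The symbol s_θ(x) for the score network is notation assumed here.

```latex
\nabla_x \log p_\theta(x)
  = \nabla_x \log\!\left( \frac{1}{Z_\theta} \, e^{-f_\theta(x)} \right)
  = \nabla_x \log \frac{1}{Z_\theta} + \nabla_x \log e^{-f_\theta(x)}
  = -\nabla_x f_\theta(x)
  \approx s_\theta(x)
```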

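In other words, the gradient of log p_θ(x) depends only on f_θ(x) and not on Z_θ, so a neural network can model it directly; this is the Score Function mentioned in Part 2. The network can be optimized by minimizing the Fisher Divergence: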
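Written out, the Fisher divergence compares the network's estimate with the true score of the data distribution p(x). The notation s_θ(x) for the score network is an assumption made here for readability.

```latex
\mathbb{E}_{p(x)} \left[ \left\| s_\theta(x) - \nabla_x \log p(x) \right\|_2^2 \right]
```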

Taking the gradient of the log likelihood with respect to x in data (image) space amounts to finding the direction in data space that increases the likelihood p(x).

For every x, taking the gradient of its log likelihood with respect to x essentially describes what direction in data space to move in order to further increase its likelihood.
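To make the "direction that increases likelihood" concrete, here is a small sketch that uses a 1D Gaussian as a toy p(x). The mean, the standard deviation, and the function names are illustrative assumptions; the only fact relied on is that a Gaussian's score is -(x - mean) / std².

```python
import math

def log_density(x, mean=1.0, std=0.5):
    """Log density of a 1D Gaussian, standing in for log p(x)."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))

def score(x, mean=1.0, std=0.5):
    """Analytic score: d/dx log p(x) = -(x - mean) / std^2."""
    return -(x - mean) / std ** 2

def numeric_score(x, h=1e-5):
    """Central finite difference of the log density, as a sanity check."""
    return (log_density(x + h) - log_density(x - h)) / (2.0 * h)

left = score(0.0)    # 4.0: to the left of the mean, the score points right
right = score(2.0)   # -4.0: to the right of the mean, the score points left

# A small step along the score moves x to a point of higher likelihood.
stepped = 0.0 + 0.1 * left
```

On either side of the peak the score points back toward it, which is what the quote above describes.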

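The Score Function is visualized in the figure below. The horizontal plane is the data space in which x lives, and the peaks along the vertical axis (regions of higher density) can be seen as the possible modes that the model may eventually converge to. Representing the distribution p(x) through its Score Function in this way, and using MCMC to generate samples, is what is called Score-based Generative Modeling.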

Collectively, learning to represent a distribution as a score function and using it to generate samples through Markov Chain Monte Carlo techniques, such as Langevin dynamics, is known as Score-based Generative Modeling
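The sampling side can be sketched with unadjusted Langevin dynamics on a toy two-mode distribution. The score is computed in closed form instead of being learned, and the mode locations, spread, step size, and chain count are all assumptions picked for illustration.

```python
import math
import random

MODES = (-2.0, 2.0)   # centres of the two toy modes (illustrative values)
STD = 0.5             # shared standard deviation of both modes

def mixture_score(x):
    """Score of an equal-weight mixture of two 1D Gaussians."""
    weights = [math.exp(-0.5 * ((x - m) / STD) ** 2) for m in MODES]
    total = sum(weights)
    return sum(w * -(x - m) / STD ** 2 for w, m in zip(weights, MODES)) / total

def langevin_sample(rng, steps=1000, step_size=0.01):
    """x <- x + step * score(x) + sqrt(2 * step) * noise, repeated `steps` times."""
    x = rng.uniform(-4.0, 4.0)   # arbitrary starting point
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step_size * mixture_score(x) + math.sqrt(2.0 * step_size) * noise
    return x

rng = random.Random(0)
samples = [langevin_sample(rng) for _ in range(100)]

# Average distance from each sample to its nearest mode.
mean_distance_to_mode = sum(min(abs(s - m) for m in MODES) for s in samples) / len(samples)
```

Each chain follows the score uphill while the injected noise keeps it from collapsing onto a single point, so the samples gather around the two peaks.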

SDS

DreamFusion defines the training loss of the diffusion model as follows (w(t) is a weighting function; the rest matches the gradient descent step of training in Part 1):
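In the notation of the DreamFusion paper, this loss reads as below, where φ denotes the diffusion model's parameters, ε is the sampled Gaussian noise, and α_t and σ_t are the noise schedule coefficients at timestep t.

```latex
\mathcal{L}_{\text{Diff}}(\phi, x)
  = \mathbb{E}_{t \sim \mathcal{U}(0,1),\; \epsilon \sim \mathcal{N}(0, I)}
    \left[ w(t) \left\| \epsilon_\phi(\alpha_t x + \sigma_t \epsilon;\, t) - \epsilon \right\|_2^2 \right]
```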

Replacing the image x in the equation above with the DIP x = g(θ), our goal becomes minimizing the following loss:
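In symbols, with the diffusion model φ held fixed and only θ optimized, the objective can be written as follows (L_Diff denotes the diffusion training loss, following the DreamFusion paper's notation):

```latex
\theta^{*} = \arg\min_{\theta} \; \mathcal{L}_{\text{Diff}}\big(\phi,\; x = g(\theta)\big)
```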

However, experiments showed that this does not work well. Why? Taking the gradient of the loss first gives:
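Following the DreamFusion paper, the gradient splits into three labeled factors. Here ε̂_φ(z_t; y, t) is the noise predicted by the diffusion model for the noised image z_t = α_t x + σ_t ε under text prompt y, and the constant factor α_t coming from ∂z_t/∂x is absorbed into w(t).

```latex
\nabla_\theta \mathcal{L}_{\text{Diff}}\big(\phi,\; x = g(\theta)\big)
  = \mathbb{E}_{t, \epsilon} \Bigg[
      w(t)\,
      \underbrace{\big( \hat{\epsilon}_\phi(z_t; y, t) - \epsilon \big)}_{\text{Noise Residual}}
      \;
      \underbrace{\frac{\partial \hat{\epsilon}_\phi(z_t; y, t)}{\partial z_t}}_{\text{U-Net Jacobian}}
      \;
      \underbrace{\frac{\partial x}{\partial \theta}}_{\text{Generator Jacobian}}
    \Bigg]
```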

z_t is the image x after noise ε has been added at timestep t.

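DreamFusion notes that the U-Net Jacobian term is very expensive to compute, since it requires backpropagating through the diffusion U-Net, and that it is poorly conditioned at small noise levels. DreamFusion therefore drops this term entirely and defines what remains as the gradient of the SDS Loss: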
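In the paper's notation, the resulting gradient keeps only the noise residual and the generator Jacobian, with ε̂_φ(z_t; y, t) the predicted noise for the noised image z_t under text prompt y:

```latex
\nabla_\theta \mathcal{L}_{\text{SDS}}\big(\phi,\; x = g(\theta)\big)
  \triangleq \mathbb{E}_{t, \epsilon} \left[
      w(t)\, \big( \hat{\epsilon}_\phi(z_t; y, t) - \epsilon \big)\,
      \frac{\partial x}{\partial \theta}
    \right]
```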

From the perspective of score-based generative models, this amounts to seeking out a mode in data space, that is, a region of higher density. DreamFusion calls this method Score Distillation Sampling (SDS).

this loss perturbs x with a random amount of noise corresponding to the timestep t, and estimates an update direction that follows the score function of the diffusion model to move to a higher density region
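The update described in the quote can be sketched end to end on a toy problem. Everything below is an assumption made for illustration: the sigmoid DIP, the cosine noise schedule, the constant weighting w(t) = 1, and especially `toy_eps_hat`, a hypothetical closed-form stand-in for the pretrained diffusion U-Net (it is the ideal noise predictor when each pixel of the data follows a Gaussian around `mode`). No text conditioning is modeled.

```python
import math
import random

def g(theta):
    """Toy DIP: an element-wise sigmoid turns parameters into pixels in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-t)) for t in theta]

def toy_eps_hat(z_t, alpha, sigma, mode, spread):
    """Hypothetical stand-in for the diffusion U-Net.

    For data distributed as N(mode, spread^2), the ideal noise prediction
    has this closed form, so the sketch needs no trained network.
    """
    return sigma * (z_t - alpha * mode) / (alpha ** 2 * spread ** 2 + sigma ** 2)

def sds_step(theta, modes, rng, lr=0.05, spread=0.1):
    """One SDS update: theta <- theta - lr * w(t) * (eps_hat - eps) * dx/dtheta."""
    x = g(theta)
    t = rng.uniform(0.02, 0.98)            # random timestep
    alpha = math.cos(0.5 * math.pi * t)    # assumed cosine noise schedule
    sigma = math.sin(0.5 * math.pi * t)
    w = 1.0                                # assumed constant weighting w(t)
    updated = []
    for th, xi, mode in zip(theta, x, modes):
        eps = rng.gauss(0.0, 1.0)
        z_t = alpha * xi + sigma * eps     # perturb the rendered pixel with noise
        residual = toy_eps_hat(z_t, alpha, sigma, mode, spread) - eps
        dx_dtheta = xi * (1.0 - xi)        # generator Jacobian of the sigmoid
        # The U-Net Jacobian is deliberately left out, as SDS prescribes.
        updated.append(th - lr * w * residual * dx_dtheta)
    return updated

rng = random.Random(0)
modes = [0.2, 0.8, 0.5]    # high-density pixel values of the toy prior
theta = [0.0, 0.0, 0.0]    # start from a flat gray image
for _ in range(5000):
    theta = sds_step(theta, modes, rng)
image = g(theta)           # pixels drift toward the high-density values
```

Although each step uses a single noisy sample, the noise residual averages out to a pull toward the high-density region, so the rendered pixels settle near the toy prior's mode without ever differentiating through the noise predictor.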

If you enjoy my articles, please consider following 圖學玩家. Your reading and follows are what keep me sharing.
