How to Use Differential Diffusion for Better Inpainting in ComfyUI

Prompting Pixels
5 min read · Jun 24, 2024


After being asleep at the wheel for a couple of months, I discovered that differential diffusion had been introduced.

After playing with it for a couple of days, I am still blown away by the results.

This new state-of-the-art method allows changes to be made on a per-pixel basis rather than to an entire region at once.

Another way to put this in perhaps more layman’s terms (or at least the way I understand it): instead of masking out a hard-edged area to change when inpainting, differential diffusion lets you apply the mask as a gradient, where the lighter areas of the mask are more likely to change and the darker areas are less likely to change. This produces cleaner end results without any jagged edges.
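If it helps to see the intuition in code, here’s a minimal sketch of what a gradient mask means (just the blending idea, not how differential diffusion is actually implemented; the real method decides at each denoising step which pixels may still change):

```python
# Illustration only: a soft mask acts as a per-pixel "how much may this change" weight.
import numpy as np

def soft_composite(original: np.ndarray, generated: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """original/generated: float arrays in [0, 1]; mask: same shape, 0 = keep, 1 = change."""
    return (1.0 - mask) * original + mask * generated
```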

To quickly demonstrate the difference:

Here’s a side-by-side demonstration of what an inpainting mask looked like before (left) and what it’ll look like after using differential diffusion (right).

Old method on the left with a rigid black and white mask. New method introduces a soft gradient mask for better results.

While this doesn’t necessarily look groundbreaking, trust me on this: the results are super good.

Demonstrating the Process

As always, if you want to see how this all works, check out the Prompting Pixels website, where I have the workflow available to download.

Don’t worry; I’ll still be going over much of the same information in this article, but I think that page may give you a little better understanding.

There’s also a video on YouTube:

The Nodes

The workflow to set this up in ComfyUI is surprisingly simple. You’ll just need to incorporate, at minimum, three nodes (there’s a rough wiring sketch right after this list):

  • Gaussian Blur Mask
  • Differential Diffusion
  • Inpaint Model Conditioning
  • Convert Mask to Image (optional — helpful to review the mask)
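To make the wiring concrete, here’s a rough sketch of how these nodes could chain together, written as a ComfyUI API-format prompt in a Python dict. The Differential Diffusion, Inpaint Model Conditioning, and surrounding loader/sampler nodes use ComfyUI’s built-in class names, but the mask-blur node’s class name and inputs differ between node packs, so treat `GaussianBlurMask` and its parameters as placeholders:

```python
# Rough wiring sketch (ComfyUI API format). Node IDs, file names, and the mask-blur
# node's class/inputs are placeholders; adjust them to the nodes you have installed.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "your_model.safetensors"}},
    "2": {"class_type": "LoadImage",                       # outputs: IMAGE (0), MASK (1)
          "inputs": {"image": "portrait.png"}},
    "3": {"class_type": "GaussianBlurMask",                # placeholder: use your mask-blur node
          "inputs": {"mask": ["2", 1], "kernel_size": 30, "sigma": 10.0}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "a woman wearing glasses"}},
    "5": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality"}},
    "6": {"class_type": "DifferentialDiffusion",           # patches the model for soft-mask inpainting
          "inputs": {"model": ["1", 0]}},
    "7": {"class_type": "InpaintModelConditioning",        # outputs: positive (0), negative (1), latent (2)
          "inputs": {"positive": ["4", 0], "negative": ["5", 0], "vae": ["1", 2],
                     "pixels": ["2", 0], "mask": ["3", 0]}},
    "8": {"class_type": "KSampler",
          "inputs": {"model": ["6", 0], "positive": ["7", 0], "negative": ["7", 1],
                     "latent_image": ["7", 2], "seed": 42, "steps": 30, "cfg": 7.0,
                     "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 0.7}},
    "9": {"class_type": "VAEDecode",
          "inputs": {"samples": ["8", 0], "vae": ["1", 2]}},
    "10": {"class_type": "SaveImage",
           "inputs": {"images": ["9", 0], "filename_prefix": "diff_diffusion"}},
}
```

Each `[node_id, output_index]` pair is just a wire in the graph editor, and the optional Convert Mask to Image node would simply tap off the blurred mask so you can preview it.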

The Workflow

In this short demonstration, we’ll be using a standard inpainting workflow. This means an image will be provided as an input, and then we’ll modify it through mask editing and prompting.

So for the first half of the workflow:

We’ll set up our Load Image node, where I’ll put in the original image I want to adapt. In this case, I am using this cropped image from Unsplash.

Then, by right-clicking on the image, I’ll open up the Mask Editor to select the area I want to modify:

In this example, my goal is to have her wearing glasses. So, with that in mind, I’ll mask out the appropriate area.

Once masked, you’ll put the Mask output from the Load Image node into the Gaussian Blur Mask node. This node applies a gradient to the selected mask.

Think of the kernel_size as effectively the size of the brush (larger value = bigger brush), whereas the sigma is the strength of the softness (larger value = softer gradient).
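If you’re curious what that actually does to the mask, here’s a small sketch using OpenCV (my own illustration; not necessarily how the node implements it):

```python
# Illustration of kernel_size/sigma using OpenCV's Gaussian blur (not the node's own code).
import cv2
import numpy as np

mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0

kernel_size = 31   # like the node's kernel_size: how far the softening reaches (odd number for cv2)
sigma = 10.0       # like the node's sigma: how gradual the falloff is

soft_mask = cv2.GaussianBlur(mask, (kernel_size, kernel_size), sigma)
# The hard 0/1 edge becomes a smooth ramp: pixels near the edge take intermediate values,
# so they are only partially repainted during sampling.
```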

The workflow should look like this:

Pro Tip: The softer the gradient, the more of the surrounding area may change. So, don’t soften it too much if you want to retain the style of surrounding objects (i.e. in this example it would be the nose, mouth, etc.).

For the checkpoint, you’ll just use a standard generation checkpoint, not an inpainting checkpoint (more on this in a minute).

As for your prompt, it’s just like regular inpainting. The prompt should describe both what your image currently is and what you want it to be. So, instead of just “a woman”, it becomes “a woman wearing glasses”.

Second half of the workflow:

The second half is where things get interesting. Unlike traditional inpainting, where you must download a separate inpainting checkpoint, differential diffusion works well with standard generation checkpoints (it can still provide good results with an inpainting checkpoint as well).

Here’s what the second half of the workflow looks like:

However, you’ll need to condition the model by passing it through the Differential Diffusion node.

Additionally, your prompts, mask, etc. must also be passed through the Inpaint Model Conditioning node.

These are native to ComfyUI.

Pro Tip: If you don't see them, make sure you update ComfyUI so that these nodes load.
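For the curious, here’s one way to picture what the Differential Diffusion node roughly does at each sampling step (a conceptual sketch based on my understanding, not ComfyUI’s actual source):

```python
# Conceptual sketch: the soft mask gets re-thresholded at every sampling step.
import numpy as np

def pixels_allowed_to_change(soft_mask: np.ndarray, step: int, total_steps: int) -> np.ndarray:
    """soft_mask: floats in [0, 1], where 1 = change freely and 0 = keep the original."""
    threshold = 1.0 - step / max(total_steps - 1, 1)   # starts near 1, falls toward 0
    return (soft_mask >= threshold).astype(np.float32)

# Early, high-noise steps: only the brightest mask pixels clear the threshold, so they get
# repainted from scratch (big structural changes). Later, low-noise steps: grayer pixels are
# unlocked too, but only fine details are left to decide, so they change just a little.
# Near-black pixels never clear the threshold and stay untouched, which is why the result
# has no hard seam.
```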

Lastly, for the KSampler, you will just set the values that work best for your checkpoint.

The one notable change you'll want to look at here is the denoise value in the KSampler node. Essentially, denoise controls how much noise is added to the masked region before it is re-generated. If denoise = 1, the model will draw completely new results and have no knowledge of what is under the pure white areas of the mask; at denoise = 0, the image won't change at all.

Typically, somewhere in the range of 0.6 to 0.8 is good for inpainting, as the final result will change according to your prompt while still retaining some knowledge of the information underneath the mask.
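As a rough mental model of what denoise controls (a deliberate simplification; ComfyUI’s samplers use real sigma schedules, not this linear toy):

```python
# Toy illustration: denoise sets how far back up the noise schedule the masked area is
# pushed before being re-generated.
steps = 30
denoise = 0.7

# Noise levels the sampler walks back down through, from the starting level to 0.
noise_levels = [denoise * (1 - i / steps) for i in range(steps + 1)]

print(f"starting noise level: {noise_levels[0]:.2f} "
      "(1.00 would mean a full repaint with no memory of the original pixels)")
```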

Once all set, just hit that Queue Prompt button and review the results.

Here’s a look at our before and after:

Pretty cool, and incredibly consistent between the cheeks and glasses; you’d never know this was effectively two different images. Notice that subtle details like her eyebrows and nose ring changed slightly. This could be reduced by being more selective with the mask and reducing the sigma value in the Gaussian Blur Mask node.

However, admittedly, the top line of the glasses where the eyelashes meet is a little strange-looking, but nothing that couldn’t be cleaned up after an iteration or two.

So, there you have it, a cool new way to do better inpainting.

Want to learn more about diffusion models? Check out the Prompting Pixels website where we have in-depth tutorials and videos.

