Removing Any Object From Your Photo with LaMa

By Andrew, published in GliaCloud on Dec 13, 2021

Earlier this year, Samsung Research published a paper on high-resolution image inpainting:

LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions

What is Inpainting?
“It’s the process of reconstructing missing parts of an image so that observers are unable to tell that these regions have undergone restoration, often used to remove unwanted objects from an image or to restore damaged portions of old photos.”
Source: Image Inpainting: Humans vs. AI by Mikhail Erofeev

What’s Special?

LaMa isn’t the first-ever paper on image inpainting. Some previous works include CGPR (Hukkelas et al., 2020), Mask-Aware Dynamic Filtering (MADF) (Zhu et al., 2021), AOT-GAN (Zeng et al., 2021), and many more.

So, what’s special about LaMa?

Easy and Fast

LaMa is fast! It saves you the time you would otherwise spend searching for “photoshop inpainting tutorial” on Google.

We measured LaMa’s inference time on HD images in which the painted area covered 7–10% of the image on average (20% at most). We repeated each experiment 5 times and report the average.

Given an image, we were able to remove the painted object in 2 seconds with a GPU.

Experiment Results on LaMa’s Inference Time
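For readers who want to repeat this kind of measurement, here is a minimal sketch of the procedure described above: measure how much of the image the mask covers, run the model several times, and average the wall-clock time. It is not the script we used. `fake_inpaint` is a hypothetical stand-in for a real LaMa call, and `mask_coverage` and `average_runtime` are names made up for illustration.

```python
import statistics
import time


def mask_coverage(mask):
    """Return the fraction of pixels marked for inpainting (non-zero) in a 2D mask."""
    total = sum(len(row) for row in mask)
    painted = sum(1 for row in mask for value in row if value)
    return painted / total


def average_runtime(inpaint_fn, image, mask, repeats=5):
    """Call inpaint_fn(image, mask) `repeats` times and return the mean wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        inpaint_fn(image, mask)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def fake_inpaint(image, mask):
    """Hypothetical stand-in for a real model call: blanks out the masked pixels."""
    return [[0 if m else px for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


if __name__ == "__main__":
    # Toy 10 x 10 grayscale image with a 2 x 4 painted rectangle.
    image = [[128] * 10 for _ in range(10)]
    mask = [[1 if 2 <= x < 4 and 3 <= y < 7 else 0 for x in range(10)]
            for y in range(10)]
    print(f"mask coverage: {mask_coverage(mask):.0%}")
    mean_seconds = average_runtime(fake_inpaint, image, mask, repeats=5)
    print(f"mean time over 5 runs: {mean_seconds:.6f} s")
```

When timing a real model on a GPU, keep in mind that GPU work is typically launched asynchronously, so the timer should only be stopped once the device has finished its work.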

What if I have only CPUs?
Don’t worry. LaMa also works in CPU-only environments. It took around 25 seconds to inpaint an image on our hardware.

Here are the hardware specs for testing:

GPU: NVIDIA Tesla V100 SXM2 single core
GPU Memory: 30 GB
Memory: 60 GB
CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
Num. of CPU: 36

Works on High-Quality Photos

In the paper, the authors tested the model on photos of 640 * 512 and 1920 * 1536, and both showed impressive inpainting results.

We tested the model even further with images of 1920 * 1280 (HD), 2048 * 1080 (2K), and 3840 * 2160 (4K). The results for HD and 2K are as good as the authors claimed.

Note that LaMa is trained only on images of 256 * 256. This indicates the high generalizability of the model on images of different resolutions.

However, for 4K images, we have yet to generate outputs from the model successfully, as the Google Colab instance used for testing kept crashing.
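To put these resolutions in perspective, the sketch below compares the pixel count of each test size against a 256 * 256 training image. It also shows one possible workaround that we did not test: shrinking an image to fit a pixel budget before inpainting. It assumes the 4K crashes are related to image size, which we did not diagnose. `pixel_ratio` and `fit_to_budget` are hypothetical helper names, not part of LaMa.

```python
import math

TRAIN_SIDE = 256  # training resolution mentioned above

RESOLUTIONS = {
    "HD": (1920, 1280),
    "2K": (2048, 1080),
    "4K": (3840, 2160),
}


def pixel_ratio(width, height, base_side=TRAIN_SIDE):
    """How many times more pixels an image has than a base_side x base_side image."""
    return (width * height) / (base_side * base_side)


def fit_to_budget(width, height, max_pixels):
    """Scale (width, height) down, keeping the aspect ratio, to at most max_pixels."""
    area = width * height
    if area <= max_pixels:
        return width, height
    scale = math.sqrt(max_pixels / area)
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        print(f"{name}: {w * h:,} pixels, {pixel_ratio(w, h):.1f}x a training image")
    # Assumption: use the HD pixel count, which worked in our tests, as the budget.
    hd_budget = 1920 * 1280
    print("4K resized to the HD budget:", fit_to_budget(3840, 2160, hd_budget))
```

A 4K image holds 3.375 times as many pixels as our HD test size, which may explain why it was the only size that failed in our environment.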

Open Source

Another plus about LaMa is that its code is entirely open source! It’s released under the Apache License 2.0, meaning you’re free to modify it, distribute it, and use it commercially as long as you abide by the terms.

GitHub: saic-mdal/lama (github.com), the official implementation by Samsung Research

Third-party Application

Thanks to the Apache License, many people have already built their own applications or demos on top of LaMa:

Cleanup Picture by Cyril Diagne
lama-cleaner by Sanster on GitHub
Integration into Hugging Face Spaces with Gradio by AK391 on GitHub
Telegram bot MagicEraserBot (code) by Moldoteck on GitHub

You too can easily build an application with LaMa.

How’s the Quality?

Pros

Best working scenario: small-area inpainting where the colors of the surroundings are relatively consistent and contain no text.
Performs well irrespective of the hue, saturation, and brightness of the surroundings.
Works better when the inpainted object is surrounded by a single pattern: single color, linear color gradient, parallel lines, grids, etc.
Even if the backgrounds are of different colors on different sides of the inpainted object, LaMa is able to smoothly bridge the color difference with gradients.

Cons

When painting an object for removal, you can’t just paint exactly up until the edges of the object. Parts of the background need to be included in order to yield good inpainting results.
Refrain from having an inpainted area with holes of uncovered sub-areas. The colors and patterns of these sub-areas could interfere with those of the background as the model fills the inpainted area.
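Both tips above can be applied automatically before the mask is handed to the model. The following is a minimal sketch, assuming the mask is a 2D grid of 0s and 1s where 1 means "painted". `dilate` and `fill_holes` are illustrative helpers we wrote for this post, not part of LaMa; image libraries offer equivalent morphology operations.

```python
from collections import deque


def dilate(mask, radius=1):
    """Grow the painted region by `radius` pixels in every direction (square neighborhood)."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    out[ny][nx] = 1
    return out


def fill_holes(mask):
    """Paint over unpainted pockets that are fully enclosed by painted pixels."""
    height, width = len(mask), len(mask[0])
    reachable = [[False] * width for _ in range(height)]
    queue = deque()
    # Seed the flood fill with every unpainted pixel on the image border.
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and not mask[y][x]:
                reachable[y][x] = True
                queue.append((y, x))
    # Spread through 4-connected unpainted pixels; anything never reached is a hole.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < height and 0 <= nx < width
                    and not mask[ny][nx] and not reachable[ny][nx]):
                reachable[ny][nx] = True
                queue.append((ny, nx))
    return [[1 if mask[y][x] or not reachable[y][x] else 0 for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    ring = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    for row in fill_holes(ring):
        print(row)
```

Filling holes first and dilating afterwards keeps the added margin on the outside of the object, where the background actually is. The right radius depends on the image size, so treat the default of 1 as a placeholder.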

Case Studies

Here are a few case studies on the quality of LaMa’s outputs. For each case, from left to right, we have (1) the original image, (2) the image with the target object painted, and (3) the resulting image with the target object removed.

Case 1–1: Removing a couple at dusk

We didn’t entirely cover the shadows of the couple with our paint, so LaMa not only kept but extended them as part of the inpainting process.
Similarly, the pedestrian in the grey down jacket wasn’t entirely covered, so it was also extended to fill the inpainted area.
The lights near the ground were slightly distorted but it’s indiscernible if we view the photo at its original scale.

Case 1–2: Removing a couple at dusk (w/ shadows covered)

We painted over the shadows entirely, so LaMa removed them along with the couple, leaving little to no trace.
In order to cover the shadows, a larger proportion of the bench was also included, causing slight distortions to the surface. On the other hand, the lights close to the ground were preserved much better than in the previous test case.

Case 2–1: Removing a man in a group photo

The background was too complex (it contains high-contrast text) for LaMa to inpaint convincingly.
A large proportion of the inpainted area was filled with grey, the color of the suit worn by the man removed, likely because we only painted up to the edges of the suit instead of stretching it out a little more. While the high contrast between the grey of the removed object and the black of the surroundings could also be a factor, we have seen LaMa perform quite well in such a scenario through Case 1–1 and Case 1–2.

Case 2–2: Removing a man in a group photo (w/ more background included)

On the 1st try, we included more of the background around the man in the grey suit in our paint. However, the holes left in the painted area retained some of the text on the backdrop.
The 2nd try was more successful: very little incomplete text was retained, and the colors were filled in rather coherently.

Case 3: Removing a man to keep just the scene in the background

The tower crane in the background was recovered fairly well. LaMa was also able to capture the window patterns of the buildings, despite some shadows.
The repair is indiscernible if we view the photo at its original scale.

Find Out More

If you’d like to learn more about LaMa, make sure to check out these resources:

Reference

Hakon Hukkelas, Frank Lindseth, and Rudolf Mester. 2020. Image Inpainting with Learnable Feature Imputation. CoRR abs/2011.01077 (2020). Retrieved from https://arxiv.org/abs/2011.01077
Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding, and Zhaoxiang Zhang. 2021. Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness. CoRR abs/2104.13743 (2021). Retrieved from https://arxiv.org/abs/2104.13743
Roman Suvorov, Elizaveta Logacheva, Anton Mashikhin, Anastasia Remizova, Arsenii Ashukha, Aleksei Silvestrov, Naejin Kong, Harshith Goka, Kiwoong Park, and Victor Lempitsky. 2021. Resolution-robust Large Mask Inpainting with Fourier Convolutions. CoRR abs/2109.07161 (2021). Retrieved from https://arxiv.org/abs/2109.07161
Yanhong Zeng, Jianlong Fu, Hongyang Chao, and Baining Guo. 2021. Aggregated Contextual Transformations for High-Resolution Image Inpainting. CoRR abs/2104.01431 (2021). Retrieved from https://arxiv.org/abs/2104.01431