Adding Objects to Image in Python

A guide to overlaying a small image on a larger one using Python, OpenCV and NumPy

Alex P
7 min read · Dec 12, 2021

In this article I’ll show how to add an object from one image to another image. To do this, we will need:

  • a background image, into which the object will be inserted;
  • an image of the object;
  • an image with the mask of the object (the object area is black, the rest is white).

In our case the background is a photo of the sea and the object is a cup of coffee. Here they are:

Download them: background.jpg, cup.png, cup_mask.png.

Update. You can also check this short video tutorial to see how to create a mask of the object in Photoshop.

Soon you’ll find out how to obtain the following results:

Let’s start!

Basic concepts of digital images

First, I want to remind you that images are stored as 3D arrays of integer numbers from 0 to 255.

The shape of these arrays is height x width x channels.

There are three channels: Red, Green and Blue. This is called RGB format.

Thus, every image consists of height x width pixels. Each pixel has three components. The numerical values of these components range from 0 to 255 and represent the brightness values of red, green and blue colors. The higher the number, the brighter the color.

For example:

  • pixel [255, 0, 0] is red, because red has maximum brightness while green and blue have zero brightness;
  • pixel [255, 255, 255] is white, which is a mix of red, green and blue at maximum brightness;
  • pixel [0, 0, 0] is black, because all three colors have zero brightness.
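To make this concrete, here is a minimal NumPy sketch that builds a tiny 2 x 3 image by hand. The array name `img` and its size are only illustrative.

```python
import numpy as np

# A tiny RGB image: height=2, width=3, channels=3, all pixels black
img = np.zeros((2, 3, 3), dtype=np.uint8)

img[0, 0] = [255, 0, 0]      # top left pixel: pure red
img[0, 1] = [255, 255, 255]  # next pixel in the top row: white
# img[1, 2] is left untouched, so it stays black

print(img.shape)   # (height, width, channels)
print(img[0, 0])   # the three components of the red pixel
```

Indexing is always row (y) first, then column (x), which is why the shape reads height x width x channels.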

1. Imports

Now, create a new notebook in Jupyter Notebook. First, we need to import the necessary modules:

2. Reading and displaying images with OpenCV

Let’s open the images with the cv2.imread() function and display them.

Note! OpenCV reads images in BGR format, i.e. the Blue and Red channels are swapped compared with RGB. We need to convert BGR to RGB with the cv2.cvtColor() function.
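For a 3-channel image, converting BGR to RGB amounts to reversing the order of the channel axis. The sketch below shows the idea with plain NumPy on a hand-made pixel; the array `bgr` is only a stand-in for what cv2.imread() would return. In the notebook itself you would call cv2.cvtColor(img, cv2.COLOR_BGR2RGB).

```python
import numpy as np

# Stand-in for an image returned by cv2.imread():
# a single pure-blue pixel stored in B, G, R order
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reverse the channel axis: B, G, R -> R, G, B
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])  # blue now sits in the last component
```

If this step is skipped, a library that expects RGB will show the image with red and blue exchanged.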

Output:

Background shape: (1280, 1920, 3)
Image shape: (860, 1151, 3)
Mask shape: (860, 1151, 3)

We see that the background image has height=1280 and width=1920, and the object image has height=860 and width=1151.

Let’s look at the images:

Output:

Important! To better understand the further manipulations with images & numpy arrays, you can read the article “Replacing Part of 2D Array with Another 2D Array in Numpy”.

3. Removing background from the image of the object

Now we will define a function that converts the mask of the object to a boolean array, or boolean mask.

On the original mask, the object area is filled with black and the background area is filled with white.

The boolean mask has the same height and width as the original mask, but only one channel. If a pixel belongs to the object area, its value is True, otherwise False.

The boolean mask will help us remove all background pixels.
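A minimal sketch of this step, assuming the image and the mask are already loaded as RGB arrays of the same height and width. The function name `remove_background` and the threshold of 128 are illustrative choices, not part of any library. Thresholding (rather than testing for exactly 0) is meant to tolerate slightly grey pixels on the mask edges.

```python
import numpy as np

def remove_background(img, mask, threshold=128):
    """Return (image with background zeroed, boolean mask).

    img  : RGB image, shape (h, w, 3)
    mask : mask image, shape (h, w, 3), object black, background white
    """
    # One channel is enough: dark pixels belong to the object
    mask_boolean = mask[:, :, 0] < threshold
    # Broadcast the (h, w) mask over the 3 color channels
    img_no_bg = img * mask_boolean[:, :, np.newaxis]
    return img_no_bg, mask_boolean

# Tiny demo: 2 x 2 grey image, the object occupies the left column
img = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.full((2, 2, 3), 255, dtype=np.uint8)
mask[:, 0] = 0

img_no_bg, mask_boolean = remove_background(img, mask)
print(mask_boolean.shape)   # one channel only
print(img_no_bg[:, :, 0])   # background pixels become 0
```

Multiplying by the boolean mask keeps object pixels unchanged and turns every background pixel black.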

Output:

Image with removed background shape: (860, 1151, 3)
Boolean mask shape: (860, 1151)

4. Adding object to background image

Before we define the function that adds the object to the background image, I need to explain and visualize several cases of image overlap.

Let’s say, the background image has height h_background and width w_background, the object image has height h and width w.

h should be less than h_background, and w should be less than w_background.

Case 1). If we place the object into the middle of the background, then everything is simple: the part of the background area of size h x w should be replaced with the object.

Object is placed in the middle of the background. Overlapping area here is h x w

Case 2). If we place the object into the top left corner of the background, then part of the object may be outside of the background area. In this case the part of the background area of size (h - y) x (w - x) should be replaced with the object.

Here -x and -y are the coordinates of the top left corner of the object image. The ‘-’ sign is there because the top left corner of the background image has coordinates x=0 and y=0. Everything to the left of that corner has a negative x coordinate, and everything above it has a negative y coordinate.

Object placed into the top left corner of the background. Overlapping area here is (h - y) x (w - x).

Case 3). If we place the object into the bottom left corner of the background, then part of the object may be outside of the background area. In that case the part of the background area of size (h_background - y) x (w - x) should be replaced with the object. Here the top left corner of the object image has coordinates (-x, y): x is negative as in case 2, but y is positive.

Generally, the area can be calculated as (h - max(0, y + h - h_background)) x (w - x), because if the bottom border of the object image is above the bottom border of the background image, then an area of h x (w - x) should be replaced with the object.

Object placed into the bottom left corner of the background. Overlapping area here is (h_background - y) x (w - x). General formula for overlapping area here is (h - max(0, y + h - h_background)) x (w - x).
Overlapping area here is h x (w - x). General formula for overlapping area here is (h - max(0, y + h - h_background)) x (w - x).

Case 4). If we place the object into the top right corner of the background, then part of the object may be outside of the background area. In that case the part of the background area of size (h - y) x (w_background - x) should be replaced with the object. Here the top left corner of the object image has coordinates (x, -y): x is positive and y is negative.

Generally, the area can be calculated as (h - y) x (w - max(0, x + w - w_background)), because if the right border of the object image is to the left of the right border of the background image, then an area of (h - y) x w should be replaced with the object.

Object placed into the top right corner of the background. Overlapping area here is (h - y) x (w_background - x). General formula for overlapping area here is (h - y) x (w - max(0, x + w - w_background)).
In this case overlapping area is (h - y) x w. General formula for overlapping area here is (h - y) x (w - max(0, x + w - w_background)).

Case 5). If we place the object into the bottom right corner of the background, then part of the object may be outside of the background area. In that case the part of the background area of size (h_background - y) x (w_background - x) should be replaced with the object. Here both coordinates of the top left corner of the object image, x and y, are positive.

Generally, the area can be calculated as (h - max(0, y + h - h_background)) x (w - max(0, x + w - w_background)), because if the right border of the object image is to the left of the right border of the background image, and the bottom border of the object image is above the bottom border of the background image, then an area of h x w should be replaced with the object.

Object placed into the bottom right corner of the background. Overlapping area here is (h_background - y) x (w_background - x). General formula for overlapping area here is (h - max(0, y + h - h_background)) x (w - max(0, x + w - w_background)).
In this case the overlapping area is h x w. General formula for overlapping area here is (h - max(0, y + h - h_background)) x (w - max(0, x + w - w_background)).
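The five cases can be folded into one calculation if the top left corner of the object is given as signed coordinates (negative when it lies outside the background). The helper below is a hypothetical sketch, not a library function; `x0` and `y0` are names introduced here for that signed corner.

```python
def overlap_size(h, w, h_bg, w_bg, x0, y0):
    """Height and width of the part of an h x w object that lies inside
    an h_bg x w_bg background, when the object's top left corner is at
    (x0, y0). Negative x0 / y0 mean the corner is left of / above the
    background."""
    top, left = max(0, y0), max(0, x0)
    bottom, right = min(h_bg, y0 + h), min(w_bg, x0 + w)
    # max(0, ...) covers an object that misses the background entirely
    return max(0, bottom - top), max(0, right - left)

# Sizes from this article: object 860 x 1151, background 1280 x 1920
print(overlap_size(860, 1151, 1280, 1920, 400, 200))    # case 1 -> (860, 1151)
print(overlap_size(860, 1151, 1280, 1920, -100, -50))   # case 2 -> (810, 1051)
print(overlap_size(860, 1151, 1280, 1920, 1000, 600))   # case 5 -> (680, 920)
```

The corner positions in the three calls are made-up values chosen only to hit different cases.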

Now, taking into account all the cases described above, let’s define the function:

Besides passing the background, object and mask images to the function, we pass coordinates x and y, which define where the center of the object will be placed.

Coordinate (0, 0) is the top left corner of the background.

w_bg and h_bg are the width and height of the background.

x and y should meet the following conditions: 0 < x < w_bg and 0 < y < h_bg.
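A function along these lines could be sketched as follows. This is one possible implementation under stated assumptions, not necessarily the exact code used to produce the results in this article: it assumes `mask` is the boolean mask from step 3 (True on the object), and it handles all overlap cases at once by clipping the object's rectangle to the background instead of branching per case.

```python
import numpy as np

def add_obj(background, img, mask, x, y):
    """Paste `img` onto a copy of `background`, centered at (x, y).

    background : RGB image, shape (h_bg, w_bg, 3)
    img        : RGB image of the object, shape (h, w, 3)
    mask       : boolean array, shape (h, w), True where the object is
    """
    bg = background.copy()
    h_bg, w_bg = bg.shape[:2]
    h, w = img.shape[:2]

    # Top left corner of the object in background coordinates (may be negative)
    x0, y0 = x - w // 2, y - h // 2

    # Overlapping rectangle in background coordinates
    top, left = max(0, y0), max(0, x0)
    bottom, right = min(h_bg, y0 + h), min(w_bg, x0 + w)
    if bottom <= top or right <= left:
        return bg  # the object lies completely outside the background

    # The same rectangle expressed in object coordinates
    img_part = img[top - y0:bottom - y0, left - x0:right - x0]
    mask_part = mask[top - y0:bottom - y0, left - x0:right - x0]

    # Copy only the object pixels; background pixels of `img` are skipped
    region = bg[top:bottom, left:right]  # a view into bg
    region[mask_part] = img_part[mask_part]
    return bg

# Tiny demo: 6 x 6 black background, 4 x 4 white object centered at (1, 1),
# so the object sticks out past the top left corner
background = np.zeros((6, 6, 3), dtype=np.uint8)
obj = np.full((4, 4, 3), 255, dtype=np.uint8)
obj_mask = np.ones((4, 4), dtype=bool)
obj_mask[3, 3] = False  # pretend the bottom right pixel is not part of the object

result = add_obj(background, obj, obj_mask, x=1, y=1)
print(result[:, :, 0])
```

Working on a copy keeps the original background intact, so the same background can be reused for several placements.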

Let’s see how the function works.

Example 1). Let’s place the cup in the center of the background. The width of the background is 1920 and the height is 1280, so the coordinates of the object’s center are x=1920/2=960 and y=1280/2=640.

Output:

Example 2). Let’s place the cup in the bottom left of the background. This time the coordinates of the object’s center are x=200 and y=1100.

Output:

Example 3). Let’s place the cup in the bottom right of the background. This time the coordinates of the object’s center are x=1800 and y=1100.

Output:

Example 4). Let’s place the cup in the top left of the background. This time the coordinates of the object’s center are x=200 and y=200.

Output:

Example 5). Let’s place the cup in the top right of the background. This time the coordinates of the object’s center are x=1800 and y=200.

Output:

Alex P

Machine learning engineer, computer vision enthusiast