Adjusting Image Orientation with Perspective Transformations Using OpenCV

siromer
Published in The Deep Hub
4 min read · Jun 9, 2024

Aligning Images with OpenCV

In computer vision tasks, images are often not ready to process as they are. Imagine you want to extract text from pages, but the pages were photographed at different angles: some are level, and others are rotated by varying amounts. In a situation like this, you cannot process the images reliably.

  • You need to adjust the image orientation, and a Perspective Transformation lets you produce a straightened image that is easier to work with.
  • A few weeks ago, I created a simple program that extracts text from pages (link). Most of the time, the images are not perfectly aligned, so they cannot be used directly. In this article, I will show you how to align images by using Perspective Transformations.
Perspective Transformation of Chess Board
  • I will use Python as the programming language.

Example Usage of Perspective Transformations: Chess Board

Nowadays, I am working on a chess project (if you are interested: article1, article2, article3), and I need to extract the chess pieces and the squares they stand on (like a1, f4). I tried different approaches for extracting the locations of the squares, and one of them is Perspective Transformations. Just look at the image on the left side: it is very hard to detect the locations of the squares accurately without processing this image first.

  • By applying a Perspective Transformation to the image, I obtained the image on the right side. After the transformation, the positions of the squares can be extracted efficiently.

How to use Perspective Transformations with OpenCV?

With OpenCV, applying a perspective transformation to a part of an image is relatively easy. You just need to find the coordinates of the corners of that part.

  • You need to choose 4 coordinates, and you can write a simple GUI for choosing them with your mouse. I created a similar GUI that you can check out (link1: there I draw a rectangle on an image; you can easily print the coordinates of that rectangle and use those points). Alternatively, you can choose your points by looking at the image and reading off the coordinates that fit best.

There are 2 functions for using Perspective Transformations in OpenCV:

  • cv2.getPerspectiveTransform: This function computes the 3x3 perspective transformation matrix M from four source points and the four destination points they should map to.
  • cv2.warpPerspective: This function applies the perspective transformation matrix M to an image and returns the warped image.

For parameters and more information, you can check the official documentation.
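To get a feeling for what the matrix M represents, here is a small NumPy sketch of the underlying math. It is only an illustration, not OpenCV's actual implementation: the helper names `solve_perspective` and `apply_perspective` and the sample points are made up for this example. Four point pairs give eight linear equations, which is exactly enough to solve for the eight unknown entries of M (the ninth is fixed to 1).

```python
import numpy as np


def solve_perspective(src, dst):
    """Solve for the 3x3 matrix M that maps each src point to its dst point."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (a*x + b*y + c) / (g*x + h*y + 1), rearranged to be linear in a..h
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        # v = (d*x + e*y + f) / (g*x + h*y + 1)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    params = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(params, 1.0).reshape(3, 3)


def apply_perspective(M, point):
    """Map one (x, y) point through M, including the homogeneous division."""
    x, y, w = M @ np.array([point[0], point[1], 1.0])
    return x / w, y / w


# A skewed quadrilateral mapped onto a 100 x 100 square (made-up sample points).
src = [(10, 20), (5, 110), (120, 30), (105, 125)]
dst = [(0, 0), (0, 100), (100, 0), (100, 100)]
M = solve_perspective(src, dst)
print(apply_perspective(M, src[3]))  # approximately (100.0, 100.0)
```

The division by `w` is what separates a perspective transformation from an affine one: it lets parallel lines in the source converge or diverge in the output.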

CODE | Aligning book pages

  • Step 1 : Read the image
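import cv2
import numpy as np
import matplotlib.pyplot as plt
# OpenCV loads images in BGR order; convert to RGB so matplotlib shows correct colors
image = cv2.imread(r"C:\Users\sirom\Downloads\Perspective-Chess\1.jpeg")
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.imshow(rgb_image)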
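  • Step 2: Find four points that surround the target part of the image. In the image above, I marked the 4 points that surround my target area.

# points are [x, y] in pixels
pt1 = [520, 300]    # top-left
pt2 = [300, 1100]   # bottom-left
pt3 = [1100, 440]   # top-right
pt4 = [870, 1250]   # bottom-right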

IMPORTANT NOTE: If you want to do exactly the same thing as me, you need to list your points in the same order (look at the image above and follow the same order).
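If your points come in an arbitrary order (for example from mouse clicks), a small helper can sort them first. The sketch below is a common heuristic, not part of the original program: `order_corners` is a hypothetical name, and it assumes the order suggested by the coordinates above (top-left, bottom-left, top-right, bottom-right) and a quadrilateral that is only moderately rotated.

```python
def order_corners(points):
    """Return points as [top-left, bottom-left, top-right, bottom-right]."""
    top_left = min(points, key=lambda p: p[0] + p[1])      # smallest x + y
    bottom_right = max(points, key=lambda p: p[0] + p[1])  # largest x + y
    top_right = min(points, key=lambda p: p[1] - p[0])     # smallest y - x
    bottom_left = max(points, key=lambda p: p[1] - p[0])   # largest y - x
    return [top_left, bottom_left, top_right, bottom_right]


clicked = [[870, 1250], [520, 300], [1100, 440], [300, 1100]]  # arbitrary order
pt1, pt2, pt3, pt4 = order_corners(clicked)
print(pt1, pt2, pt3, pt4)  # [520, 300] [300, 1100] [1100, 440] [870, 1250]
```

For shapes rotated by close to 45 degrees the sums and differences can tie or swap, so in that case it is safer to pick the order by hand.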

  • Step 3: Find the maximum height and maximum width of the area of interest
# calculating the distance between points (Pythagorean theorem)

height_1 = np.sqrt(((pt1[0] - pt2[0]) ** 2) + ((pt1[1] - pt2[1]) ** 2))
height_2 = np.sqrt(((pt3[0] - pt4[0]) ** 2) + ((pt3[1] - pt4[1]) ** 2))

width_1 = np.sqrt(((pt1[0] - pt3[0]) ** 2) + ((pt1[1] - pt3[1]) ** 2))
width_2 = np.sqrt(((pt2[0] - pt4[0]) ** 2) + ((pt2[1] - pt4[1]) ** 2))

max_height=max(int(height_1), int(height_2))
max_width = max(int(width_1), int(width_2))

print(max_height,max_width) # --> 842 596 in my case
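The same distances can also be computed with the standard library. This is just an equivalent sketch, assuming Python 3.8+ for `math.dist`; `quad_size` is a hypothetical helper name, and it returns the size as (width, height), the order OpenCV uses for image sizes.

```python
import math


def quad_size(pt1, pt2, pt3, pt4):
    """Return (max_width, max_height) for corners ordered TL, BL, TR, BR."""
    # Left and right edges give the height; top and bottom edges give the width.
    max_height = max(int(math.dist(pt1, pt2)), int(math.dist(pt3, pt4)))
    max_width = max(int(math.dist(pt1, pt3)), int(math.dist(pt2, pt4)))
    return max_width, max_height


print(quad_size([520, 300], [300, 1100], [1100, 440], [870, 1250]))  # (596, 842)
```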
  • Step 4: Use the perspective transformation functions

→ cv2.getPerspectiveTransform(src, dst)

→ cv2.warpPerspective(src, M, dsize)

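# four input points: top-left, bottom-left, top-right, bottom-right
input_pts = np.float32([pt1, pt2, pt3, pt4])

# output points for the new transformed image, in the same order.
# OpenCV points are (x, y), so x runs up to max_width and y up to max_height;
# swapping the two would stretch the page to the wrong aspect ratio.
output_pts = np.float32([[0, 0],
                         [0, max_height],
                         [max_width, 0],
                         [max_width, max_height]])

# Compute the perspective transform M
M = cv2.getPerspectiveTransform(input_pts, output_pts)

# dsize is (width, height)
out = cv2.warpPerspective(rgb_image, M, (max_width, max_height), flags=cv2.INTER_LINEAR)

plt.imshow(out)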
Output

Look at the image: the page is no longer tilted to the right or left, and it is ready to use in other applications.

Result
