Getting Started With Image Processing in OpenCV and Python

Maham Arif
Udacity Intel Edge AI Scholars
Jan 14, 2020

OpenCV (Open Source Computer Vision Library) is one of the most widely used libraries for image processing and computer vision. It contains more than 2500 optimized algorithms that can be used to detect and recognize faces, identify objects, classify human actions in videos, track moving objects, extract 3D models of objects, stitch images together, and more.

In this article, we will learn some basic image operations in OpenCV using Python to help you get started with image processing and computer vision.

Converting Between Color Spaces:

In image processing, a color space is a specific way of organizing colors. It is built on a color model, a mathematical model that represents each pixel value as a tuple of numbers, such as (B, G, R) or (H, S, V).

To convert an image from one color space to another, we use the cv2.cvtColor() function. By default, cv2.imread() loads a color image in BGR channel order. Look at the code below:

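import cv2
# cv2.imread() returns None instead of raising an error if the file cannot be read
image = cv2.imread('minion.png')
if image is None:
    raise FileNotFoundError('minion.png')
cv2.imshow('Input Image', image)
# Convert from OpenCV's default BGR channel order to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow('Grayscale Image', gray_image)
# Keep the windows open until a key is pressed
cv2.waitKey()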

The first argument to cv2.cvtColor() is the input image, and the second argument specifies the color space conversion, which is BGR to grayscale in this case.

Here is the input image as well as the corresponding grayscale image:

Input Image in BGR Format
Resulting Grayscale Image

Image Translation:

In computer vision, image translation refers to shifting an image within our frame of reference. To translate an image by x pixels horizontally and y pixels vertically, we need a 2 by 3 transformation matrix that is applied to the coordinates of every pixel:

M = [[1, 0, x],
     [0, 1, y]]

Here x and y are the translation values: the image is shifted by x pixels to the right and by y pixels downwards. Let’s look at the code below:

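import cv2
import numpy as np
image = cv2.imread('minion.png')
image = cv2.resize(image, (500, 500))
height, width = image.shape[:2]
# Shift 70 pixels to the right and 110 pixels down
x, y = 70, 110
# 2 by 3 affine matrix for a pure translation
translation_matrix = np.float32([[1, 0, x], [0, 1, y]])
# The output size is (width, height); enlarge it so the shifted image is not cropped
translated_image = cv2.warpAffine(image, translation_matrix, (width + x, height + y))
cv2.imshow('Translated Image', translated_image)
cv2.waitKey()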

Here we use cv2.warpAffine(), one of OpenCV's transformation functions, to translate the image. Its first argument is the input image, the second is the 2 by 3 transformation matrix, and the third is the size of the output image, given as (width, height). If we used the same size as the input image, the part of the image shifted beyond the frame would be cropped. To avoid cropping, we enlarge the output size by the translation values x and y.

When we run the above code, the following image will be generated:

Translated Image

Resizing an Image:

To resize an image, OpenCV provides the cv2.resize() function. It takes two required arguments: the input image and the desired size of the output image, given as (width, height). Consider the following code example:

import cv2
image = cv2.imread('minion.png')
# The target size is (width, height): 400 pixels wide, 200 pixels tall
image = cv2.resize(image, (400, 200))
cv2.imshow('Resized image', image)
cv2.waitKey()

Resized Image

These are some of the OpenCV functions commonly used in image processing applications. I hope this gives you a basic understanding of OpenCV and how to use it for image processing in Python. For more advanced functions and operations, refer to the OpenCV documentation: https://docs.opencv.org/4.2.0/.

Happy Learning!!
