Intern Diaries: Introduction to OpenCV

Tanya Gupta

Published in Analytics Vidhya · 8 min read · Jan 2, 2021

In this blog, we will get a basic idea of the OpenCV library.

But before we learn the essentials of OpenCV, we need to understand what computer vision is all about.

Computer Vision

It is a field of study in which machines are given human-like abilities for analyzing images. Just as a human can recognize and analyze different parts of an image, a computer can do the same through computer vision.

Applications

Self-driving cars, which analyze their surroundings in real time and act accordingly
Visual surveillance
Facial recognition for authentication on consumer devices
Assisting humans in identification tasks (for example, a species identification system)

OpenCV

OpenCV (Open Source Computer Vision Library) is a highly optimized image processing library created by Intel and later supported by Willow Garage and then by Itseez (which Intel acquired in 2016). It is aimed mainly at real-time computer vision. It is a cross-platform library and can be used from C++, Python and Java.

How does a computer “see” an image?

The most basic entity of an image is the pixel, or picture element: a single coloured point in the image. A computer does not see that point as a colour but as numbers (ultimately 0s and 1s), so a digital image is stored as a matrix of pixel values. When a computer sees a picture, it sees this pixel matrix. The pixel density of an image is often expressed in ppi (pixels per inch).

Since OpenCV deals with images, which are essentially matrices of numbers, we need a good understanding of the NumPy library, a highly efficient tool for accessing and manipulating matrices and arrays.

I have covered NumPy in my earlier blog, so if you wish you could check it out here.

Types of images

There are two types of digital images:

Grayscale images: Each pixel holds a single value, the intensity (amount of light) at that point. The image appears in shades of black and white and is therefore said to have only one color “channel”.
Coloured images: Each pixel has 3 color channels, namely red, green and blue (RGB).
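To make the two image types concrete, here is a minimal NumPy sketch. The tiny arrays and their values are made up for illustration; real images are far larger.

```python
import numpy as np

# A 2x3 grayscale image: one intensity value (0-255) per pixel.
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)

# A 2x3 coloured image: three channel values per pixel.
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 0] = (255, 0, 0)  # set all three channel values of the top-left pixel

print(gray.shape)   # (2, 3): height, width
print(color.shape)  # (2, 3, 3): height, width, channels
```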

Installation

You can install OpenCV with pip. Just type the following at the command prompt and you are all set:

pip install opencv-python

For Anaconda users, go to the Anaconda Prompt and type the following:

conda install -c conda-forge opencv

If this doesn’t work, open Anaconda Navigator, go to Environments, and choose the environment in which you want to install OpenCV.
Then select the not installed packages, type opencv in search packages, and install the packages that appear. Now you are all set.

Choosing an environment for installation

Type “opencv” in search packages and install all the packages.

By selecting installed and typing opencv in search packages, you can check whether the installation has completed.

Importing OpenCV

import cv2

Note: When we install OpenCV for Python with pip, the NumPy library is installed along with it.

Getting Started with OpenCV

Using this as an example image (Source: scroll.in)

Reading an image:

img = cv2.imread('<full_location_path_of_the_image>', <flag value>) where,

The first argument is the path of the image to be read. If the image is saved in the same location as your project or notebook, you can just type the image file name.
Flag value: type in 1, 0 or -1. A value of 1 (cv2.IMREAD_COLOR) loads the image in color mode, 0 (cv2.IMREAD_GRAYSCALE) loads it in grayscale mode, and -1 (cv2.IMREAD_UNCHANGED) reads the image in its original format, including the alpha channel.

Alpha channel: It defines the transparency of a pixel in numerical form. A pixel with an alpha value of 100% is completely opaque; with a value of 0%, it is fully transparent.

Note: If the file path or name of the image is wrong, cv2.imread() does not raise an error. Instead it returns None, which you can see by printing the variable.

Displaying image:

cv2.imshow('<window name>',<variable in which the image is stored>) where,

window name is a string argument which gives a title to the window which displays your image.
second argument: The variable storing the matrix form of your image

On its own, cv2.imshow() shows the image for only a split second. To view it properly, we need to use it with cv2.waitKey() and cv2.destroyAllWindows().

cv2.waitKey(<time>), where the argument is a time in milliseconds. It waits up to the specified time for a keyboard event and returns as soon as a key is pressed, so the code that follows (here, closing the window) can run. If 0 is passed as the argument, it waits indefinitely for a key event.

cv2.destroyAllWindows() destroys all the windows we created.

img.shape, an attribute of the NumPy array, gives the dimensions of the image matrix: the first value is the height and the second is the width. Since the img variable stores the image in grayscale mode, it has only one channel, so the shape shows only the height and width of the image.

When I put the flag value as 1 (or even -1) instead of 0, I got a 3-dimensional matrix for the image. The third value of the shape, 3, is the number of color channels, which OpenCV stores in blue, green, red (BGR) order.

Writing an image to a file:

cv2.imwrite('<new file name>',<image to be saved>)

It returns True if the operation is successful.

The new image file appears in the file list on the Home page of Jupyter Notebook.

Drawing geometric shapes on images

Adding a line:

cv2.line(<image>,<start>,<end>,<color>,<thickness of line>) where,

<start> and <end>: tuples of (x, y) coordinates for the starting and ending points of your line.
<color>: a tuple like (b, g, r), where the values are for the blue, green and red color channels respectively.
<thickness of line>: the thickness in pixels; the lowest value is 1.

The above code snippet creates the following image:

To draw a line in any other color, change the values in the BGR tuple. For instance, the color tuple (200, 0, 203) gives me:

We can create an arrowed line by using cv2.arrowedLine() as shown below:

Drawing a rectangle:

cv2.rectangle(<image> , pt1, pt2, <color>,<thickness of the border>) where,

pt1: coordinates of the upper left hand corner of the rectangle
pt2: coordinates of the lower right hand corner of the rectangle

If you want to fill the rectangle with the color, pass -1 instead of a thickness value.

Drawing a circle:

cv2.circle(<image>,<center coordinates>,<radius>,<color>,<thickness>)

For center coordinates (100, 60), radius 45, BGR tuple (150, 36, 45) and thickness 4:

Again, if we replace the thickness value with -1, we get a filled circle, and our dog now has a “black eye”.

Writing text on image

cv2.putText(<image>,'<text>',<start>,Font_Face, <font size>, <color>,<thickness>, <line_type>)

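where start is the coordinates of the bottom-left corner of your text, font size is a scale factor applied to the font's base size, and the font face and line type are chosen from the constants provided by OpenCV (for example, cv2.FONT_HERSHEY_SIMPLEX and cv2.LINE_AA).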

You can also create an image out of numpy.zeros( ):
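A minimal sketch, assuming a 300x400 canvas. The dtype is set to uint8 because that is the usual 0 to 255 pixel range:

```python
import numpy as np

# Height 300, width 400, 3 channels, every value 0: a black image.
blank = np.zeros((300, 400, 3), dtype=np.uint8)
print(blank.shape)  # (300, 400, 3)
```

Such an array can be passed to the drawing functions above just like an image loaded with cv2.imread().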

Trivial Things…
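If we run img.size, it gives the total number of elements in the image matrix (height × width × number of channels, so it equals the number of pixels only for a grayscale image), and if we run img.dtype, it returns the data type of the pixel values.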
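A quick check of both attributes, using a synthetic image in place of a loaded one (an assumption made for illustration):

```python
import numpy as np

img = np.zeros((300, 400, 3), dtype=np.uint8)  # stand-in for a colour image

print(img.size)   # 360000 = 300 * 400 * 3 elements
print(img.dtype)  # uint8

# The pixel count ignores the channel dimension.
pixels = img.shape[0] * img.shape[1]
print(pixels)     # 120000
```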

Splitting the image

We can split the image into its color channels by using:

blue, green, red = cv2.split(img)

Merging the separate channels into one:

cv2.merge((blue, green, red))

Region of Interest (ROI)

There might be times when we are interested only in a specific region of an image. For instance, in the above image of a kid studying, I might be interested in the globe. This is what we call the ROI.

I took the coordinates of the globe and, by slicing from top to bottom and left to right, got the cropped image of the globe. Then I assigned this cropped image to a new position of the same shape in the image.
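The slicing and pasting can be sketched with plain NumPy indexing. The image, the patch standing in for the globe, and all coordinates here are made-up assumptions:

```python
import numpy as np

img = np.zeros((100, 100, 3), dtype=np.uint8)
img[20:40, 30:60] = (0, 255, 255)  # a yellow patch stands in for the globe

# Slice rows top:bottom first, then columns left:right.
roi = img[20:40, 30:60]
print(roi.shape)  # (20, 30, 3)

# Paste the region at a new position with the same height and width.
img[60:80, 10:40] = roi
```

The target slice must have exactly the same shape as the ROI, otherwise NumPy raises an error on assignment.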