A guide to Computer Vision using OpenCV

By Manan Gandhi, published in AI Skunks, Dec 31, 2019

Welcome to the computer vision tutorial. This series aims to provide a brief overview of the field of image processing and computer vision using Python and the OpenCV library.

What is Computer Vision?

Computer vision is a cutting-edge field of computer science that enables computers to understand images. It is also regarded as a subset of A.I. Computer vision is often confused with image processing. However, image processing is a part of computer vision that deals with basic tasks such as blurring, sharpening, and contrast stretching, while computer vision is used to extract features from an image.

What are images?

An image is a 2D representation of an object. In the context of signal processing, an image is a spatial distribution of the amplitudes of colors in the visible light spectrum.

Light Spectrum

How do computers store an image?

Computers store the RGB value of each pixel of an image. Pixel coordinates are written either as (x, y), as in Cartesian coordinates, or as (r, c), row and column, as in matrix notation. For a typical 8-bit image, each pixel value lies in the range [0, 255]: zero represents the darkest pixels (black), while the maximum value 255 represents the brightest pixels (white).

In the RGB color space, a value is addressed by (x, y, c), where c selects the R, G, or B channel. Note that OpenCV itself stores the channels in B, G, R order.
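To make the pixel layout concrete, here is a minimal sketch that builds a tiny color image by hand with NumPy, which is how OpenCV represents images in Python. The 2x3 size and the chosen pixel values are arbitrary assumptions for illustration.

```python
import numpy as np

# A tiny color image: 2 rows (height), 3 columns (width), 3 channels.
# dtype uint8 gives every channel the 0-255 range described above.
image = np.zeros((2, 3, 3), dtype=np.uint8)

image[0, 0] = (255, 255, 255)  # top-left pixel: white (all channels at maximum)
image[1, 2] = (0, 0, 255)      # bottom-right pixel: only the last channel is set

print(image.shape)               # (2, 3, 3)
print(image[0, 0])               # [255 255 255]
print(image[0, 1])               # [0 0 0], an untouched pixel stays black
print(image.min(), image.max())  # 0 255
```

If this array were displayed with OpenCV, the bottom-right pixel would appear red, because OpenCV treats the last channel as red (B, G, R order).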

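Unlike the conventional Cartesian coordinate system, in computer vision pixel indexing starts from the top-left corner of the image, i.e. pixel (0, 0), and y increases downward. For an image of width w and height h, the last pixel, at the bottom-right corner, is (w-1, h-1).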
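A short sketch of this convention, assuming a small synthetic grayscale image (the 4x6 size is arbitrary). Note that NumPy indexes as [row, column], so a pixel at coordinates (x, y) is accessed as image[y, x].

```python
import numpy as np

height, width = 4, 6
gray = np.zeros((height, width), dtype=np.uint8)

# NumPy indexing is [row, column], i.e. [y, x], with (0, 0) at the top-left.
x, y = 5, 0          # rightmost column, top row
gray[y, x] = 255     # paint the top-right pixel white

top_left = gray[0, 0]
bottom_right = gray[height - 1, width - 1]  # the last valid pixel

print(gray)  # the single 255 appears at the end of the first printed row
```

Trying to read gray[height, width] would raise an IndexError, since valid indices stop at height - 1 and width - 1.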

The pixels can be stored in memory in either row-major or column-major order. OpenCV images in Python are NumPy arrays, which are row-major by default.
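The difference between the two storage orders can be seen by flattening a small array both ways. This is an illustrative sketch with made-up values; NumPy calls row-major order "C" and column-major order "F".

```python
import numpy as np

pixels = np.array([[1, 2, 3],
                   [4, 5, 6]], dtype=np.uint8)

row_major = pixels.flatten(order="C")     # rows laid out one after another
column_major = pixels.flatten(order="F")  # columns laid out one after another

print(row_major)                     # [1 2 3 4 5 6]
print(column_major)                  # [1 4 2 5 3 6]
print(pixels.flags["C_CONTIGUOUS"])  # True: NumPy arrays are row-major by default
```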

So let's start with our first code.

Reading, Writing and Displaying our Image

READING AND DISPLAYING THE IMAGE
1. cv2.imread('complete path to image', flag): The first argument is the complete path to the image, including the file extension. The second argument is an optional flag that controls how the image is read, which you will learn about in the next session. If the file cannot be read, the function returns None instead of raising an error.

2. Use the function cv2.imshow() to display an image in a window. The window automatically fits the image size. The first argument is the window name, which is a string. The second argument is our image. You can create as many windows as you wish, as long as each has a different window name.

3. cv2.waitKey() is a keyboard binding function. Its argument is a time in milliseconds. The function waits for the specified number of milliseconds for any keyboard event. If you press any key in that time, the program continues. If 0 is passed, it waits indefinitely for a keystroke. Because it returns the code of the pressed key, it can also be used to detect specific keystrokes, for example whether the key a was pressed, which we will discuss below.

4. cv2.destroyAllWindows() simply destroys all the windows we created. If you want to destroy a specific window, use the function cv2.destroyWindow(), passing the exact window name as the argument.

Writing An Image

Saving/Writing an Image

Use the function cv2.imwrite() to save an image to a specified file. The first argument is the destination path on the file system where the image is to be saved; the file format is chosen from the file extension. The second argument is the ndarray containing the image. The function returns True if the image is written to the file system, and False otherwise.

In Python, indexing starts from zero.
Hence int(image.shape[0]) returns the first element, the height,
while int(image.shape[1]) returns the second element, the width,
and int(image.shape[2]) returns the number of layers (channels). For a color image, we know that there are three layers.

A grayscale image has only 1 layer, so OpenCV stores it as a 2D array: its shape is just (height, width), and image.shape[2] does not exist.
