Shivam
Published in GDSC GHRCE
4 min read · Aug 21, 2020

Let’s Open Computer Vision

Computer vision, often abbreviated CV, is exactly what its two words suggest: getting computers to process digital images in an attempt to replicate as much of human vision as possible. The field dates back to 1966, when Seymour Papert and Marvin Minsky set out to write a program that could identify objects.

People often assume that image processing and computer vision are the same thing, but the real picture is somewhat different. Image processing is a transformation technique: an image is smoothed, sharpened, contrast-adjusted, and so on to produce an output that other tasks can use. Computer vision, in contrast, tries to understand images and draw insights from them. Computer vision may use image processing for some of its internal steps, so the relationship is similar to the one between machine learning and artificial intelligence.

OpenCV

About 20 years ago, Intel developed the Open Source Computer Vision Library (OpenCV) for real-time computer vision. OpenCV is written in C++, but bindings exist for several other languages, such as Java, Python, and MATLAB. It runs on desktop and mobile operating systems, including Windows, macOS, Linux, Android, iOS, and BlackBerry 10. OpenCV has a dedicated repository on GitHub for the latest updates and contributions. It has had many releases over the years; version 4.3.0, released on 3 April 2020, is a recent stable release at the time of writing. In Python, OpenCV integrates easily with libraries such as SciPy and Matplotlib because it represents images as NumPy arrays. Support for OpenCV is now managed by the non-profit organization OpenCV.org, which also maintains a user site.

Installing OpenCV

Before installation, it is recommended to have NumPy and Matplotlib installed.

Run pip install opencv-python in the command line.

Importing and checking version

import cv2
print(cv2.__version__)

Reading and Displaying Images:

Every color image is built from three primary colors, Red, Green, and Blue, each represented by a value in the range 0 to 255. One matrix is formed for each primary color, and the three matrices together give every pixel its R, G, and B values. In simple language, each color makes a sheet, and stacking the three sheets results in the image. Note that OpenCV stores the channels in B, G, R order.
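The "three sheets" idea can be sketched with NumPy alone. This is only an illustration under assumed values: the 2x3 channel matrices below are made up, and the variable names are not from the original tutorial.

```python
import numpy as np

# Three hypothetical 2x3 "sheets", one per primary color, values in 0..255.
red = np.array([[255, 0, 0], [255, 128, 0]], dtype=np.uint8)
green = np.array([[0, 255, 0], [255, 128, 0]], dtype=np.uint8)
blue = np.array([[0, 0, 255], [0, 128, 0]], dtype=np.uint8)

# Stack the sheets along a third axis: (height, width, channels).
rgb_image = np.dstack([red, green, blue])

# OpenCV works with B, G, R order, so reverse the channel axis for it.
bgr_image = rgb_image[:, :, ::-1]

print(rgb_image.shape)  # (2, 3, 3)
print(rgb_image[1, 0])  # red and green both at 255, which is yellow
```

Each pixel is simply the triple of values found at the same position in the three sheets.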

img = cv2.imread('opencv.jpg')
cv2.imshow('opencv', img)
The image read through OpenCV

Images can be read in three modes by passing the integer -1, 0, or 1 as the second argument, for unchanged, grayscale, and color mode respectively. By default, the flag is 1.

Example:

img = cv2.imread('demo.jpg', 0) loads the image in grayscale mode.
The image read in Grayscale mode

cv2.waitKey() is a keyboard-bound function whose argument is a time in milliseconds. If the argument is 0, it waits indefinitely for a key press. For any other value, it waits for that many milliseconds, and the program proceeds as soon as a key is pressed or the time runs out.

cv2.destroyAllWindows() closes all the windows created during the execution of the program.

Resizing the image

img = cv2.imread('opencv.jpg')
resized_image = cv2.resize(img, (150, 80))  # target size is (width, height)

Detecting Edges Of Image:

We can produce an image that shows only the edges by using the Canny function in OpenCV.

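import cv2
img = cv2.imread("opencv.jpg")
# The two numbers are the hysteresis thresholds used by the Canny detector
cv2.imwrite("edged_py.jpg", cv2.Canny(img, 400, 200))
cv2.imshow("edged image", cv2.imread("edged_py.jpg"))
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()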
Edge detection using OpenCV

Detecting Faces using OpenCV and Displaying the Output with Matplotlib:

OpenCV comes with built-in, pre-trained cascade classifiers for detection. Each one is stored as an XML file and is trained for one class of object, such as frontal faces.

Matplotlib is a Python library for creating visualizations.

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import cv2
frame = cv2.imread("demo.jpg")
classifier = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

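The haarcascade_frontalface_default.xml file can be downloaded from the OpenCV GitHub repository (data/haarcascades). The opencv-python package also ships it in the directory given by cv2.data.haarcascades.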

detectMultiScale is used to detect objects of different sizes in the input image. It returns the result as a list of rectangles, each given as (x, y, width, height).

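faces = classifier.detectMultiScale(frame)
face = faces[0]  # first detection; assumes at least one face was found
x, y, w, h = face
# Matplotlib expects RGB, while OpenCV loads images in BGR order
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
out = cv2.rectangle(rgb, (x, y), (x+w, y+h), (0, 0, 255), 4)
plt.imshow(out)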

Live Image Capture using OpenCV

VideoCapture is used to capture video. Its argument is the index or the name of the device, which selects the camera to use.

import cv2
webcam = cv2.VideoCapture(0)

webcam.read returns two values, ret and frame. ret is True or False and tells whether an image was read, and frame holds the image when the read succeeds.

release is used to free the webcam once it is no longer needed.

ret, frame = webcam.read()
webcam.release()
cv2.imshow("my image", frame)
cv2.waitKey()
cv2.destroyAllWindows()

Conclusion:

I hope this tutorial helps readers get started with OpenCV, as it has all the ingredients you need to prepare a great dish. It covers reading, converting, resizing, and displaying images, detecting edges and faces, and fetching live images. So now it's time to explore the world with a new vision: "Computer Vision".

Shivam
GDSC GHRCE

BTech in Artificial Intelligence | Core team member at DSC GHRCE