Image Processing using OpenCV — Python

Nimrita Koul
20 min read · Dec 20, 2023

OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source, platform-independent library for image processing and computer vision. It can be used from Python, C++, and Java, and was originally developed by Intel. The library has more than 2500 optimized algorithms, which can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high-resolution image of an entire scene, find similar images in an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc.

Installation instructions for all platforms

Important Modules

Use Python 3.* for this notebook. I am executing this code in Google Colab.

Digital Images

A digital image is a grid of dots or picture elements (pixels).

Pixel: A pixel (picture element) is a single dot in a digital image. It is the smallest portion (building block) of a digital image.

Resolution: The resolution of an image is often measured in dots per inch (dpi) or pixels per inch (ppi). The higher the resolution, the more detail the image can show. You can zoom into an image to see the individual pixels.

Image Source: Luca Biada on Flickr: https://www.flickr.com/photos/pedroscreamerovsky/7119002433
Image Source: https://en.wikipedia.org/wiki/Pixel#/media/File:Closeup_of_pixels.JPG

Each image has a certain number of pixels along the width and the height of the image. E.g., an 18x18 image has 18 pixels along the width (18 columns) and 18 pixels along the height (18 rows), with a total of 18x18 = 324 pixels in the image.

Each pixel has a color value (black, white, a shade of gray, or a color) that determines how the pixel looks on screen. The value typically takes 8 bits for a grayscale pixel and 24 bits for a color pixel.

In OpenCV, images are represented as NumPy arrays. A color image is a 3-dimensional array: the first dimension is the height of the image (number of rows), the second is the width (number of columns), and the third holds the color channels of each pixel. A grayscale image has only the first two dimensions.
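
As a rough illustration of this layout, the sketch below builds two small arrays by hand instead of loading files and inspects their shapes. The sizes and variable names are made up for the example.

```python
import numpy as np

# A grayscale image with 18 rows (height) and 24 columns (width)
gray = np.zeros((18, 24), dtype=np.uint8)

# A color image of the same size with 3 channels per pixel
color = np.zeros((18, 24, 3), dtype=np.uint8)

# shape is always ordered (rows, columns[, channels])
height, width = color.shape[:2]
channels = color.shape[2]

print(gray.shape)
print(color.shape)
print(height, width, channels)
```

Note that the height comes first in shape, which is the opposite of the usual "width x height" way of quoting image sizes.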

Color:

The color of a pixel can be represented in one of the color models such as RGB (red, green, blue), grayscale, or CMYK (cyan, magenta, yellow, key/black). In the RGB model, we represent the color of a pixel as a combination of three separate values, giving the amount of red, green and blue at that pixel.

Based on the information represented in each pixel, there are three main types of images:

  1. Binary or black and white images: Each pixel has one of two possible values (0 or 1), so one bit per pixel is enough. In OpenCV such images are usually stored as single-channel 8-bit arrays whose pixels are either 0 or 255.
  2. Grayscale images: Each pixel typically uses 8 bits to store its gray value, so the shades of gray range from 0 (black) to 255 (white). Lower bit depths (2 to 8 bits) can also be used to represent the shade information.
  3. Color images: Every pixel in a color image uses three color values, red, green and blue, to determine its color. Each color value is represented by 8 bits (0 to 255). Thus every pixel has 24 bits of color information. The number of possible colors for every pixel in an RGB image is 256 × 256 × 256 = 16,777,216.
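
The counts in the list above follow directly from the bit depth. Here is a small sketch of that arithmetic; the helper name distinct_values is made up for the example.

```python
# Number of distinct values a pixel can take for a given bit depth
def distinct_values(bits_per_pixel):
    return 2 ** bits_per_pixel

binary_levels = distinct_values(1)    # binary image: 1 bit per pixel
gray_levels = distinct_values(8)      # grayscale image: 8 bits per pixel
color_levels = distinct_values(24)    # color image: 3 channels x 8 bits

print(binary_levels, gray_levels, color_levels)
print(color_levels == 256 * 256 * 256)
```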

Common steps in a deep learning project that uses image data:

a. Image Loading: Load the input images using OpenCV or other libraries.

b. Image Preprocessing: Preprocess images for network input, including resizing, normalization, and data augmentation.

c. Feature Extraction: Extract relevant features from images, often using pre-trained deep learning models (e.g., CNNs).

d. Data Splitting: Divide the dataset into training, validation, and test sets.

e. Model Training: Train the deep learning model using the training dataset.

f. Model Evaluation: Evaluate the model performance on the validation set.

g. Hyperparameter Tuning: Fine-tune model hyperparameters based on validation results.

h. Testing: Assess the model’s performance on the test set to gauge its generalization capability.

i. Post-processing: Apply any necessary post-processing techniques on the model’s output.

Common types of image preprocessing that need to be done during a computer vision project

  1. Resizing: Adjust the image size to meet the input requirements of the deep learning model.
  2. Normalization: Scale pixel values to a standard range (e.g., [0, 1] or [-1, 1]) to enhance model convergence.
  3. Data Augmentation: Generate new training samples by applying random transformations like rotation, flipping, and zooming to increase dataset diversity.
  4. Cropping: Crop images to focus on the region of interest or to obtain consistent input sizes.
  5. Gray Scaling: Convert color images to grayscale, reducing computational complexity and focusing on intensity information.
  6. Histogram Equalization: Enhance image contrast by equalizing the pixel intensity distribution.
  7. Gaussian Blurring: Apply a Gaussian filter to smooth images, reducing noise and preserving important details.
  8. Image Rotation: Rotate images to account for variations in orientation within the dataset.
  9. Noise Reduction: Remove or reduce noise in images through techniques like median filtering or denoising algorithms.
  10. Edge Detection: Highlight edges in images using techniques like the Sobel or Canny edge detectors.
  11. Color Space Conversion: Convert images between color spaces (e.g., RGB to HSV) to emphasize or extract specific color information.
  12. Contrast Adjustment: Adjust image contrast to enhance or normalize brightness levels.
  13. Standardization: Standardize pixel values by subtracting the mean and dividing by the standard deviation.
  14. Thresholding: Convert images to binary format by setting a threshold, useful for segmenting objects from the background.
  15. Morphological Transformations: Perform operations like dilation and erosion to manipulate image structures.
  16. Image Inversion: Invert pixel values to highlight different aspects of the image.
  17. Centering: Center images or objects within the frame for improved consistency.
  18. Hue, Saturation, and Value (HSV) Adjustment: Modify color components in the HSV color space to control brightness, saturation, and hue.

The image processing steps above help with computer vision tasks such as:

  1. Image enhancement: To make images more readable for machines or humans. E.g., improving brightness, contrast, color balancing or correction.
  2. Image restoration: To recover obscured parts of an image, such as those degraded by motion blur or noise.
  3. Segmentation: Partitioning an image into multiple objects present in the image.
  4. Representation and description of objects in an image: based on boundaries or pixel values.
  5. Object detection and recognition

Alright, next we will see the code.

The first step is to install OpenCV Python. You can do this using pip.

# Installing from within a Jupyter Notebook or Google Colab
!pip install opencv-python

Then you can import the library opencv-python using the statement

import cv2

cv2 is the name of the Python module installed by the opencv-python package.

Also import other required libraries like matplotlib.pyplot, numpy etc.

Important Note: The opencv-python function cv2.imshow() tries to open a desktop window, which can cause problems such as a kernel crash in Jupyter Notebook and Colab.

To prevent this, I will use cv2_imshow() from the package google.colab.patches in this notebook.

If you are executing this on your local machine, you can use the function below in place of cv2.imshow() to display images using matplotlib.pyplot.

# First import the libraries cv2 and matplotlib
import cv2
import matplotlib.pyplot as plt
# Read an image file using cv2
img = cv2.imread("path_of_your_imagefile")

# Then below function can display your cv2 image using matplotlib.pyplot.
# OpenCV stores colors in BGR order while matplotlib expects RGB, hence the conversion.
def cv2_imshow(img):
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.show()

# You can also configure the figure size and other properties of the display.

def cv2_imshow(img):
    plt.figure(figsize=(18,18))
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.show()

Since I am running this code in Colab, I don’t need to use cv2.waitKey() and cv2.destroyAllWindows() methods as the images are not displayed in separate windows in our case.

  1. Reading an image file and converting to grayscale and black and white:
# Read the image as grayscale (flag 0 is cv2.IMREAD_GRAYSCALE)
import cv2
from google.colab.patches import cv2_imshow
img = cv2.imread("checkerboard_18x18.png", 0)
cv2_imshow(img)
print(img) # print the 2D array of pixel intensities

img = cv2.imread("color.jpg") # Without a flag, the image is read as a 3-channel BGR color image
print("original image")
cv2_imshow(img)

#Convert to grayscale
grayscale = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print("grayscaled image")
cv2_imshow(grayscale)

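#Convert grayscale to black and white
# With cv2.THRESH_OTSU the threshold is computed automatically from the image histogram,
# so the value 128 passed here is ignored; the threshold actually used is returned in thresh
(thresh, img_bw) = cv2.threshold(grayscale, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)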
print('black and white image')
cv2_imshow(img_bw)

You can also use OpenCV to read and manipulate video files. A video is just a sequence of picture frames shown in quick succession.

Use cv2.VideoCapture() to open a video file. The cell below demonstrates opening a video file and displaying each of its frames separately. In the remaining part of this notebook, we will focus on image related operations only.

import cv2
from google.colab.patches import cv2_imshow
# cv2.VideoCapture() allows you to read a video file or capture video from camera.
vid_capture = cv2.VideoCapture('veryveryshortvideo.mp4')
# Get the frame rate (cv2.CAP_PROP_FPS is the named constant for property id 5)
fps = vid_capture.get(cv2.CAP_PROP_FPS)
print('Frames per second : ', fps,'FPS')
# Get the total number of frames (cv2.CAP_PROP_FRAME_COUNT is property id 7)
frame_count = vid_capture.get(cv2.CAP_PROP_FRAME_COUNT)
print('Frame count : ', frame_count)

# Read each frame and display
while True:
    ret, frame = vid_capture.read()

    # Check if the frame is successfully read
    if ret:
        cv2_imshow(frame)
    else:
        # Break the loop if no more frames are available
        break

# Release the video capture object
vid_capture.release()
# Close all windows (only has an effect if frames were shown in desktop windows with cv2.imshow())
cv2.destroyAllWindows()

You can also display a cv2 image using matplotlib.

import matplotlib.pyplot as plt
import cv2
def mpl_imshow(img):
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.show()

mpl_imshow(img)

See the shape of numpy array that corresponds to your image

print(img.shape)

#print the entire image array
print(img)

Reading an image from Internet URL

# If you wish to read an image from Internet, use urllib.request module.
import cv2
import urllib.request
import numpy as np
from google.colab.patches import cv2_imshow
req = urllib.request.urlopen("https://raw.githubusercontent.com/NimritaKoul/OpenCV_Tutorial/main/Perfection.png")
arr = np.asarray(bytearray(req.read()), dtype=np.uint8)
img = cv2.imdecode(arr, cv2.IMREAD_UNCHANGED) # load it as it is (same as flag -1)
cv2_imshow(img)

# Read the image from a local file
img = cv2.imread('lion2.jpg')
cv2_imshow(img)


# Read the image as grayscale
img_gray = cv2.imread('lion2.jpg', 0)
cv2_imshow(img_gray)

Save your image to disk

#Save your grayscaled image to a file on your disk
cv2.imwrite('img_gray.png', img_gray)

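An image is internally represented as a numerical array (3D for a color image). Let us see an example below:

Let us print the intensity values of a few pixels in the image. Pixels are indexed as img[row, column], and the three values come back in B, G, R order.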

(B, G, R) = img[0,0]
print(B, G, R)


(B, G, R) = img[17,17]
print(B, G, R)

(B, G, R) = img[17,6]
print(B, G, R)
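
The same indexing can be used to write pixels and to slice out regions or channels. The following is a minimal sketch on a tiny hand-made array; img_demo is an assumed name, not the lion image.

```python
import numpy as np

# 4x6 black BGR image (4 rows, 6 columns)
img_demo = np.zeros((4, 6, 3), dtype=np.uint8)

# Set the pixel at row 1, column 2 to pure red: B=0, G=0, R=255
img_demo[1, 2] = (0, 0, 255)

b, g, r = img_demo[1, 2]
print(b, g, r)

# Slicing selects a region: rows 0-1, columns 3-5 become white
img_demo[0:2, 3:6] = 255

# A single channel is a 2D array; index 2 is the red channel in BGR order
red_channel = img_demo[:, :, 2]
print(red_channel)
```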

Let us see some image pre-processing operations

  1. First read an image
#@title Read an image as is and display it
#Imports
import cv2
import numpy as np
from google.colab.patches import cv2_imshow

img = cv2.imread('lion2.jpg')

print("Shape of original image",img.shape)
cv2_imshow(img)

2. Image resizing: downscaling (reducing the size) or upscaling (increasing the size)

#@title Image resizing - downscaling (reducing the size) or upscaling (increasing the size)
new_width = 150
new_height = 150
# cv2.resize() expects the target size as (width, height), the reverse of img.shape
new_points = (new_width, new_height)
rescaled_img = cv2.resize(img, new_points, interpolation= cv2.INTER_LINEAR)

print("shape of rescaled image", rescaled_img.shape)
# Display images
cv2_imshow(rescaled_img)

3. Cropping an area of image

#@title Cropping an area of the image

print("Original image shape", img.shape) # Print image shape
print("Original image")
cv2_imshow(img)

# Cropping an image with array slicing: rows (y) 50 to 249 first, then columns (x) 550 to 749
cropped_image = img[50:250, 550:750]

# Display cropped image
print("cropped image shape", cropped_image.shape) # Print image shape
print("Cropped image")
cv2_imshow(cropped_image)

4. Rotating an image

# First we need to obtain the center of original image by dividing height and width by 2
height, width = img.shape[:2]
print("Height and width of original image", height, width)

# get the coordinates of the center of the image to create the 2D rotation matrix
center = (width/2, height/2)

# using cv2.getRotationMatrix2D() to get the rotation matrix
rotate_matrix = cv2.getRotationMatrix2D(center=center, angle=180, scale=1)

# rotate the image using cv2.warpAffine
rotated_image = cv2.warpAffine(src=img, M=rotate_matrix, dsize=(width, height))

print("Original Image")
cv2_imshow(img)
print("Rotated Image")
cv2_imshow(rotated_image)

5. Annotating Images with line, circle, rectangle, text

#@title Annotating an image with a line
print("Original Image")
cv2_imshow(img)

# Make a copy of the image
imageLine1 = img.copy()
#Decide the coordinates of the line
pointA = (200,180)
pointB = (450,500)
#cv2.line() draws a line
cv2.line(imageLine1, pointA, pointB, (255, 255, 0), thickness=3, lineType=cv2.LINE_AA)

#Draw another line from point C to D
pointC = (50,50)
pointD = (350,300)

cv2.line(imageLine1, pointC, pointD, (255, 255, 0), thickness=3, lineType=cv2.LINE_AA)

print('Image with line')
cv2_imshow(imageLine1)

#@title  Draw a circle on a image
#make a copy of the image
imageCircle = img.copy()
#get the height and width of image
height, width = img.shape[:2]
print("Height and width of original image", height, width)

# get the coordinates of the center of the image
center = (width/2, height/2)
print(center)
# We will draw our circle at the center of the image
circle_center = (int(width/2), int(height/2))
print(circle_center)

# Choose a radius of the circle
radius =100
# cv2.circle() draws a circle
cv2.circle(imageCircle, circle_center, radius, (0, 0, 255), thickness=3, lineType=cv2.LINE_AA)

# Show image with circle
cv2_imshow(imageCircle)

#@title Draw a filled circle in the image
# make a copy of the original image
imageFilledCircle = img.copy()
# choose a center for your circle
circle_center = (650,150)
# choose the radius of the circle
radius =100
# Draw circle
cv2.circle(imageFilledCircle, circle_center, radius, (255, 0, 0), thickness=-1, lineType=cv2.LINE_AA)
# Show image
cv2_imshow(imageFilledCircle)

#@title Drawing a rectangle on the image
# make a copy
imageRectangle = img.copy()
# define the starting and end points of the rectangle
start_point =(600,40)
end_point =(770,250)
# draw the rectangle
cv2.rectangle(imageRectangle, start_point, end_point, (0, 0, 255), thickness= 3, lineType=cv2.LINE_8)
# display
cv2_imshow(imageRectangle)

#@title add text to image
imageText = img.copy()
text = 'Majestic, Fierce and Free'
org = (50,150) #position of text on the image
cv2.putText(imageText, text, org, fontFace = cv2.FONT_HERSHEY_COMPLEX, fontScale = 1.5, color = (0,0,0))
cv2_imshow(imageText)

Color Spaces

A color space is a specific way of representing the color information of an image. It defines how colors are encoded and stored as numerical values within the image data.

OpenCV supports various color spaces. The default color space of OpenCV is BGR (Blue, Green, Red).

Other common color spaces are:

  1. RGB (Red, Green, Blue): This is the most common color space used in computer vision and image processing. Each pixel is represented by three values (intensity levels) corresponding to the Red, Green, and Blue color channels.
  2. BGR (Blue, Green, Red): Each pixel in a BGR image is represented by three channels: one for blue, one for green, and one for red, each ranging in value from 0 to 255. BGR is suitable for general image processing tasks but can be less intuitive for tasks like color detection or manipulation. BGR is default color space for OpenCV.
  3. HSV (Hue, Saturation, Value): This color space separates the color information into three channels: hue, saturation, and value. Hue represents the “color tint” (e.g., red, green, blue), saturation represents the intensity of the color (e.g., vivid vs. dull), and value represents the brightness (e.g., light vs. dark). HSV is often preferred for tasks like object tracking, color thresholding, and image segmentation.
  4. Lab (CIELAB): In this color space, the distance between two points roughly corresponds to the perceived difference in color between them. This makes it well suited for tasks like color matching and image similarity comparison. However, it is computationally more expensive than other color spaces. It consists of three channels: L* (lightness), a* (green to red), and b* (blue to yellow).
  5. YCrCb: This color space represents an image using its luminance (Y) and two chrominance channels (Cr and Cb). YCrCb is commonly used in video compression and processing.
  6. Grayscale: Grayscale is a single-channel color space where pixel values represent the intensity of light.

OpenCV allows you to convert images from one color space to another using functions like cv2.cvtColor().

Working with Color Spaces in OpenCV

First, let us open an image again. Since the default color space of OpenCV is BGR, the image is loaded in BGR:

#@title Working with Color spaces in OpenCV
#let us load an image, by default it will have BGR color space
import cv2
from google.colab.patches import cv2_imshow

img = cv2.imread('lion2.jpg')
cv2_imshow(img)

Next, we will convert the image to the LAB and HSV color spaces

#Convert BGR color space to LAB color space
# (cv2_imshow interprets the three channels as BGR, so the converted images are displayed in false colors)
imgLAB = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
cv2_imshow(imgLAB)

#Convert BGR color space to HSV color space
imgHSV = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
cv2_imshow(imgHSV)

Let us segment the image using color spaces

#@title Let us segment the image using color spaces
bgr = [100, 150, 1] #target color in BGR format
thresh = 140 #threshold value for color segmentation

#lower bound for color segmentation in BGR
minBGR = np.array([bgr[0] - thresh, bgr[1] - thresh, bgr[2] - thresh])
#upper bound for color segmentation in BGR
maxBGR = np.array([bgr[0] + thresh, bgr[1] + thresh, bgr[2] + thresh])
#a binary mask where the pixels within a specific range are set to white and others are set to black
maskBGR = cv2.inRange(img,minBGR,maxBGR)
#do bitwiseAND between image and mask to only keep the pixels in the specified color range
resultBGR = cv2.bitwise_and(img, img, mask = maskBGR)
cv2_imshow(resultBGR)

Segment using HSV color space:

#@title  Now let us segment the HSV space image
#convert 1D array to 3D, then convert it to HSV and take the first element
'''
np.uint8([[bgr]]) creates a NumPy array of type uint8 containing the BGR color value.
However, the cv2.cvtColor() expects an array with shape (1, 1, 3) for a single pixel in BGR format.
So, [[bgr]] is used to create a 2D list with one element ([bgr]),
and then np.uint8([[bgr]]) converts it to a NumPy array of shape (1, 1, 3).
Now, cv2.cvtColor(np.uint8([[bgr]]), cv2.COLOR_BGR2HSV) converts this BGR color to HSV format,
resulting in an array with the shape (1, 1, 3), where the third dimension corresponds
to the HSV channels (Hue, Saturation, Value).
[0][0] is used to get the HSV values of the single pixel (first) in the resulting array.
'''
# Cast to int so that adding or subtracting thresh cannot wrap around in uint8 arithmetic
hsv = cv2.cvtColor(np.uint8([[bgr]] ), cv2.COLOR_BGR2HSV)[0][0].astype(int)

minHSV = np.array([hsv[0] - thresh, hsv[1] - thresh, hsv[2] - thresh])
maxHSV = np.array([hsv[0] + thresh, hsv[1] + thresh, hsv[2] + thresh])

maskHSV = cv2.inRange(imgHSV, minHSV, maxHSV)

resultHSV = cv2.bitwise_and(imgHSV, imgHSV, mask = maskHSV)

cv2_imshow(resultHSV)

Image Normalization

#@title Image Normalization
import cv2
import numpy as np
from google.colab.patches import cv2_imshow

def normalize_image(image):
    # Convert the image to float32
    img_float32 = image.astype(np.float32)

    # Normalize the image to the range [100, 255]
    # the intensity values of all pixels now will range from 100 to 255 only instead of 0 to 255
    normalized_image = cv2.normalize(img_float32, None, 100, 255, cv2.NORM_MINMAX)

    return normalized_image

# Read an image from file
img = cv2.imread('lion2.jpg')

# Ensure the image is not empty
if img is not None:
    # Display the original image
    cv2_imshow(img)

    # Normalize the image
    normalized_img = normalize_image(img)

    # Display the normalized image
    cv2_imshow(normalized_img)
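
For deep learning input, the two variants mentioned in the preprocessing list (scaling to [0, 1] and standardizing with mean and standard deviation) are often done directly in NumPy. The following is a minimal sketch on a random array standing in for a real image; the per-channel statistics are one possible choice, and many projects use fixed dataset-wide values instead.

```python
import numpy as np

# Random 8-bit image standing in for a real photo
rng = np.random.default_rng(0)
img_u8 = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Scale 8-bit pixel values to the range [0, 1]
scaled = img_u8.astype(np.float32) / 255.0

# Standardize: zero mean and unit standard deviation per channel
mean = scaled.mean(axis=(0, 1))
std = scaled.std(axis=(0, 1))
standardized = (scaled - mean) / std

print(scaled.min(), scaled.max())
print(standardized.mean(axis=(0, 1)), standardized.std(axis=(0, 1)))
```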

Generate images similar to an input image (data augmentation)

#@title Generate images similar to an input image (data augmentation)

import cv2
from google.colab.patches import cv2_imshow
import numpy as np

def generate_similar_image(reference_image, noise_factor=0.5):
    # Generate zero-mean random noise from a normal distribution.
    # Keep it as float: casting it straight to uint8 would corrupt the negative values.
    # noise_factor is the standard deviation in intensity units; larger values give more visible noise.
    noise = np.random.normal(scale=noise_factor, size=reference_image.shape)

    # Add noise to the reference image in floating point, then clip to [0, 255] and convert back
    noisy = reference_image.astype(np.float32) + noise
    similar_image = np.clip(noisy, 0, 255).astype(np.uint8)

    return similar_image

# Read an image from file
reference_img = cv2.imread('lion2.jpg')

# Ensure the reference image is not empty
if reference_img is not None:
    # Generate a similar image with random noise
    similar_img = generate_similar_image(reference_img)

    # Display the original and similar images
    cv2_imshow(reference_img)
    cv2_imshow(similar_img)

Histogram Equalization

#@title Histogram Equalization

#Histogram equalization is a technique used to enhance the contrast of an image
#by adjusting the intensity values based on the cumulative distribution function
#of the pixel intensities.
import cv2
import numpy as np
from matplotlib import pyplot as plt
from google.colab.patches import cv2_imshow

# Read an image from file
img = cv2.imread('lion2.jpg', cv2.IMREAD_GRAYSCALE)

# Ensure the image is not empty
if img is not None:
    # Perform histogram equalization
    equalized_img = cv2.equalizeHist(img)

    # Display the original and equalized images side by side
    plt.figure(figsize=(10, 5))

    plt.subplot(1, 2, 1)
    plt.axis('off')
    plt.imshow(img, cmap='gray')
    plt.title('Original Image')

    plt.subplot(1, 2, 2)
    plt.imshow(equalized_img, cmap='gray')
    plt.title('Equalized Image')
    plt.axis('off')
    plt.show()
else:
    print("Error: Could not read the image.")

Image Filtering using convolutional kernels

#@title Image Filtering Using Convolution in OpenCV
#The identity kernel leaves the image unchanged since it acts as a filter that preserves the original pixel values.
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
img = cv2.imread('lion2.jpg')

# define an identity filter or kernel
kernel1 = np.array([[0, 0, 0],
                    [0, 1, 0],
                    [0, 0, 0]])

#Apply the kernel to image
identity = cv2.filter2D(src=img, ddepth=-1, kernel=kernel1)
cv2_imshow(identity)

Blurring Kernel

#@title  Apply blurring kernel
#blurring kernel here is a floating point type 5x5 matrix of all 1's, normalized
# by dividing its values by 25 (the number of elements in the matrix)

#The blurring kernel performs a simple averaging operation over a 5x5 neighborhood,
# resulting in a smoothed or blurred version of the image.
kernel2 = np.ones((5, 5), np.float32) / 25

#apply kernel; store the result under a new name so that img still holds the original image
blur_img = cv2.filter2D(src=img, ddepth=-1, kernel=kernel2)

cv2_imshow(blur_img)
cv2.imwrite('blur_kernel.jpg', blur_img)

Median Blur

#@title Applying Median blur to an image
'''
Median blur is a type of non-linear filtering.
It replaces each pixel value with the median value of its neighborhood.
src is the image file, ksize is the kernel size - the size of neighborhood window. It must be an odd integer.
'''
median = cv2.medianBlur(src=img, ksize=5)
cv2_imshow(median)

Image Sharpening with a kernel

#@title Sharpening an image using a kernel
kernel3 = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]])
sharp_img = cv2.filter2D(src=img, ddepth=-1, kernel=kernel3)

cv2_imshow(img)
cv2_imshow(sharp_img)

Bilateral Filtering

#@title Bilateral Filtering

'''
Bilateral filtering is a non-linear filtering technique that preserves edges while reducing noise.
Arguments of the function bilateralFilter() are,
src: The input image.
d: Diameter of the pixel neighborhood used during filtering. It should be an integer;
if it is not positive, it is computed from sigmaSpace.
sigmaColor: Filter sigma in the color space. A larger value of sigmaColor means
that farther colors within the pixel neighborhood will be mixed together,
producing larger areas of semi-equal color.
sigmaSpace: Filter sigma in the coordinate space. A larger value of sigmaSpace
means that pixels farther away from the central pixel will influence it,
as long as their colors are close enough.
'''

bilateral_filter = cv2.bilateralFilter(src=img, d=9, sigmaColor=75, sigmaSpace=75)
cv2_imshow(img)
cv2_imshow(bilateral_filter)

Image thresholding a grayscale image to black and white image

#@title Image Thresholding a grayscale image to black and white
'''
Image thresholding is a common image processing technique used to separate objects
or regions of interest from the background by converting a grayscale image into a binary image.
'''
img_grayscale = cv2.imread("lion2.jpg", cv2.IMREAD_GRAYSCALE)
cv2_imshow(img_grayscale)
# Basic threshold example: pixels above 127 become 255 (white), all others become 0 (black)
th, dst = cv2.threshold(img_grayscale, 127, 255, cv2.THRESH_BINARY)
cv2_imshow(dst)

Try some more thresholding methods:

# Thresholding with threshold value set 127
th, dst = cv2.threshold(img_grayscale,127,255, cv2.THRESH_BINARY)
cv2_imshow(dst)

# Thresholding using THRESH_BINARY_INV
'''
cv2.THRESH_BINARY_INV creates the inverse of the binary image,
where pixel values above the threshold are set to zero, and values below or equal
to the threshold are set to a maximum value.
'''
th, dst = cv2.threshold(img_grayscale,127,255, cv2.THRESH_BINARY_INV)
cv2_imshow(dst)

# Thresholding using THRESH_TRUNC
'''
This particular thresholding method truncates (sets to the threshold value)
pixel values that exceed a specified threshold and leaves the pixel values
unchanged if they are below or equal to the threshold.
'''
th, dst = cv2.threshold(img_grayscale,127,255, cv2.THRESH_TRUNC)
cv2_imshow(dst)

# Thresholding using THRESH_TOZERO
# cv2.THRESH_TOZERO sets pixel values to zero if they are below or equal to the threshold else leaves them unchanged
th, dst = cv2.threshold(img_grayscale,127,255, cv2.THRESH_TOZERO)
cv2_imshow(dst)

# Thresholding using THRESH_TOZERO_INV
# cv2.THRESH_TOZERO_INV sets pixel values to zero if they are above the threshold else leaves them unchanged
th, dst = cv2.threshold(img_grayscale,127,255, cv2.THRESH_TOZERO_INV)
cv2_imshow(dst)

Edge detection using Sobel Operator

The Sobel operator is a fundamental edge detection algorithm. It operates by convolving the image with a pair of 3x3 convolution kernels, one for detecting edges in the horizontal (X) direction and the other for the vertical (Y) direction.

Typical SobelX kernel looks like this:
| -1 0 1 |
| -2 0 2 |
| -1 0 1 |
and SobelY kernel looks like this:
| -1 -2 -1 |
| 0 0 0 |
| 1 2 1 |

These kernels are used to compute the gradient of the image intensity in the X and Y directions, respectively.

In OpenCV, the cv2.Sobel function is used to apply the Sobel operator.

Its parameters are:

src: The source image (grayscale).
ddepth: The depth of the output image. Use cv2.CV_64F for 64-bit floating-point precision.
dx: The order of the derivative in the X direction (typically 1 for SobelX).
dy: The order of the derivative in the Y direction (typically 0 for SobelX).
ksize: The size of the Sobel kernel (typically 3 or 5).

The Sobel operator is often used as a pre-processing step for more advanced edge detection methods or as a feature in various computer vision applications. It helps to highlight the regions in an image where the intensity changes rapidly, which often corresponds to edges or boundaries between different objects or textures.

import cv2
import numpy as np
from google.colab.patches import cv2_imshow

img = cv2.imread('lion2.jpg')
# Display original image
print("Original Image")
cv2_imshow(img)

# Convert to grayscale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Apply Gaussian blur to the grayscale image using cv2.GaussianBlur
# with a kernel size of (3,3) to smooth the image and reduce noise.

img_blur = cv2.GaussianBlur(img_gray, (3,3), 0)

# Sobel Edge Detection
sobelx = cv2.Sobel(src=img_blur, ddepth=cv2.CV_64F, dx=1, dy=0, ksize=5) # Gradient along X: responds to vertical edges
sobely = cv2.Sobel(src=img_blur, ddepth=cv2.CV_64F, dx=0, dy=1, ksize=5) # Gradient along Y: responds to horizontal edges
sobelxy = cv2.Sobel(src=img_blur, ddepth=cv2.CV_64F, dx=1, dy=1, ksize=5) # Mixed derivative in X and Y

# Display Sobel Edge Detection Images
print("Sobelx edges")
cv2_imshow(sobelx)
print("Sobely edges")
cv2_imshow(sobely)
print("Sobelxy edges")
cv2_imshow(sobelxy)

Canny Edge detection

Steps in Canny edge detection procedure:
1. The input image is first smoothed using Gaussian blur to reduce noise.
2. Then the gradient of the image is calculated using the Sobel operator to find the intensity gradients in both X and Y directions.
3. The magnitude and direction of the gradient are computed using the calculated gradients in the X and Y directions.
4. Only the local maxima in the gradient magnitude are preserved, and non-maximum values are suppressed. This step ensures that only the most significant edges are retained.
5. The edges are further refined by applying hysteresis thresholding. Two threshold values, threshold1 and threshold2, are used. If a pixel's gradient value is above threshold2, it is considered a strong edge pixel. If it is below threshold1, it is considered a non-edge pixel. Pixels with gradient values between the two thresholds are weak edge pixels, and they are kept only if they are connected to strong edge pixels.
6. The final result is a binary image where edges are marked with white pixels, and non-edge regions are marked with black pixels.

edges = cv2.Canny(image=img_blur, threshold1=100, threshold2=200) 
cv2_imshow(edges)

Contour Detection

Contour detection means identifying and extracting the boundaries of objects in an image. Contours represent the shape and structure of objects.

In OpenCV, cv2.findContours() is used for contour detection. It takes a binary image as input and outputs a list of contours along with hierarchy information.

Function cv2.findContours()
Arguments:
image: The binary image obtained after preprocessing.
mode: e.g. cv2.RETR_EXTERNAL retrieves only the external contours; cv2.RETR_TREE retrieves all contours and reconstructs the full hierarchy of nested contours.
method: e.g. cv2.CHAIN_APPROX_SIMPLE compresses horizontal, vertical, and diagonal segments and leaves only their end points; cv2.CHAIN_APPROX_NONE stores all contour points.

Function cv2.drawContours() is used to draw contours on an image
Arguments:
image: The image to draw on.
contours: The list of contours.
contourIdx: Index of the contour to draw, e.g., -1 draws all contours, 3 draws only the 4th contour.
color: Color used to draw the contours, e.g., (0, 255, 0) is green in BGR format.
thickness: Thickness of the contour line, e.g., 2.

img = cv2.imread('lion2.jpg')
# convert the image to grayscale format
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# apply binary thresholding
ret, thresh = cv2.threshold(img_gray, 150, 255, cv2.THRESH_BINARY)
cv2_imshow(thresh)

# detect contours using cv2.findContours() with method cv2.CHAIN_APPROX_NONE, which stores every boundary point
#cv2.findContours() takes a binary image as input and produces a list of contours along with hierarchy information

contours, hierarchy = cv2.findContours(image=thresh, mode=cv2.RETR_TREE, method=cv2.CHAIN_APPROX_NONE)

#make a copy of image
image_copy = img.copy()
# draw contours on the original image
cv2.drawContours(image=image_copy, contours=contours, contourIdx=-1, color=(0, 255, 0), thickness=2, lineType=cv2.LINE_AA)
cv2_imshow(image_copy)

