A Beginner's Guide to Computer Vision (Part 1): Filtering
Computer vision is a fascinating field to study. It has a long history of developments by many researchers, some inspired by neuroscience and some arrived at through general intuition.
I started to learn about computer vision as part of my research on building a weld defect classifier, so there is a good chance that this blog will contain some weld radiography images as examples.
What is an Image?
In computer vision, we interpret an image as a 2D array that holds the intensity value at each pixel. For example, an image of size 720X500 needs a 2D array with 720 rows and 500 columns to store it.
1. Binary image: In a binary image, each intensity value is either 0 or 1.
2. Grey scale image: In a grey scale image, the intensity value ranges from 0 (black) to 255 (white).
3. RGB image: A grey scale image has only one grey channel, whereas an RGB image stacks three channels holding the intensities of red, green and blue, which together produce a color image. There is also a variant called RGBA, where 'A' represents the alpha channel (transparency).
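To make the three representations concrete, here is a small sketch using NumPy. The tiny 4x6 image and the threshold of 127 are made up for illustration; real images would be loaded from disk instead.

```python
import numpy as np

# Grey scale: a 2D array (rows, columns) with intensities between 0 and 255.
# This 4x6 image is a made-up example holding the values 0, 10, ..., 230.
grey = np.arange(0, 240, 10, dtype=np.uint8).reshape(4, 6)

# Binary: threshold the grey scale image so every pixel is 0 or 1.
# The threshold 127 is an arbitrary choice for this sketch.
binary = (grey > 127).astype(np.uint8)

# RGB: three channels stacked along a third axis (rows, columns, 3).
rgb = np.stack([grey, grey, grey], axis=-1)

# RGBA: RGB plus an alpha channel; 255 here means fully opaque.
alpha = np.full(grey.shape, 255, dtype=np.uint8)
rgba = np.dstack([rgb, alpha])

print(grey.shape, binary.shape, rgb.shape, rgba.shape)
```

The only thing that changes between the variants is the range of values and the number of channels; the pixel grid itself stays the same.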
Image Noise
Noise is the difference between the intensity captured by the camera and the actual intensity. Light variations, camera electronics, the lens and surface reflectance are some of the sources of noise in images. Recovering the actual intensity exactly is impossible because we are never going to know the noise involved at every pixel.
Noise is random, but it follows a probability distribution. We are going to exploit this property to reduce its impact on our analysis. Correlation and the weighted mean are the two concepts you need to understand filtering.
Correlation
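Correlation multiplies two matrices of the same size element by element (not a matrix dot product) and then sums the entries of the result. In filtering, the two matrices are the mask and the image patch lying under it, and the mask slides over the image to produce one value per pixel.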
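As a rough sketch of that definition, the snippet below computes the correlation at a single pixel. The helper name correlate_at and the 5x5 test image are my own, and the helper assumes an odd-sized mask and a pixel far enough from the border that the patch fits inside the image.

```python
import numpy as np

def correlate_at(image, mask, row, col):
    """Correlation of `mask` with the patch of `image` centred at (row, col).

    Hypothetical helper for illustration: assumes an odd-sized mask and a
    location where the whole patch lies inside the image.
    """
    half_r, half_c = mask.shape[0] // 2, mask.shape[1] // 2
    patch = image[row - half_r: row + half_r + 1,
                  col - half_c: col + half_c + 1]
    # Element-wise product of patch and mask, then the sum of all entries.
    return np.sum(patch * mask)

image = np.arange(25, dtype=float).reshape(5, 5)  # made-up 5x5 image
mask = np.ones((3, 3)) / 9.0                      # 3x3 mask with equal weights

# With equal weights that sum to 1, the result is the mean of the 3x3 patch.
value = correlate_at(image, mask, 2, 2)
print(value)
```

Filtering a whole image just repeats this computation at every pixel, which is what library routines such as scipy.ndimage.correlate do for you.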
Mean (Average) Mask
What happens when we apply this mask to the image? (By applying I mean finding the correlation)
We reduce the noise!! How?
Well, noise is random, but it follows a distribution, most probably a normal distribution. It is unlikely that all the neighboring pixels are pushed in the same direction by noise, so their errors partly cancel out when averaged. Replacing the pixel under consideration with the average of its neighborhood therefore reduces the effect of noise.
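import numpy as np
from scipy import ndimage
from skimage.color import rgb2gray
# 7x7 mean mask; dividing by 49 makes the weights sum to 1 (a true average)
mean_mask = np.ones((7, 7)) / 49
grey_img = rgb2gray(img)  # img is the RGB image loaded beforehand
mean_img = ndimage.convolve(grey_img, mean_mask, mode='constant', cval=0.0)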
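If you want to check the noise reduction claim without a real photo, here is a minimal sketch on a synthetic image. The flat image, the noise level of 0.1 and the random seed are all assumptions chosen for the demo, and the mask is normalised so that it computes a true average.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)

# Made-up "actual" image: a flat grey patch with intensity 0.5 everywhere.
clean = np.full((64, 64), 0.5)

# Captured image = actual intensity + zero-mean Gaussian noise (std 0.1).
noisy = clean + rng.normal(0.0, 0.1, clean.shape)

# 7x7 mean mask with weights that sum to 1.
mean_mask = np.ones((7, 7)) / 49.0
smoothed = ndimage.convolve(noisy, mean_mask, mode='reflect')

# Compare how far each image is from the actual intensities.
noise_before = np.std(noisy - clean)
noise_after = np.std(smoothed - clean)
print(noise_before, noise_after)
```

On a flat image averaging costs nothing; on a real image the same mask also blurs edges, which is the price paid for the lower noise.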
We can also use other masks that give more importance to some cells. For example, if you change the center cell of the mean mask from 1 to 2, the mask gives more importance to the central pixel than to its neighbors.
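Here is a small sketch of that idea on a 3x3 mask. The patch values are made up, and dividing the mask by its sum is my own addition so that the result stays a weighted mean.

```python
import numpy as np

# Start from a 3x3 mean mask and double the weight of the center cell.
weighted_mask = np.ones((3, 3))
weighted_mask[1, 1] = 2.0
weighted_mask /= weighted_mask.sum()  # weights now sum to 1

# Made-up patch: a bright center pixel surrounded by darker neighbors.
patch = np.array([[10.0, 10.0, 10.0],
                  [10.0, 60.0, 10.0],
                  [10.0, 10.0, 10.0]])

plain_mean = patch.mean()                      # every pixel counts equally
weighted_mean = np.sum(patch * weighted_mask)  # center pixel counts twice

print(plain_mean, weighted_mean)
```

The weighted result sits closer to the center value than the plain mean does, which is exactly the "more importance to the central pixel" effect.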
What is a Gaussian Distribution or Normal Distribution?
Think of plotting a histogram of the heights of the students in your classroom. You will get a peak at the average height (say 5 ft 5 in) because most students are close to that height. You will also notice that very short students (below 5 ft) and very tall students (above 6 ft) are few in number. The histogram you plotted will look similar to the one in the image below. Not only heights but many other natural processes follow a Gaussian distribution. Noise in an image is also commonly modeled as Gaussian, with a mean of 0. That means most pixels in an image carry little noise, and pixels with a lot of noise are few in number.
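You can see this shape numerically with a quick sketch that draws zero-mean Gaussian noise samples. The standard deviation of 10 intensity levels, the sample count and the seed are arbitrary assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(42)

sigma = 10.0  # assumed noise level, in intensity units
noise = rng.normal(loc=0.0, scale=sigma, size=100_000)

# Fraction of samples with small noise (within one sigma of zero).
within_one_sigma = np.mean(np.abs(noise) <= sigma)

# Fraction of samples with large noise (more than three sigma from zero).
beyond_three_sigma = np.mean(np.abs(noise) > 3 * sigma)

print(noise.mean(), within_one_sigma, beyond_three_sigma)
```

Most samples land close to zero and only a tiny fraction are far away, which matches the bell shape of the histogram described above.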
Gaussian Mask
A Gaussian mask is generated by evaluating the function above at the x and y offsets of each cell from the center of the mask. An example of a 3X3 Gaussian mask with sigma = 3.0 is shown below.
# Gaussian 3X3 mask with sigma=3.0
[[0.89483932, 0.94595947, 0.89483932],
[0.94595947, 1. , 0.94595947],
[0.89483932, 0.94595947, 0.89483932]]
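The values in this mask are consistent with the unnormalised 2D Gaussian exp(-(x^2 + y^2) / (2 * sigma^2)) evaluated at integer offsets from the center. A minimal sketch that rebuilds the mask this way is below; the function name gaussian_mask is my own, and the final normalisation step is an optional extra that the mask above does not apply.

```python
import numpy as np

def gaussian_mask(size=3, sigma=3.0):
    """Unnormalised Gaussian mask evaluated at integer offsets from the center.

    Hypothetical helper for illustration; assumes an odd `size`.
    """
    half = size // 2
    offsets = np.arange(-half, half + 1)
    x, y = np.meshgrid(offsets, offsets)
    return np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))

mask = gaussian_mask(3, 3.0)
print(mask)

# Optional: divide by the sum so the weights add up to 1, which keeps the
# overall brightness of the filtered image unchanged.
normalised = mask / mask.sum()
```

The center cell has offset (0, 0) and therefore weight 1, and the weights fall off with distance, so nearby pixels count more than far ones.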
The Gaussian mask is more widely used than the mean mask. With a Gaussian mask you can get rid of noise while preserving comparatively more detail in the image.
The images above were generated by the code snippet below.
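import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage, signal

def gkern(kernlen=21, std=3):
    """Return a 2D Gaussian kernel of shape (kernlen, kernlen)."""
    # signal.windows.gaussian replaces signal.gaussian, which newer SciPy releases removed
    gkern1d = signal.windows.gaussian(kernlen, std=std)
    return np.outer(gkern1d, gkern1d)

# Smooth the grey scale image with 3x3, 5x5 and 7x7 Gaussian masks (sigma = 3.0)
for size in [3, 5, 7]:
    gaussian_img = ndimage.convolve(grey_img, gkern(size, 3), mode='constant', cval=0.0)
    plt.imshow(gaussian_img, cmap='gray')
    plt.show()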
Yeah, that's the end of the chapter on filtering.
Want to implement it yourself? Here is the link.
Stay tuned to experience this amazing journey of computer vision.