Looking through the eyes of a Computer:
Part II: Edges, Curves and Complex Shapes
In Part I, we showed how images are converted to tensors and how a filter systematically performs a dot product at each stride, producing a feature map. We also explained that filters are designed to detect specific features, such as a horizontal edge. You can review the components here.
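To make that dot-product step concrete, here is a minimal sketch in plain Python. The pixel values are made up for illustration; only the top_edge kernel matches the one used in this series.

```python
# One convolution step: element-wise multiply a 3x3 patch by a 3x3
# kernel, then sum. The patch values here are illustrative only.
patch = [[10, 10, 10],
         [10, 10, 10],
         [ 0,  0,  0]]   # bright rows above dark rows: a horizontal edge

top_edge = [[ 1,  1,  1],
            [ 0,  0,  0],
            [-1, -1, -1]]

response = sum(patch[r][c] * top_edge[r][c]
               for r in range(3) for c in range(3))
print(response)  # 30: a strong positive response at a top edge
```

A uniform patch (all the same value) would score 0, which is why only edges light up in the feature map.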
In Part II, we will develop vertical and diagonal filters. We will collect the filters and show how they can be applied efficiently. We will then use curved and complex shapes to see how straight-oriented filters perform on contoured outlines.
We will follow this outline:
Part II. Convolution Application
(D. Visualize results of the Convolution: quick recap)
E. Other Edges
F. Curved Shapes
G. Aggregating Kernels
H. Application
So, open your Notebook, and let’s resume learning how a computer sees …
Part II. Convolution Application
D. Visualize Results of the Convolution: Quick Recap
Code from Part I that will be useful in Part II:
!pip install -Uqq fastbook
import fastbook
fastbook.setup_book()
from fastbook import *

#!pip install fastai -U
import fastai
from fastai.vision.all import *
#follow the prompt for signing-in and authorization

matplotlib.rc('image', cmap='Greys')

from PIL import Image, ImageOps
square = Image.open('/content/shapes_basic_sq_circ_a/square/002_b9f6c377.jpg')
square = ImageOps.grayscale(square)
square_t = tensor(square)

top_edge = tensor([ [1,1,1],
                    [0,0,0],
                    [-1,-1,-1] ]).float()

def apply_kernel(data, row, col, kernel):
    return (data[row-1:row+2, col-1:col+2] * kernel).sum()

rng = range(1, 473)
top_square = tensor([[apply_kernel(square_t, i, j, top_edge) for j in rng] for i in rng])
In the last step of Part I, we saw that after applying the top_edge filter, we were able to form a band in the feature map in the top portion of the image.
show_image(top_square);
However, we were not able to detect the edges on either the right or left side of the square.
E. Defining and Visualizing the Other Edges
1. Right Edge
Defining the right_edge filter.
right_edge = tensor([-1,0,1],
                    [-1,0,1],
                    [-1,0,1]).float()

right_square = tensor([ [ apply_kernel(square_t, i,j, right_edge) for j in rng] for i in rng])

# same rows and columns as Step D.1.b, different filter
right_square[100:110, 420:450]
With this vertically-patterned filter applied to the same rows and columns as above, a vertical band now appears.
show_image(right_square);
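To see why this filter responds to vertical boundaries, here is a tiny synthetic check (not from the notebook) on a 6x6 image whose left half is dark and right half is light. The apply_kernel helper below is a plain-Python version of the one from Part I.

```python
# A dark-to-light vertical boundary: left columns 0, right columns 9.
# right_edge weights columns as [-1, 0, +1], so it fires where
# brightness increases from left to right.
img = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

right_edge = [[-1, 0, 1],
              [-1, 0, 1],
              [-1, 0, 1]]

def apply_kernel(data, row, col, kernel):
    # plain-Python version of the tensor-based helper from Part I
    return sum(data[row-1+r][col-1+c] * kernel[r][c]
               for r in range(3) for c in range(3))

row = 3
responses = [apply_kernel(img, row, col, right_edge) for col in range(1, 5)]
print(responses)  # [0, 27, 27, 0]: the response peaks at the boundary
```

Flat regions on either side score 0; only the columns straddling the boundary respond, which is exactly the vertical band seen in the feature map.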
2. Left Edge
left_edge = tensor([1,0,-1],
                   [1,0,-1],
                   [1,0,-1]).float()
left_square = tensor([ [ apply_kernel(square_t, i,j, left_edge) for j in rng] for i in rng])
show_image(left_square);
3. Bottom Edge.
bottom_edge = tensor([-1,-1,-1],
                     [0,0,0],
                     [1,1,1]).float()
bottom_square = tensor([ [ apply_kernel(square_t, i,j, bottom_edge) for j in rng] for i in rng])
show_image(bottom_square);
4. Diagonal Edge.
diag_edge = tensor([-1,0,1],
                   [0,1,-1],
                   [1,-1,0]).float()
diag_square = tensor([ [ apply_kernel(square_t,i,j, diag_edge) for j in rng] for i in rng])
show_image(diag_square);
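One property all five kernels share, worth checking before we move on: their weights sum to zero, so any region of uniform brightness produces a response of exactly 0. A quick verification in plain Python:

```python
# Each edge kernel's weights sum to zero, so uniform regions score 0
# and only brightness changes (edges) light up the feature map.
kernels = {
    'top':    [[ 1, 1, 1], [ 0, 0, 0], [-1,-1,-1]],
    'right':  [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    'left':   [[ 1, 0,-1], [ 1, 0,-1], [ 1, 0,-1]],
    'bottom': [[-1,-1,-1], [ 0, 0, 0], [ 1, 1, 1]],
    'diag':   [[-1, 0, 1], [ 0, 1,-1], [ 1,-1, 0]],
}
for name, k in kernels.items():
    print(name, sum(sum(row) for row in k))  # every kernel prints 0
```

This is why the feature maps are blank everywhere except along the shape outlines.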
F. Application of the edges to curved shapes.
1. Select an image.
(path/'circle').ls()

from PIL import Image, ImageOps
circle = Image.open('/content/shapes_basic_sq_circ_a/circle/001_350cc6d2.jpg')
circle = ImageOps.grayscale(circle)
circle_t = tensor(circle)
show_image(circle);
2. Visualize the possible edge.
df_c = pd.DataFrame(circle_t)
df_c_top = df_c.iloc[50:70, 120:360]
df_c_top.style.set_properties().background_gradient('gist_heat')
3. Apply the filters.
top_circle = tensor([[apply_kernel(circle_t, i, j, top_edge) for j in rng] for i in rng])
show_image(top_circle);

right_circle = tensor([[apply_kernel(circle_t, i, j, right_edge) for j in rng] for i in rng])
show_image(right_circle);

left_circle = tensor([[apply_kernel(circle_t, i, j, left_edge) for j in rng] for i in rng])
show_image(left_circle);

bottom_circle = tensor([[apply_kernel(circle_t, i, j, bottom_edge) for j in rng] for i in rng])
show_image(bottom_circle);

diag_circle = tensor([[apply_kernel(circle_t, i, j, diag_edge) for j in rng] for i in rng])
show_image(diag_circle);
G. Aggregate the kernels.
Collect the kernels to facilitate efficient processing.
edge_kernels = torch.stack([top_edge, right_edge, left_edge, bottom_edge, diag_edge])
edge_kernels = edge_kernels.unsqueeze(1)
This gives a shape of [5, 1, 3, 3]: 5 kernels, 1 input channel, 3 x 3 kernel size.
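The same stacking can be checked in standalone PyTorch (using plain torch.tensor with explicit nested lists rather than fastai's tensor helper):

```python
import torch

# The five edge kernels from this section, as plain float tensors.
top    = torch.tensor([[ 1., 1., 1.], [ 0., 0., 0.], [-1.,-1.,-1.]])
right  = torch.tensor([[-1., 0., 1.], [-1., 0., 1.], [-1., 0., 1.]])
left   = torch.tensor([[ 1., 0.,-1.], [ 1., 0.,-1.], [ 1., 0.,-1.]])
bottom = torch.tensor([[-1.,-1.,-1.], [ 0., 0., 0.], [ 1., 1., 1.]])
diag   = torch.tensor([[-1., 0., 1.], [ 0., 1.,-1.], [ 1.,-1., 0.]])

# stack along a new leading dimension: [5, 3, 3]
edge_kernels = torch.stack([top, right, left, bottom, diag])
print(edge_kernels.shape)   # torch.Size([5, 3, 3])

# insert the input-channel axis: [5, 1, 3, 3]
edge_kernels = edge_kernels.unsqueeze(1)
print(edge_kernels.shape)   # torch.Size([5, 1, 3, 3])
```

Note that torch.stack requires all kernels to share a dtype, which is why each is defined as float.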
H. Applying the aggregated kernels to more complex shapes.
1. Gather the images.
root = Path().cwd()/'shapes_complex'
search = duckduckgo_search
search(root, 'snowflake','snowflake ', max_results = 20, img_layout=ImgLayout.All)
search(root, 'heart','heart shape ', max_results = 20, img_layout=ImgLayout.All)
search(root, 'nautilus','nautilus shape ', max_results = 20, img_layout=ImgLayout.All)
search(root, 'CN Tower','CN Tower ', max_results = 20, img_layout=ImgLayout.All)
We will not do modelling in this blog. If you would like to pursue modelling, increase max_results to at least 100 for each query and insert a cleaning step.
2. Forming the DataBlock and Loaders.
For an introduction or refresher on DataBlock, see Steps 6 a-f in Starting the Dive into Deep Learning.
path = Path('/content/shapes_complex')

dblock = DataBlock(
    (ImageBlock(cls=PILImageBW), CategoryBlock),
    get_items = get_image_files,
    get_y = parent_label,
    item_tfms = Resize(128)
)
dls = dblock.dataloaders(path)
dls.show_batch()
3. Get a subset for demonstration.
x, y = first(dls.train)
x will have a shape of [64, 1, 128, 128]: 64 items in the batch, 1 channel, 128 x 128 pixels.
4. Apply the convolution in parallel.
batch_features = F.conv2d(x, edge_kernels)
The previously aggregated kernels can be applied to multiple images (and multiple channels, if needed) simultaneously. This is facilitated by the PyTorch feature F.conv2d and enables efficient use of the GPU.
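The shapes involved can be verified in standalone PyTorch with a stand-in batch of random values (the real batch comes from the DataLoaders above):

```python
import torch
import torch.nn.functional as F

# Stand-in batch: 64 single-channel 128x128 images, and 5 stacked
# 3x3 kernels in the [out_channels, in_channels, kH, kW] layout.
x = torch.randn(64, 1, 128, 128)
edge_kernels = torch.randn(5, 1, 3, 3)

batch_features = F.conv2d(x, edge_kernels)
print(batch_features.shape)  # torch.Size([64, 5, 126, 126])
```

With no padding, each spatial dimension shrinks from 128 to 128 - 3 + 1 = 126, and every image gets five feature maps, one per kernel.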
show_image(x[2]);

show_image(batch_features[2][0]); # top_edge
show_image(batch_features[2][1]); # right_edge
show_image(batch_features[2][2]); # left_edge
show_image(batch_features[2][3]); # bottom_edge
show_image(batch_features[2][4]); # diag_edge
It may be appreciated that the 'shadowing' seems to have reversed for this set compared to that of the circle. This is because in the circle image the background is dark and the circle is light, whereas in the shell image it is the opposite.
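Because every kernel's weights sum to zero, inverting the brightness (p becomes 255 - p) exactly negates the response: sum(k*(255 - p)) = 255*sum(k) - sum(k*p) = -sum(k*p). A quick check with made-up patch values:

```python
# Inverting brightness flips the sign of a zero-sum kernel's response.
top_edge = [[1,1,1], [0,0,0], [-1,-1,-1]]

patch    = [[200, 200, 200], [120, 120, 120], [30, 30, 30]]
inverted = [[255 - v for v in row] for row in patch]

def response(p, k):
    return sum(p[r][c] * k[r][c] for r in range(3) for c in range(3))

print(response(patch, top_edge), response(inverted, top_edge))
# 510 -510: the 'shadowing' reverses when light and dark swap
```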
5. Looking at more complex shapes.
show_image(x[24]);

for i in range(5):
    show_image(batch_features[24][i]);
Summary for Part II:
Images are composed of two-dimensional pixel values and 1 or 3 colour channels. Computers are able to recognize edges in images by applying convolutional filters.
Looking Forward:
Probing through the strides and padding of the convolutional process, incorporating the RGB channels, and applying the steps in a neural network.
I hope you learned as much as I did!
If you want to dig into the code, you can find it at the GitHub repo for Convolution.
:)
Maria
Follow me on LinkedIn.