Learning Day 47: Computer vision before deep learning 1 — Image segmentation
Jun 1, 2021
Image segmentation without DL
- Based on low-level properties of the image: grayscale, colour, texture and shape
- In contrast, DL-based segmentation depends on semantics (the meaning of different areas of an image, such as background and objects)
1. Segmentation based on thresholding
- Pick a threshold at a valley of the image's grayscale histogram and split the pixels into two classes: those below the threshold and those above it
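A minimal sketch of valley-based thresholding, kept dependency-free so it runs on a flat list of integer grayscale values. The function name `valley_threshold`, the moving-average smoothing and the way the second peak is picked (height weighted by squared distance from the first peak) are illustrative assumptions, not a standard API.

```python
def valley_threshold(pixels, levels=256, smooth=2):
    """Return a threshold at the lowest histogram bin between the two main peaks.

    pixels: flat list of ints in [0, levels). All names here are hypothetical.
    """
    # Build the grayscale histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Smooth with a moving average so noise does not create fake valleys.
    sm = []
    for i in range(levels):
        lo, hi = max(0, i - smooth), min(levels, i + smooth + 1)
        sm.append(sum(hist[lo:hi]) / (hi - lo))

    # First peak: the tallest bin. Second peak: tall AND far from the first.
    p1 = max(range(levels), key=lambda i: sm[i])
    p2 = max(range(levels), key=lambda i: sm[i] * (i - p1) ** 2)

    # The valley is the lowest smoothed bin between the two peaks.
    a, b = sorted((p1, p2))
    return min(range(a, b + 1), key=lambda i: sm[i])


# Toy bimodal "image": a dark group and a bright group.
pixels = [10] * 50 + [12] * 30 + [200] * 40 + [205] * 20
t = valley_threshold(pixels)
mask = [p > t for p in pixels]  # True marks the bright class
```

This only works well when the histogram is clearly bimodal; with a single peak the "valley" is not meaningful.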
2. Segmentation based on edges
- Detect edges, then try to link the edge fragments together into boundaries that separate the regions
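As a rough illustration of the edge-finding half of this approach, the sketch below computes a Sobel gradient magnitude on a 2D list and thresholds it. The Sobel operator is one common choice swapped in here as an assumption, the function names are hypothetical, and the edge-linking step is not shown.

```python
def sobel_magnitude(img):
    """Gradient magnitude of a 2D list of grayscale values (border left at 0)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def edge_mask(img, thresh):
    """Mark pixels whose gradient magnitude exceeds thresh as edge pixels."""
    return [[m > thresh for m in row] for row in sobel_magnitude(img)]


# Toy image: dark left half, bright right half, so one vertical edge.
img = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = edge_mask(img, thresh=100)
```

In this toy image the edge pixels appear on both sides of the step, in the two columns next to the dark/bright boundary.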
3. Segmentation based on regions
Region growing
- Start from a single seed point as the region, and compare each neighbouring pixel's grayscale value with the region's value (the average value if the region already contains multiple points).
- If the difference is smaller than a threshold, add the neighbouring pixel to the region. Repeat until no more pixels can be added.
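The two steps above can be sketched as a breadth-first search from the seed. This is a minimal version assuming 4-connected neighbours and a running mean as the region's value; `region_grow` and its parameters are hypothetical names.

```python
from collections import deque


def region_grow(img, seed, thresh):
    """Grow a region from seed (row, col); returns a 2D boolean membership mask."""
    h, w = len(img), len(img[0])
    sy, sx = seed
    in_region = [[False] * w for _ in range(h)]
    in_region[sy][sx] = True
    total, count = img[sy][sx], 1  # running sum and size, for the region mean
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not in_region[ny][nx]:
                # Accept the neighbour if it is close to the current region mean.
                if abs(img[ny][nx] - total / count) < thresh:
                    in_region[ny][nx] = True
                    total += img[ny][nx]
                    count += 1
                    queue.append((ny, nx))
    return in_region


img = [[10, 12, 90],
       [11, 13, 95],
       [9, 92, 94]]
region = region_grow(img, seed=(0, 0), thresh=10)
```

Here the dark pixels connected to the top-left seed end up in the region, while the bright pixels are rejected because they differ from the region mean by far more than the threshold.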
Watershed Algorithm
- Convert the image to grayscale and compute its gradients
- Treat the gradient image as a terrain: edges (high gradient) become peaks and everything else becomes valleys
- Consider the similarity of neighbouring pixels and gradually merge them into the valleys
- It is like pouring water over the peaks and valleys: the water level slowly rises to fill the valleys, leaving only the peaks exposed, and those peaks mark the boundaries between segments
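The flooding idea can be sketched with a priority queue: pixels are flooded in order of height, so low (valley) pixels are claimed before high (peak) pixels. This is a simplified marker-based variant, assuming the valleys are given as labelled seed pixels and that no separate watershed-line label is produced; the function name and input layout are hypothetical.

```python
import heapq


def watershed(height, markers):
    """Flood a height map from labelled seeds.

    height:  2D list, e.g. gradient magnitudes (peaks = edges).
    markers: 2D list of ints, 0 = unlabelled, >0 = seed label of a valley.
    Returns a 2D list in which every reachable pixel carries a seed label.
    """
    h, w = len(height), len(height[0])
    labels = [row[:] for row in markers]
    heap = []
    for y in range(h):
        for x in range(w):
            if labels[y][x] > 0:
                heapq.heappush(heap, (height[y][x], y, x))
    while heap:
        level, y, x = heapq.heappop(heap)  # lowest water level first
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] == 0:
                # The neighbour joins the basin that reaches it first.
                labels[ny][nx] = labels[y][x]
                # Water never flows back down: keep the level monotone.
                heapq.heappush(heap, (max(level, height[ny][nx]), ny, nx))
    return labels


# One-row terrain with two valleys separated by a peak of height 9.
height = [[0, 1, 2, 9, 1, 0]]
markers = [[1, 0, 0, 0, 0, 2]]
labels = watershed(height, markers)
```

The two basins grow towards each other and meet at the peak, which is where the segment boundary ends up.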
4. Segmentation based on graph theory
- Treat pixels as the nodes of a graph. Connect them in various ways, and give each connection (an edge of the graph) a weight
- Find the optimal partition of the points, i.e. the set of edges to cut for which the total weight is minimal
Graph Cuts
- 3 types of connections, all have weights:
- connection between pixels.
- connection of all pixels to a source node, S (foreground).
- connection of all pixels to a sink node, T (background).
- Minimise the energy E(A), where A is a labelling of the pixels as foreground or background. The objective is to find the labelling with minimal E(A); the edges it separates make up the cut
- R(A) is the penalty on labelling. If a pixel is more likely to belong to the foreground (1) than to the background (0), then R(1) < R(0)
- B(A) is the penalty for cutting between two neighbouring pixels. Since we want the cut to fall between dissimilar pixels, B(A) is big if the two pixels are similar (e.g. in terms of grayscale values) and small otherwise.
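To make the three kinds of connections concrete, here is a small sketch on a 1D row of pixels. Several things are assumptions for illustration only: R is taken as the absolute difference to an assumed mean foreground or background intensity, B is a Gaussian similarity scaled by an arbitrary `smooth` factor, and the minimum cut is found with a plain Edmonds-Karp max-flow rather than a specialised solver.

```python
import math
from collections import deque


def min_cut_segment(pixels, fg_mean, bg_mean, sigma=10.0, smooth=50.0):
    """Label each pixel 1 (foreground) or 0 (background) via an s-t minimum cut."""
    n = len(pixels)
    S, T = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i, p in enumerate(pixels):
        # Cutting S->i puts i in the background, so that edge carries R(0).
        cap[S][i] = abs(p - bg_mean)
        # Cutting i->T puts i in the foreground, so that edge carries R(1).
        cap[i][T] = abs(p - fg_mean)
    for i in range(n - 1):
        # B is big for similar neighbours, so cutting between them is expensive.
        d = pixels[i] - pixels[i + 1]
        b = smooth * math.exp(-d * d / (2 * sigma * sigma))
        cap[i][i + 1] = cap[i + 1][i] = b

    eps = 1e-9
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[S] = S
        queue = deque([S])
        while queue and parent[T] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > eps:
                    parent[v] = u
                    queue.append(v)
        if parent[T] == -1:
            break  # no path left: the flow is maximal
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), T
        while v != S:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = T
        while v != S:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
    # Pixels still reachable from S lie on the source (foreground) side of the cut.
    return [1 if parent[i] != -1 else 0 for i in range(n)]


labels = min_cut_segment([200, 190, 195, 20, 30, 25], fg_mean=200, bg_mean=25)
```

In this example the cheapest cut falls between the third and fourth pixels, where the neighbours are most dissimilar and B is almost zero.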
GrabCut
- Based on Graph Cuts.
- Combines GMMs (Gaussian mixture models) and k-means
- GMM: assumes that the pixel distribution is a mixture of more than one Gaussian component
- K-means for grouping/clustering
- Steps:
1. Draw a bounding box that roughly encloses the foreground.
- From this step, the algorithm gets initial pixel profiles for both foreground and background (pixels outside the box are treated as background).
2. Use k-means (e.g. k=5) to cluster the foreground pixels and the background pixels into groups for the GMMs, then use Graph Cuts for cutting. This process is repeated iteratively.
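The clustering part of step 2 can be illustrated with a tiny k-means on grayscale values. This sketch covers only the k-means grouping; fitting the GMMs and running Graph Cuts are not shown. The function name and the deterministic initialisation (centres spread across the sorted values) are assumptions made for the example.

```python
def kmeans_1d(values, k, iters=20):
    """Cluster scalar values into k groups; returns (centres, labels)."""
    # Deterministic start: spread the k centres across the sorted values.
    srt = sorted(values)
    centres = [float(srt[(2 * i + 1) * len(srt) // (2 * k)]) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break  # converged
        centres = new_centres
    return centres, labels


centres, labels = kmeans_1d([10, 12, 11, 200, 205, 198], k=2)
```

With colour images the same idea applies to 3-channel pixel values instead of scalars, with a Euclidean distance in place of the absolute difference.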