[CV] 13. Scale-Invariant Local Feature Extraction(3): SIFT

1. SIFT (Scale Invariant Feature Transform)

  • Similar to the Harris-Laplace detector, SIFT searches for interest point candidates over all image locations and scales, and stores the candidates as a list of (x, y, σ) triples. This list is used in the later steps.
  • For each interest point candidate, a detailed model is fitted to determine its final location and scale. Thresholding is then applied to remove unlikely candidates, such as those with low contrast or those lying on edges.
  • To ensure rotation invariance, one or more orientations are assigned to each keypoint based on the computed image gradient directions. These orientations are later used to match and align the image patches (regions) around the interest points.
  • The local image gradients in the patch (also called a window or region) around the keypoint are measured at the selected scale and converted into a vector representation, the descriptor.

Step (1): Scale-space extrema detection

Figure 6. Automatic scale selection with Laplacian of Gaussian, adapted from [1, 10]
Figure 7. Comparison of Laplacian-of-Gaussian and Difference-of-Gaussian, adapted from [1]
Figure 8. Applying DoG instead of computing LoG response map in each scale space, adapted from [1]
Figure 9. Scale space produced by DoG, from [1, 10]
Figure 10. Octaves produced by the Gaussian pyramid (a sequence of repeated blurring and downsampling), from [13]
Figure 11. Approximated scale space by Gaussian pyramid and DoG, adapted from [1, 10]
Figure 12. Local extrema detection, from [13]
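
The idea behind Figures 7 to 12 (approximating the LoG response with differences of Gaussian-blurred images, then looking for points that are extreme among their neighbours in both space and scale) can be sketched in a few lines of NumPy. This is a minimal sketch rather than a full implementation: it builds a single octave only, skips the downsampling between octaves, and blurs the input directly at every scale. The function names are hypothetical, and the defaults `sigma0=1.6` and `intervals=3` are assumptions taken from the values commonly quoted for SIFT, not from this article.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Sampled, normalized 1-D Gaussian truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur; borders are handled by edge replication."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(image.astype(float), r, mode="edge")
    # Filter along rows, then along columns; 'valid' removes the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def dog_octave(image, sigma0=1.6, intervals=3):
    """Return (sigmas, DoG stack) for one octave with intervals + 3 blur levels."""
    k = 2.0 ** (1.0 / intervals)
    sigmas = [sigma0 * k ** i for i in range(intervals + 3)]
    blurred = np.stack([gaussian_blur(image, s) for s in sigmas])
    # Each DoG level is the difference of two adjacent blur levels.
    return sigmas, blurred[1:] - blurred[:-1]

def local_extrema(dog, threshold=0.0):
    """Return (scale_index, y, x) of strict maxima/minima among the 26 neighbours."""
    found = []
    n, h, w = dog.shape
    for s in range(1, n - 1):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = dog[s, y, x]
                if abs(v) <= threshold:
                    continue
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                others = np.delete(cube.ravel(), 13)  # index 13 is the centre sample
                if v > others.max() or v < others.min():
                    found.append((s, y, x))
    return found

# Synthetic test image: one Gaussian blob centred at (16, 16).
yy, xx = np.mgrid[0:33, 0:33]
image = np.exp(-((xx - 16) ** 2 + (yy - 16) ** 2) / (2.0 * 2.85 ** 2))
sigmas, dog = dog_octave(image)
keypoints = local_extrema(dog, threshold=0.01)
```

Each returned triple corresponds to one (x, y, σ) candidate from the overview, with the scale given as an index into `sigmas`. The triple loop is written for readability; a practical implementation would vectorize the neighbourhood comparison.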

Step (2): Keypoint localization and filtering

Figure 13. Visualization of why the keypoint localization is needed, from [14]
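
One common way to refine a candidate is to fit a quadratic to the DoG values around the sample and solve for its extremum, which gives an offset of -H⁻¹∇D from the sample point. The sketch below assumes that model; the function name `subpixel_offset` is hypothetical, and the offset along the scale axis is expressed in scale-index units rather than in σ.

```python
import numpy as np

def subpixel_offset(dog, s, y, x):
    """Offset (d_scale, d_y, d_x) of the quadratic-fit extremum from sample (s, y, x)."""
    c = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2].astype(float)
    # Central-difference gradient at the centre sample.
    grad = 0.5 * np.array([
        c[2, 1, 1] - c[0, 1, 1],
        c[1, 2, 1] - c[1, 0, 1],
        c[1, 1, 2] - c[1, 1, 0],
    ])
    # Finite-difference Hessian (second derivatives and mixed terms).
    dss = c[2, 1, 1] - 2.0 * c[1, 1, 1] + c[0, 1, 1]
    dyy = c[1, 2, 1] - 2.0 * c[1, 1, 1] + c[1, 0, 1]
    dxx = c[1, 1, 2] - 2.0 * c[1, 1, 1] + c[1, 1, 0]
    dsy = 0.25 * (c[2, 2, 1] - c[2, 0, 1] - c[0, 2, 1] + c[0, 0, 1])
    dsx = 0.25 * (c[2, 1, 2] - c[2, 1, 0] - c[0, 1, 2] + c[0, 1, 0])
    dyx = 0.25 * (c[1, 2, 2] - c[1, 2, 0] - c[1, 0, 2] + c[1, 0, 0])
    hess = np.array([[dss, dsy, dsx],
                     [dsy, dyy, dyx],
                     [dsx, dyx, dxx]])
    # Extremum of the fitted quadratic, relative to the sample point.
    return -np.linalg.solve(hess, grad)

# Synthetic DoG volume whose true extremum sits between samples, at (1.2, 4.8, 5.3).
ss, yy, xx = np.mgrid[0:3, 0:9, 0:9]
volume = -((ss - 1.2) ** 2 + (yy - 4.8) ** 2 + (xx - 5.3) ** 2)
offset = subpixel_offset(volume, 1, 5, 5)
```

Adding `offset` to the sample position (1, 5, 5) recovers the true extremum of this synthetic volume. If any component of the offset exceeds 0.5 in magnitude, the extremum lies closer to a neighbouring sample, and the fit is usually repeated there.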
Figure 14. Classification of image points using eigenvalues, from [1, 2]
Figure 15. Keypoint selection. (a) Given image. (b) 832 keypoint locations at extrema (maxima and minima) of the DoG function in scale space. (c) After removing extrema with low contrast, 729 keypoints survive. (d) After eliminating edge responses, 536 keypoints remain. From [15]
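
The two filtering stages in Figure 15 can be sketched as two small predicates. The edge test uses the 2×2 Hessian of the DoG image: as with the eigenvalue classification in Figure 14, a large ratio between the two principal curvatures indicates an edge, and the ratio can be checked through the trace and determinant without computing eigenvalues. The function names are hypothetical, and the thresholds (`0.03` for pixel values scaled to [0, 1], and `r=10`) are assumptions based on commonly cited SIFT defaults rather than values stated in this article.

```python
import numpy as np

def passes_contrast(dog_value, threshold=0.03):
    """Keep only extrema whose absolute DoG response reaches the threshold."""
    return abs(dog_value) >= threshold

def passes_edge_test(dog_slice, y, x, r=10.0):
    """Reject edge-like points using the ratio of principal curvatures."""
    d = dog_slice
    dxx = d[y, x + 1] - 2.0 * d[y, x] + d[y, x - 1]
    dyy = d[y + 1, x] - 2.0 * d[y, x] + d[y - 1, x]
    dxy = 0.25 * (d[y + 1, x + 1] - d[y + 1, x - 1]
                  - d[y - 1, x + 1] + d[y - 1, x - 1])
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign (or a flat direction): discard the point.
        return False
    # trace^2 / det grows with the curvature ratio; (r + 1)^2 / r is its value at ratio r.
    return trace * trace / det < (r + 1.0) ** 2 / r

# A round blob (similar curvature in x and y) and an elongated ridge (edge-like).
yy, xx = np.mgrid[-10:11, -10:11]
blob = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * 3.0 ** 2))
ridge = np.exp(-(xx ** 2 / (2.0 * 1.0 ** 2) + yy ** 2 / (2.0 * 20.0 ** 2)))
blob_ok = passes_edge_test(blob, 10, 10)
ridge_ok = passes_edge_test(ridge, 10, 10)
```

In this example the blob centre passes the edge test while the ridge centre is rejected, because the ridge curves strongly across its width and hardly at all along its length.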

Step (3): Orientation assignment

Figure 16. Gradient magnitude map weighted by a Gaussian kernel, which is then fed into the orientation histogram
Figure 17. Example of detecting the dominant orientation from 8 x 8 region, from [14]
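
A rough sketch of the orientation step illustrated in Figures 16 and 17: gradient magnitudes are weighted by a Gaussian window centred on the keypoint and accumulated into a histogram of gradient directions, and the histogram peaks give the assigned orientations. The function names are hypothetical. The 36 bins and the 0.8 peak ratio are assumptions based on commonly cited SIFT settings, and the parabolic interpolation of peak positions is left out for brevity, so orientations are reported as bin centres.

```python
import numpy as np

def orientation_histogram(patch, sigma, num_bins=36):
    """Gaussian-weighted histogram of gradient directions over a square patch."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)  # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    weight = np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * sigma ** 2))
    bins = (angle / (360.0 / num_bins)).astype(int) % num_bins
    return np.bincount(bins.ravel(),
                       weights=(magnitude * weight).ravel(),
                       minlength=num_bins)

def dominant_orientations(hist, peak_ratio=0.8):
    """Bin centres (degrees) of local peaks within peak_ratio of the highest bin."""
    n = len(hist)
    width = 360.0 / n
    peaks = []
    for i in range(n):
        left, right = hist[(i - 1) % n], hist[(i + 1) % n]
        if hist[i] >= peak_ratio * hist.max() and hist[i] > left and hist[i] > right:
            peaks.append((i + 0.5) * width)
    return peaks

# A patch whose intensity increases equally along x and y: gradient direction is 45 degrees.
yy, xx = np.mgrid[0:16, 0:16]
patch = (xx + yy).astype(float)
hist = orientation_histogram(patch, sigma=1.5 * 1.6)
peaks = dominant_orientations(hist)
```

Because every peak above the ratio is kept, a keypoint with two strong gradient directions yields two orientations, which matches the "one or more orientations" wording in the overview.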

Step (4): Generation of vector representation for interest points

Figure 18. Local keypoint descriptor of SIFT
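
A simplified sketch of how such a descriptor can be assembled: the patch is split into 4×4 cells, each cell contributes an 8-bin orientation histogram of Gaussian-weighted gradient magnitudes, and the resulting 128 values are normalized, clipped, and normalized again. The function name is hypothetical, and the sketch deliberately omits two parts of a full implementation: rotating the patch to the keypoint's assigned orientation and interpolating each sample across neighbouring cells and bins. The clip value `0.2` and the window width are assumptions based on commonly cited SIFT settings.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, bins=8, clip=0.2):
    """Flattened cells x cells x bins gradient histogram of a square patch."""
    patch = patch.astype(float)
    size = patch.shape[0]
    assert patch.shape == (size, size) and size % cells == 0
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    # Gaussian window with sigma equal to half the patch width.
    yy, xx = np.mgrid[0:size, 0:size]
    c = (size - 1) / 2.0
    weight = np.exp(-((xx - c) ** 2 + (yy - c) ** 2) / (2.0 * (size / 2.0) ** 2))
    weighted = magnitude * weight
    bin_idx = (angle / (360.0 / bins)).astype(int) % bins
    step = size // cells
    desc = np.zeros((cells, cells, bins))
    for cy in range(cells):
        for cx in range(cells):
            rows = slice(cy * step, (cy + 1) * step)
            cols = slice(cx * step, (cx + 1) * step)
            desc[cy, cx] = np.bincount(bin_idx[rows, cols].ravel(),
                                       weights=weighted[rows, cols].ravel(),
                                       minlength=bins)
    v = desc.ravel()
    v = v / (np.linalg.norm(v) + 1e-12)     # unit length: robust to contrast changes
    v = np.minimum(v, clip)                 # limit the influence of very large gradients
    return v / (np.linalg.norm(v) + 1e-12)  # renormalize after clipping

rng = np.random.default_rng(0)
patch = rng.random((16, 16))
descriptor = sift_like_descriptor(patch)
descriptor_scaled = sift_like_descriptor(patch * 2.0)
```

Since the descriptor is built from gradients and then normalized, multiplying the patch by a constant leaves it unchanged, which is why `descriptor` and `descriptor_scaled` agree.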

This completes the SIFT pipeline.

Figure 19. Using the computed local feature descriptors for feature matching, adapted from [1]
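
One simple way to use the descriptors is nearest-neighbour matching between two images, sketched below. The ratio test (accept a match only when the best distance is clearly smaller than the second-best) and its `0.8` threshold are assumptions borrowed from common SIFT practice and are not described in this article; the function name is hypothetical, and `desc_b` is assumed to contain at least two descriptors.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs: desc_b[j] is the nearest neighbour of desc_a[i] and passes the ratio test."""
    matches = []
    for i, d in enumerate(desc_a):
        dist = np.linalg.norm(desc_b - d, axis=1)  # Euclidean distance to every row of desc_b
        order = np.argsort(dist)
        best, second = order[0], order[1]
        if dist[best] < ratio * dist[second]:
            matches.append((i, int(best)))
    return matches

# Synthetic descriptors: rows 3 and 0 of desc_b reappear in desc_a with slight noise.
rng = np.random.default_rng(1)
desc_b = rng.random((5, 128))
desc_a = desc_b[[3, 0]] + 0.001 * rng.standard_normal((2, 128))
matches = match_descriptors(desc_a, desc_b)
```

The brute-force loop compares every descriptor pair, which is fine for a sketch; larger descriptor sets are usually matched with an approximate nearest-neighbour index instead.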

Appendix. Scale space construction with Gaussian pyramid and DoG




This publication is for organizing and posting what I’ve studied in the fields of CS and mathematics.
