SIFT feature detection - Image Processing

explanation image-processing python

Feature detection

Feature detection is one of the most important stages of any image processing task. Detecting unique features in an image allows a computer to recognize objects in the image, giving way to more complex tasks such as image stitching, object tracking, and even 3D reconstruction.

SIFT (Scale Invariant Feature Transform)

The most basic feature detectors focus on finding simple features in images, such as corners (Harris detector) or edges (Canny detector). These features, however, are often affected by changes in scale.

SIFT, on the other hand, aims to produce scale invariant features (features not affected by scale) with descriptors that perform well in the feature matching stage of the image processing pipeline.

Scaling affects feature detection

As seen above, features might look different at different scales. To produce features that can be recognized at different scales, we have to search for the features across a scale space.

Scale space for feature searching

Difference of Gaussian

As you can see from the images of the cat above, prominent features such as the unique shape of the nose and the eyes remain prominent even after applying a Gaussian filter.

By taking the difference of two Gaussian-filtered images, we can extract prominent features that perform well in more complex tasks such as keypoint matching.

Difference of Gaussian
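A minimal NumPy sketch of this idea (not OpenCV's implementation): blur the image at two different sigmas with a separable Gaussian filter and subtract. The function and parameter names here are illustrative.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    # 1D Gaussian kernel, normalized to sum to 1
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    # Separable blur: filter the rows, then the columns
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def difference_of_gaussian(img, sigma1, sigma2):
    # Subtract the lightly blurred image from the heavily blurred one;
    # flat regions cancel out, edges and blobs survive
    return gaussian_blur(img, sigma2) - gaussian_blur(img, sigma1)
```

On a step-edge image, the response is (near) zero in the flat regions and peaks around the edge, which is exactly why the cat's nose and eyes stand out in the DoG images above.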

Scale-space Local Extrema

The features in an image are essentially the local extrema within the search window. To find the minima and maxima, the target pixel (colored yellow) is compared with its neighboring pixels (colored blue), including those in the scales above and below. If it is larger or smaller than all of its neighbors, it is marked as a feature.

Scale-space local extrema
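The comparison above can be sketched directly in NumPy: given a stack of DoG images, check each pixel against its 26 neighbors, the 8 around it in the same scale plus 9 in each adjacent scale. This is an illustrative sketch, not OpenCV's implementation.

```python
import numpy as np

def is_local_extremum(dog, s, y, x):
    # dog is a (scales, height, width) stack of DoG images.
    # Take the 3x3x3 cube of neighbors around (s, y, x).
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2].ravel()
    center = cube[13]                # middle of the 3x3x3 cube
    others = np.delete(cube, 13)     # the 26 neighbors
    # Strictly larger or strictly smaller than every neighbor
    return bool((center > others).all() or (center < others).all())

def find_extrema(dog):
    # Scan the interior pixels of the middle scales for extrema
    points = []
    S, H, W = dog.shape
    for s in range(1, S - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                if is_local_extremum(dog, s, y, x):
                    points.append((s, y, x))
    return points
```

Real SIFT additionally filters these candidates, discarding low-contrast points and edge responses, before they become keypoints.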

Orientation detection

As with most feature detectors, SIFT uses the gradient of the image patch to distinguish between features. For each feature, the dominant direction of the gradient is computed. Assigning each keypoint an orientation produces rotation invariant features, allowing them to remain distinguishable across different images.

Detecting orientation of a feature
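A simplified sketch of that orientation step, assuming a grayscale patch around the keypoint: build a magnitude-weighted histogram of gradient directions and take the peak bin. (Real SIFT also Gaussian-weights the contributions and keeps secondary peaks above 80% of the maximum; those details are omitted here.)

```python
import numpy as np

def dominant_orientation(patch, nbins=36):
    # Image gradients: np.gradient returns (d/drow, d/dcol) = (gy, gx)
    gy, gx = np.gradient(patch.astype(float))
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)   # direction in [0, 2*pi)
    mag = np.hypot(gx, gy)                        # gradient magnitude
    # Histogram of directions, weighted by magnitude
    hist, _ = np.histogram(ang, bins=nbins, range=(0, 2 * np.pi), weights=mag)
    peak = hist.argmax()
    # Return the centre of the peak bin, in radians
    return (peak + 0.5) * 2 * np.pi / nbins
```

For a patch whose intensity increases left to right, the gradient points along the x axis, so the returned orientation is close to 0 radians.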

SIFT Descriptors

Descriptors, as the name suggests, are used to describe the features so that in later stages of the image processing pipeline, the feature matcher can tell the different keypoints apart.

SIFT computes the gradient of the small image patches that make up the feature, quantizing each orientation into one of 8 directions. The gradient information in the 16x16 window is then encoded into a 4x4 grid of histograms, leading to a 128-dimensional (4 × 4 × 8) feature vector.

Feature descriptors
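The layout of that 128-dimensional vector can be sketched as follows, assuming a 16x16 patch already rotated to the keypoint orientation. This toy version skips SIFT's Gaussian weighting, trilinear interpolation, and clipping; only the 4x4-cells-times-8-bins structure is shown.

```python
import numpy as np

def sift_like_descriptor(patch):
    # 16x16 patch -> 4x4 grid of cells, 8 orientation bins per cell
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)   # [0, 2*pi)
    desc = np.zeros((4, 4, 8))
    for cy in range(4):
        for cx in range(4):
            cell = (slice(4 * cy, 4 * cy + 4), slice(4 * cx, 4 * cx + 4))
            # Quantize each gradient direction into one of 8 bins,
            # accumulating the gradient magnitude
            bins = (ang[cell] / (2 * np.pi) * 8).astype(int) % 8
            for b, m in zip(bins.ravel(), mag[cell].ravel()):
                desc[cy, cx, b] += m
    # Flatten 4x4x8 -> 128 and normalize to unit length
    v = desc.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

Normalizing the vector to unit length is what makes the descriptor robust to uniform changes in illumination.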

Try it out

If you are interested in trying out SIFT, I would suggest the OpenCV implementation of SIFT feature detection, where you can easily plot out the features. OpenCV is pretty easy to set up, and if you are feeling up to it, there are many tutorials out there that teach you to use the library to accomplish more complex tasks such as image stitching and 3D reconstruction.

OpenCV SIFT detection