Label Recognition with ORB

My journey into the world of Computer Vision.

Published in Open Knowledge · 5 min read · Sep 6, 2019

A few days ago I started looking around the Computer Vision field (I'm very new to it). I bumped into an interesting family of recognition techniques called Feature Detection and wanted to build something with it.

I can't explain the math behind this algorithm or how it works in detail, because I'm not an expert in Computer Vision. This is my first attempt at learning and exploring this amazing field. You may want to dive down this rabbit hole HERE or through other online resources that explain it in much more detail. There are a lot of Feature Detection algorithms, but I will choose ORB (Oriented FAST and Rotated BRIEF) because it's fast and not patented. And here it is, my attempt to explain what I just learned and put it into practice. You can find my project in this GitHub repository here.

Oriented FAST and Rotated BRIEF

ORB (Oriented FAST and Rotated BRIEF) is one of the Feature Detection algorithms. It was developed at OpenCV Labs and is a good alternative to SIFT and SURF. It combines the FAST keypoint detector and the BRIEF descriptor, with many modifications that boost performance.

Yes, SIFT and SURF are patented and you are supposed to pay for their use. But ORB is not!

The goal is to find features in an image and match them with features in other images, without worrying too much about the lighting, scale, or orientation of each image. We can use this as a mechanism to find or detect the target label in a scene.

Label Recognition

In this example, we will try to match any wine label in the scene with our train wine labels, using the ORB implementation that comes with OpenCV-Python as the detector. We will need OpenCV and NumPy as dependencies. If you haven't installed them, open the terminal in your project directory and type:

pip install opencv-python
pip install numpy

After that, we can import OpenCV, NumPy, and some other libraries that we will need in the project.

The first order of business is to load our sample images and our train images (the reference images we will use to identify each wine), so we can compare them with images captured in different scenes.

And here is the utility function that helps with loading the image files.

Now we will define a function that can detect and match those key points.
By default, ORB retains a maximum of only 500 features. In my case, I find that a maximum of 2000 features gives a better result. With the magic of OpenCV, we can detect the key points of both images and compute their descriptors with just two lines of code.

Now that we have all of our key points, we can draw them on our image to see where they are.

Features found on one of our training images

Feature Matching

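Now we need to find a way to connect the dots and see whether our wine is present in the query image scene. That's where Feature Matching comes into play. OpenCV offers two algorithms for Feature Matching: the Brute-Force matcher, which compares every descriptor in one image against every descriptor in the other, and FLANN (Fast Library for Approximate Nearest Neighbors), which uses approximate nearest-neighbor search.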

We will choose FLANN as our matcher, trading some accuracy for speed.

As a result, the key points of the two images are now matched together, as shown in the image below.

Not all the matches are good. You can see that a lot of them point to random places, which is not what we want, and it's hard to identify which wine is in the scene. Because we use knnMatch with k=2, we can apply the ratio test explained by D. Lowe in his paper to filter our matches.
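The ratio test keeps a match only when its best candidate is clearly closer than the second-best one. Here is a minimal sketch; the function name `filter_good_matches` and the 0.75 threshold are assumptions (Lowe's paper suggests a value around 0.8), and the function only relies on each match having a `distance` attribute, as OpenCV's `DMatch` does.

```python
def filter_good_matches(knn_matches, ratio=0.75):
    """Keep the best match of each pair only if it passes the ratio test."""
    good = []
    for pair in knn_matches:
        # An approximate matcher can return fewer than 2 neighbors; skip those.
        if len(pair) < 2:
            continue
        best, second = pair
        # A distinctive match is much closer than the runner-up.
        if best.distance < ratio * second.distance:
            good.append(best)
    return good
```

A lower ratio is stricter and keeps fewer, more reliable matches; a higher ratio keeps more matches but lets more noise through.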

Now we can view our result again and see a much cleaner set of matches.

By repeating this process for every wine train image and counting the good matches found for each one, we can say that the train image with the most good matches is the wine present in the scene.
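The final decision is then a simple "most good matches wins" vote. A sketch, assuming the good matches for each train image have already been collected into a dictionary (the function name and the file names in the usage example are hypothetical):

```python
def predict_label(good_matches_by_train_image):
    """Return the train image name with the most good matches, or None if empty.

    `good_matches_by_train_image` maps a train image name to the list of
    good matches found between that train image and the query image.
    """
    if not good_matches_by_train_image:
        return None
    return max(
        good_matches_by_train_image,
        key=lambda name: len(good_matches_by_train_image[name]),
    )


# Usage with made-up counts: 12, 40, and 7 good matches respectively.
example = {
    "wine_a_front.jpg": [None] * 12,
    "wine_b_back.jpg": [None] * 40,
    "wine_c_front.jpg": [None] * 7,
}
best = predict_label(example)  # "wine_b_back.jpg"
```

Note that this always picks a winner, even when every count is low, so a query image that contains none of the known wines would still get a label unless a minimum-count check is added.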

Result

We have 50 sample query images and 31 train images (front and back of each bottle) covering 6 wines, two of which are very similar to each other. As a result, we get 76% accuracy in predicting which wine is in the scene, and it takes on average 3.5 seconds to process one sample image on my laptop (Core i5-5200U).

Conclusion

As we can see, this result is decent but not good enough. The program can't tell wines apart when the labels differ only by year, or when they share a brand but have different grape varieties, and it runs slowly as the dataset gets bigger. There are many more ways to deal with Content-Based Image Retrieval problems like this, such as using OCR to read the wine label or training a machine learning model for image recognition. Also, keep in mind that this is my first attempt at testing what I have learned. I might have gotten something wrong, or not yet found an add-on technique that would improve this process.