Ten years ago, computer vision researchers thought that getting a computer to tell the difference between a cat and a dog would be almost impossible. Today, thanks to significant advances in artificial intelligence, it can be done with greater than 99% accuracy. This task is called image classification: a label is assigned to an image, and the computer is tasked with working out which of thousands of categories the image falls under. Computer vision is truly coming of age, and developing new object detection applications with deep learning has become easier than one might think.
Darknet, a neural network framework for training and testing computer vision models, offers such fine granularity that, in addition to identifying different animals, it can also tell different species apart. When a classifier runs on an image, it assigns the image its best-matching category. When object detection is used, the model tries to find all the objects in the image, put bounding boxes around them and categorize each one. Computer vision algorithms can also identify objects' relative locations and sizes, and may even capture additional information that was not sought. Information of this kind about the physical world is necessary when building a system on top of computer vision, such as a self-driving vehicle, a robotic system or a security system.
One such effective detection system is YOLO (You Only Look Once). As the name suggests, it looks at the image only once, but in a clever fashion, and detects all the classes present in it. This makes it quick (45 frames per second) and accurate, unlike earlier algorithms that ran a classifier thousands of times per image. In the final output, detected objects are enclosed in rectangles called bounding boxes. YOLO first divides the image into an N × N grid of cells, and each cell is responsible for predicting bounding boxes. For each bounding box, the cell predicts a confidence score for whether the box actually contains an object, and it also predicts the class the object falls under. The box confidence and the class prediction are then combined into one final score that gives the probability that the particular bounding box contains a specific type of object.
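The scoring step can be illustrated with a minimal sketch. The names `BoxPrediction`, `responsible_cell` and `class_scores`, the 7 × 7 grid and all the numbers are illustrative assumptions, not part of any YOLO library. The sketch also assumes, following the original YOLO design, that the cell containing the centre of an object is the one responsible for it.

```python
from dataclasses import dataclass


@dataclass
class BoxPrediction:
    """One predicted box; coordinates are fractions of the image size (assumed layout)."""
    x: float           # box centre, horizontal
    y: float           # box centre, vertical
    w: float           # box width
    h: float           # box height
    confidence: float  # how sure the cell is that the box contains an object


def responsible_cell(x_center, y_center, grid_size):
    """Return the (row, col) of the grid cell that contains the box centre."""
    col = min(int(x_center * grid_size), grid_size - 1)
    row = min(int(y_center * grid_size), grid_size - 1)
    return row, col


def class_scores(box, class_probs):
    """Combine the box confidence with each class probability into a final score."""
    return {label: box.confidence * prob for label, prob in class_probs.items()}


# Hypothetical prediction: a confident box whose contents look most like a cat.
box = BoxPrediction(x=0.52, y=0.31, w=0.20, h=0.35, confidence=0.9)
cell = responsible_cell(box.x, box.y, grid_size=7)
scores = class_scores(box, {"cat": 0.8, "dog": 0.15, "car": 0.05})
best_label = max(scores, key=scores.get)
print(cell, best_label, round(scores[best_label], 2))
```

Multiplying the two numbers means a box only scores highly when the cell is confident both that an object is present and about what that object is.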
Although a single picture may contain many candidate bounding boxes, only those with the highest confidence levels are presented in the end. The threshold that decides which boxes are kept can be adjusted manually. YOLO's architecture is a convolutional neural network whose output channels carry the bounding box and class data together, all at once. Because these outputs are computed in parallel in a single pass, without repeated iterations, YOLO's robustness and speed have become its biggest assets.
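A minimal sketch of the thresholding step might look like the following. The function name `filter_detections`, the dictionary layout and the sample scores are assumptions made for illustration; real YOLO implementations typically also apply non-maximum suppression to remove overlapping boxes, which is not shown here.

```python
def filter_detections(detections, threshold=0.5):
    """Keep only detections whose score meets the threshold, highest score first."""
    kept = [d for d in detections if d["score"] >= threshold]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


# Hypothetical detections for one image: label, final score and box corners in pixels.
detections = [
    {"label": "dog", "score": 0.91, "box": (40, 60, 200, 220)},
    {"label": "cat", "score": 0.34, "box": (210, 80, 330, 240)},
    {"label": "bicycle", "score": 0.62, "box": (15, 30, 310, 260)},
]

strict = filter_detections(detections, threshold=0.7)
lenient = filter_detections(detections, threshold=0.3)
print([d["label"] for d in strict])
print([d["label"] for d in lenient])
```

Raising the threshold shows fewer but more reliable boxes, while lowering it shows more boxes at the cost of more false alarms.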
Object detection is widely applied across industries in traditional areas such as machine inspection, video surveillance and face recognition. Its future scope is already visible in self-driving cars. The precision and speed of object detection can also be put to good use in transportation and commuting. Tracking allows the system to determine the precise location of different cars, while object detection provides instantaneous information about the size and relative distance of obstacles.
At Blossom Academy, we believe African countries can make the best use of object detection systems in healthcare and security. In healthcare, medical image analysis can be performed using image extraction or object detection to support predictive analytics and therapy; identifying cancer cells in a tissue biopsy is one example. In security, as the crime rate has increased drastically over the decades, object detection systems can be used to identify unattended and unknown objects left in crowded areas such as airports and shopping malls, using the background subtraction technique. Although these techniques are still at the research level, they are of great importance and could bring enormous change to their respective fields if properly utilized.
Composed by Sahithi Lakamana
Blossom Academy is on a movement to build the next generation of African data scientists. We train and pair data science talent with tech businesses across Africa. #Comeblossom