Machine Learning and Object Detection Application — Part 1
Nowadays, machine learning [1] is increasingly popular across a wide range of applications [2].
In this article, we explain how to implement an object recognizer with object detection techniques in a simple way, through a study based on a project for the Apple Developer Academy. The primary goal of the project is the development of a machine learning-based iOS application.
The beginning
In 2017, Apple introduced Core ML [3], a breakthrough framework for developing machine learning-based applications. We will see how to integrate machine learning into our iOS app with just a few lines of code, in the “Swift-est” way possible.
From image classification to object detection
The first step in this process was to create an image classifier model [4] to recognize different kinds of fruit, so that when our app receives an input image it can determine whether the image shows a fruit and, if so, which kind. To make the implementation smoother and more interesting, the image classifier was then replaced with an object detection model [5] that recognizes an object (a fruit, in our case) in a live camera stream [8] rather than waiting for a user-supplied image [6]. To accomplish this, we used two main frameworks: Core ML and Vision [7]; the latter applies computer vision algorithms to perform tasks on input images and videos.
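To give a rough idea of how little code is involved, the sketch below wires a Core ML object detection model into Vision and reads back the recognized objects. The FruitDetector class name is a placeholder for the class Xcode generates from the trained .mlmodel file; everything else uses standard Vision APIs (VNCoreMLModel, VNCoreMLRequest, VNRecognizedObjectObservation). Reference [8] walks through the full live-capture pipeline in more detail.

import CoreML
import Vision

// Placeholder: "FruitDetector" stands for the model class that Xcode
// auto-generates from the trained .mlmodel file.
func makeDetectionRequest() throws -> VNCoreMLRequest {
    let coreMLModel = try FruitDetector(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
        for observation in results {
            // Each observation carries a normalized bounding box and ranked labels.
            let best = observation.labels.first
            print(best?.identifier ?? "unknown", best?.confidence ?? 0, observation.boundingBox)
        }
    }
    request.imageCropAndScaleOption = .scaleFill
    return request
}

// For live streaming, each camera frame (a CVPixelBuffer from AVFoundation)
// is handed to Vision through an image request handler:
// try VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
//     .perform([detectionRequest])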
Image dataset
Originally, we created an image dataset divided into folders, each one representing a single object (fruit) and containing a set of images of it. To keep recognition reasonably accurate, each folder contained at least 20–30 images of the fruit, taken from different angles, in JPEG format and no larger than 1024x768 pixels. Then, 80% of the images were used to train the model and the remaining 20% to test it; it is crucial to keep training images and testing images in two separate folders. This scheme is the one used to create an image classifier, but since we chose to build an object detection model, we needed to create a JSON file describing every object in each image, so it was no longer necessary to divide the images into per-object subfolders.
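As an illustration of the split described above, here is a minimal Foundation sketch that shuffles the images and copies 80% of them into a training folder and 20% into a testing folder. The directory paths are placeholders, not the actual paths used in the project.

import Foundation

// Placeholder paths for illustration only.
let sourceDir = URL(fileURLWithPath: "/path/to/dataset/all")
let trainDir  = URL(fileURLWithPath: "/path/to/dataset/train")
let testDir   = URL(fileURLWithPath: "/path/to/dataset/test")

let fileManager = FileManager.default
try fileManager.createDirectory(at: trainDir, withIntermediateDirectories: true)
try fileManager.createDirectory(at: testDir, withIntermediateDirectories: true)

// Shuffle so the split is not biased by file ordering, then send the first
// 80% of the JPEG images to the training folder and the rest to testing.
let images = try fileManager.contentsOfDirectory(at: sourceDir, includingPropertiesForKeys: nil)
    .filter { $0.pathExtension.lowercased() == "jpg" }
    .shuffled()

let splitIndex = Int(Double(images.count) * 0.8)
for (index, image) in images.enumerated() {
    let destination = (index < splitIndex ? trainDir : testDir)
        .appendingPathComponent(image.lastPathComponent)
    try fileManager.copyItem(at: image, to: destination)
}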
JSON file structure
The annotation file is a JSON array with two properties for each element: the image name and the image annotations. The annotations property is in turn another JSON array listing all identifiable objects inside the image; for each object it specifies the type (label) and the coordinates inside the image, i.e. the position, width, and height of its bounding box.
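Here is a hedged sketch of what such a file can look like, together with matching Swift Codable types for parsing it. The field names follow the annotation format used by Create ML's object detection template [5]; the concrete file name, label, and pixel values are purely illustrative.

import Foundation

struct ImageAnnotation: Codable {
    let image: String             // file name of the image
    let annotations: [Annotation] // every labeled object in that image
}

struct Annotation: Codable {
    let label: String             // object type, e.g. "banana"
    let coordinates: Coordinates  // bounding box inside the image
}

struct Coordinates: Codable {
    let x: Double                 // in Create ML's format, x and y are
    let y: Double                 // typically the box center, in pixels
    let width: Double
    let height: Double
}

// Illustrative annotations for a single image containing one banana.
let json = """
[
  {
    "image": "banana_01.jpg",
    "annotations": [
      { "label": "banana",
        "coordinates": { "x": 160, "y": 120, "width": 80, "height": 60 } }
    ]
  }
]
"""

let dataset = try JSONDecoder().decode([ImageAnnotation].self, from: Data(json.utf8))
print(dataset.first?.annotations.first?.label ?? "none")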
To Be Continued…
You can find the second part of this article here.
A big greeting from Andrea Capone and Gianluca De Lucia, two computer science grads.
Thank you all for reading.
For more information, you can contact us at our e-mail addresses:
Andrea: capone.andrea195@gmail.com
Gianluca: gianluca.delucia.94@gmail.com
GitHub profiles:
Andrea: https://github.com/One195/
Gianluca: https://github.com/gigernau
Fruitable application link:
https://github.com/One195/Fruitable
Resources
[1] Machine Learning: Definition and Application Examples, https://www.spotlightmetal.com/machine-learning--definition-and-application-examples-a-746226/?cmp=go-aw-art-trf-SLM_DSA-20180820&gclid=Cj0KCQjw1Iv0BRDaARIsAGTWD1swNAKBOUREuv86sCaU1osO-hJXIyPDkLbzGAUnCfHErd3vnbeh5nwaAt6uEALw_wcB
[2] Machine Learning: Applications, https://www.geeksforgeeks.org/machine-learning-introduction/
[3] Core ML Documentation, https://developer.apple.com/documentation/coreml
[4] Creating an Image Classifier Model, https://developer.apple.com/documentation/createml/creating_an_image_classifier_model
[5] Training Object Detection Models in Create ML, https://developer.apple.com/videos/play/wwdc2019/424/
[6] Classifying Images with Vision and Core ML, https://developer.apple.com/documentation/vision/classifying_images_with_vision_and_core_ml
[7] Vision Documentation, https://developer.apple.com/documentation/vision
[8] Recognizing Objects in Live Capture, https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture