Core Machine Learning For iOS Developers
At WWDC 2017, Apple introduced a simple way for developers to add AI capabilities to their iOS applications with Core ML. It can be used in a variety of domains like object detection, sentiment analysis, handwriting recognition, music tagging, etc. All of this can be integrated with a single file called a Core ML model and a couple of lines of code. 😏
What is a Core ML Model ⁉
It is a trained model converted into an Apple-formatted model file (.mlmodel) that can be added to your project.
What is a trained Model ‼︎⁉︎
It is the artifact produced by the training process. Training a model involves providing a learning algorithm with training data to learn from.
Since the Core ML model is the key piece, I chose to use a model provided by Apple instead of converting a trained model myself. At the moment there are four options available: Places205-GoogLeNet, ResNet50, Inception v3, and VGG16.
To run the demo app or play around with Core ML, you'll need Xcode 9 beta (brace yourself, it's truly beta 😖).
The app will let the user select a picture from the photo gallery; the model will then infer the dominant object in it.
Open Xcode and create a new Single View App project. Go to Main.storyboard and set up the UI you'd like for the app. I've picked a simple UI: a button to open the gallery, an image view to show the selected image, and a text view to show the prediction results.
Then hook the button, image view, and text view up to the view controller. Next, set up a UIImagePickerController, implement its delegate method imagePickerController(_:didFinishPickingMediaWithInfo:), and retrieve the image from the info dictionary.
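A minimal sketch of this wiring might look like the following (the outlet and action names are illustrative, not from the original project):

```swift
import UIKit

class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {

    // Outlets hooked up in Main.storyboard (names are illustrative)
    @IBOutlet weak var imageView: UIImageView!
    @IBOutlet weak var resultsTextView: UITextView!

    @IBAction func openGalleryTapped(_ sender: UIButton) {
        let picker = UIImagePickerController()
        picker.sourceType = .photoLibrary
        picker.delegate = self
        present(picker, animated: true)
    }

    // Retrieve the selected image from the info dictionary
    func imagePickerController(_ picker: UIImagePickerController,
                               didFinishPickingMediaWithInfo info: [String: Any]) {
        if let image = info[UIImagePickerControllerOriginalImage] as? UIImage {
            imageView.image = image
        }
        picker.dismiss(animated: true)
    }
}
```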
At this point your project should run: you can select an image and see it in your image view. Tip: you can open Safari in the simulator and save some images to add variety to your gallery.
Add Core ML
Once you get the .mlmodel file, drag and drop it into your project. Make sure it's added to your app target (do not add it to a test target if you have one, as that won't generate the model class). Then select the file in the project navigator.
You can see some metadata related to the model, but most importantly you can see the auto-generated model class that we will use in our code. We can also see that the model takes an image as input and produces a dictionary of class probabilities and a string label as output.
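To get a feel for that generated interface, you could call the model class directly. Here is a hedged sketch, assuming the Inceptionv3 model and that you already have the image as a CVPixelBuffer:

```swift
import CoreML

// Assuming Inceptionv3.mlmodel was added to the app target,
// Xcode generates an Inceptionv3 class with a typed prediction method.
func classify(pixelBuffer: CVPixelBuffer) {
    let model = Inceptionv3()
    // Input: a 299x299 image as a CVPixelBuffer.
    // Output: classLabel (String) and classLabelProbs ([String: Double]).
    if let output = try? model.prediction(image: pixelBuffer) {
        print(output.classLabel)
        print(output.classLabelProbs[output.classLabel] ?? 0)
    }
}
```

Producing that CVPixelBuffer yourself is tedious, which is exactly the problem Vision solves in the next section.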
CoreML & Vision
In order to use the intelligence in our code we have to import CoreML and Vision. CoreML is imported so we can use the auto-generated model class in our view controller. And since the model's image input is of type CVPixelBuffer, Vision makes our lives much easier: we can hand it a CGImage and it handles the conversion for us.
It's worth noting that Vision is only relevant for models that take images as input. If our model took word counts, for example, it would make more sense to use NLP (NSLinguisticTagger) instead.
Now back to our view controller. Start by importing CoreML and Vision. Create a new function; let's call it processImage(). Note that this implementation is specific to Vision: if you are using a model that takes a String or Double as input, you won't do it this way.
To submit the input, we start by initializing a VNCoreMLModel with our Inceptionv3 model. Then we create a VNCoreMLRequest. Finally, we set up a VNImageRequestHandler and call perform with the request as a parameter.
To retrieve the output, we use the VNCoreMLRequest completion closure, whose request object exposes a results property of type [VNClassificationObservation]. That's it; this is the output. ⭐️✨
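Put together, the processImage() flow described above might look like this sketch (assuming the Inceptionv3 model and the imageView / resultsTextView outlets from earlier):

```swift
import CoreML
import Vision

func processImage() {
    guard let image = imageView.image,
          let cgImage = image.cgImage,
          let model = try? VNCoreMLModel(for: Inceptionv3().model) else { return }

    // The request's completion closure delivers the classification results
    let request = VNCoreMLRequest(model: model) { request, error in
        guard let results = request.results as? [VNClassificationObservation] else { return }
        // Show the top predictions with their confidence
        let text = results.prefix(3)
            .map { "\($0.identifier): \(Int($0.confidence * 100))%" }
            .joined(separator: "\n")
        DispatchQueue.main.async {
            self.resultsTextView.text = text
        }
    }

    // Vision handles the CGImage -> CVPixelBuffer conversion for us
    let handler = VNImageRequestHandler(cgImage: cgImage)
    do {
        try handler.perform([request])
    } catch {
        print(error)
    }
}
```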
Now call processImage() from the imagePickerController delegate method and update the text view. Here is the full code of the view controller, under 90 lines of code:
Let's run the app and see what the prediction results look like. 😍
I’m looking forward to your feedback 🤓