Kick-start Machine Learning in iOS apps.
CoreML: Learning with Experience
What is Machine Learning?
Machine Learning is a sub-branch of AI in which we create and train models by feeding sufficient, accurate and relevant data to a machine learning algorithm.
We then evaluate the created model using a test data set and check its predictions and accuracy. We keep repeating this process until we get the right predictions with acceptable accuracy. It takes a lot of time and effort to come up with the final successful model.
Apple has saved us this effort by providing a lot of successfully trained models on the Apple Developer website, along with a framework (CoreML) to use them.
What is CoreML?
CoreML is a machine learning framework provided by Apple for integrating machine learning models into iOS apps. It is the foundation framework for all the domain-specific ML frameworks and functionalities provided by Apple:
- Vision for analysing images
- Natural Language for processing text
- Speech for converting audio to text
- Sound Analysis for identifying sounds in audio
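Each of these runs Apple-trained models under the hood. As a quick, illustrative taste (not needed for the rest of this article), here is a minimal Natural Language snippet that detects the dominant language of a string:

import NaturalLanguage

// NLLanguageRecognizer uses an Apple-provided model to guess a string's language.
let recognizer = NLLanguageRecognizer()
recognizer.processString("Bonjour tout le monde")
print(recognizer.dominantLanguage?.rawValue ?? "unknown") // prints "fr"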
How does CoreML work?
CoreML uses a specific model format (.mlmodel) to run machine learning models on Apple devices.
So, in order to work with CoreML, a machine learning model has to be compatible with CoreML. Models from third-party frameworks like TensorFlow, Caffe or ONNX have to be converted to .mlmodel using coremltools to make them compatible with CoreML.
Now, let’s see CoreML in action
Let’s create a simple app to demonstrate CoreML at work. We will use the MobileNetV2 (image classification) model provided by Apple to detect the object in an image.
Step#1: Create a new iOS project and name it ImageClassification.
Step#2: Add the CoreML model to the Xcode project:-
Download the MobileNetV2 model from here and drag it into your iOS project. As soon as you drag a model into an Xcode project, Xcode automatically generates a wrapper class to use that model. Select the model you just dragged into your Xcode project to see the wrapper class details.
In most cases we use this wrapper class to get observations. However, when there is a need to access the underlying MLModel interface, just use the wrapper class’s model property. Every model has a predefined input and output, e.g. the MobileNetV2 model has the input & output shown below.
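If you prefer to inspect a model’s input and output in code rather than in the Xcode model viewer, here is a quick sketch (the names in the comment come from Apple’s MobileNetV2 model page and may differ for other models):

import CoreML

// Print the model's declared inputs/outputs. For MobileNetV2 this reports a
// 224 x 224 colour image as input, and `classLabel` (String) plus
// `classLabelProbs` ([String: Double]) as outputs.
do {
    let classifier = try MobileNetV2(configuration: MLModelConfiguration())
    print(classifier.model.modelDescription.inputDescriptionsByName)
    print(classifier.model.modelDescription.outputDescriptionsByName)
} catch {
    print("Failed to load MobileNetV2: \(error)")
}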
We can either use the MobileNetV2 wrapper class directly and handle the input/output formatting ourselves (like resizing the image to 224 x 224 or converting the UIImage to a CVPixelBuffer, etc.), OR just use the Vision framework, which takes an image classification model as input and provides observations (VNClassificationObservation) to us. In this example I used the Vision framework to show the top two observations on screen.
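For completeness, here is a minimal sketch of the direct route (without Vision), assuming you have already produced a 224 x 224 CVPixelBuffer from your image; the property names come from the auto-generated wrapper and may vary slightly between model versions:

import CoreML
import CoreVideo

// Direct use of the generated wrapper: you own the input formatting.
// `pixelBuffer` is assumed to already be a 224 x 224 colour CVPixelBuffer.
func classifyDirectly(_ pixelBuffer: CVPixelBuffer) throws -> (label: String, confidence: Double) {
    let model = try MobileNetV2(configuration: MLModelConfiguration())
    let output = try model.prediction(image: pixelBuffer)
    let confidence = output.classLabelProbs[output.classLabel] ?? 0
    return (output.classLabel, confidence)
}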
Step#3: Working with the Vision framework:-
Let me highlight the 4 main steps involved in working with the Vision framework, which internally interacts with CoreML to make observations. The same steps apply when working with any other image classification model.
The code below is also available in the sample code provided by Apple. I just copy-pasted it into my sample project and made small changes as per my needs.
#3.1: Initialise the Vision core model VNCoreMLModel by providing an instance of the model wrapper class MobileNetV2.
import CoreML
import Vision

class ImagePredictor {
    private static let imageClassifier = createImageClassifier()

    /// Creates the Vision model from the generated MobileNetV2 wrapper class.
    static func createImageClassifier() -> VNCoreMLModel {
        // Use a default model configuration.
        let defaultConfig = MLModelConfiguration()

        // Create an instance of the image classifier's wrapper class.
        let imageClassifierWrapper = try? MobileNetV2(configuration: defaultConfig)
        guard let imageClassifier = imageClassifierWrapper else {
            fatalError("App failed to create an image classifier model instance.")
        }

        // Get the underlying model instance.
        let imageClassifierModel = imageClassifier.model

        // Create a Vision instance using the image classifier's model instance.
        guard let imageClassifierVisionModel = try? VNCoreMLModel(for: imageClassifierModel) else {
            fatalError("App failed to create a `VNCoreMLModel` instance.")
        }

        return imageClassifierVisionModel
    }
}
#3.2: Create a Vision image request VNImageBasedRequest (a sub-class of VNRequest) using the Vision core model created in step 1.
/// Generates a new request instance that uses the Image Predictor's image classifier model.
private func createImageClassificationRequest() -> VNImageBasedRequest {
    // Create an image classification request with an image classifier model.
    let imageClassificationRequest = VNCoreMLRequest(model: ImagePredictor.imageClassifier,
                                                     completionHandler: visionRequestCompletionHandler)
    imageClassificationRequest.imageCropAndScaleOption = .centerCrop
    return imageClassificationRequest
}
#3.3: Create the Vision request handler VNImageRequestHandler and execute the request created in step 2.
/// Starts the image classification request for the given photo.
func makePredictions(for photo: UIImage, completionHandler: @escaping ImagePredictionHandler) throws {
    guard let photoImage = photo.cgImage else {
        fatalError("Photo doesn't have underlying CGImage.")
    }

    let orientation = CGImagePropertyOrientation(photo.imageOrientation)
    let imageClassificationRequest = createImageClassificationRequest()
    predictionHandlers[imageClassificationRequest] = completionHandler

    // Create the Vision request handler.
    let handler = VNImageRequestHandler(cgImage: photoImage, orientation: orientation)
    let requests: [VNRequest] = [imageClassificationRequest]

    // Start the image classification request.
    try handler.perform(requests)
}
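Note that CGImagePropertyOrientation(photo.imageOrientation) is not a built-in initialiser; Apple’s sample defines it as a small extension that maps UIImage.Orientation to its Core Graphics counterpart, roughly like this:

import UIKit
import ImageIO

extension CGImagePropertyOrientation {
    /// Maps a UIImage orientation to the corresponding CGImage orientation.
    init(_ orientation: UIImage.Orientation) {
        switch orientation {
        case .up:            self = .up
        case .upMirrored:    self = .upMirrored
        case .down:          self = .down
        case .downMirrored:  self = .downMirrored
        case .left:          self = .left
        case .leftMirrored:  self = .leftMirrored
        case .right:         self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default:    self = .up
        }
    }
}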
#3.4: Once the request is completed, visionRequestCompletionHandler will be called, where we use the map function to cast the observations to prediction structures that will eventually be passed to the view for display.
// Cast the request's results as a `VNClassificationObservation` array.
guard let observations = request.results as? [VNClassificationObservation] else {
    // Image classifiers, like MobileNet, only produce classification observations.
    // However, other Core ML model types can produce other observations.
    // For example, a style transfer model produces `VNPixelBufferObservation` instances.
    print("VNRequest produced the wrong result type: \(type(of: request.results)).")
    return
}

// Create a prediction array from the observations.
predictions = observations.map { observation in
    // Convert each observation into an `ImagePredictor.Prediction` instance.
    Prediction(classification: observation.identifier,
               confidencePercentage: observation.confidencePercentageString)
}
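The snippet above refers to a few supporting pieces that live in Apple’s sample project but aren’t shown here (in the sample they sit inside the ImagePredictor class). A minimal sketch of what they look like, with names assumed from the sample:

import Vision

// The value type handed back to the view layer.
struct Prediction {
    let classification: String
    let confidencePercentage: String
}

// Completion handler type used by `makePredictions(for:completionHandler:)`.
typealias ImagePredictionHandler = (_ predictions: [Prediction]?) -> Void

// Maps each in-flight Vision request to the handler that should receive its result.
private var predictionHandlers = [VNRequest: ImagePredictionHandler]()

// Assumed helper used above to turn a confidence (0.0–1.0) into a display string.
extension VNClassificationObservation {
    var confidencePercentageString: String {
        String(format: "%.1f", confidence * 100)
    }
}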
That’s it! Simple, isn’t it? Check out the complete source code here and do some experimentation. Try changing the model from MobileNetV2 to another image classification model like InceptionV3 or EfficientNet and compare the results and performance.
Now, let’s look at some of the benefits and limitations of using CoreML.
Benefits of CoreML
- Offline Support: As the model lives on the person’s device, no network connection is needed to perform a prediction.
- Data Privacy: As no data leaves the app, the person’s data stays completely private.
- Optimised Performance: CoreML is optimised for iOS devices, leveraging the CPU, GPU, and Neural Engine while minimising memory footprint and power consumption.
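If you want more control over which hardware CoreML uses, the MLModelConfiguration we created in step #3.1 exposes a computeUnits option. A minimal sketch:

import CoreML

// Let CoreML use any available hardware (CPU, GPU, Neural Engine)...
let config = MLModelConfiguration()
config.computeUnits = .all

// ...or restrict it, e.g. CPU-only for predictable background work.
let cpuOnlyConfig = MLModelConfiguration()
cpuOnlyConfig.computeUnits = .cpuOnly

let classifier = try? MobileNetV2(configuration: config)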
Disadvantages of CoreML
- Increased App Size: Some models can be very large, which increases the download size of the app and can hurt performance on lower-end iOS devices.
- Outdated models may require an app update: CoreML has limited support for on-device model training and can update only certain types of models (see the sketch after this list), so updating the model may require an app update.
- Limited third-party model support: As of now, CoreML only supports conversion from a limited number of model formats like TensorFlow, Caffe, Keras, etc.
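For context on the “certain types of models” point above: CoreML can update some models (for example updatable neural networks or k-nearest-neighbour classifiers) on the device using MLUpdateTask. A rough, hedged sketch, assuming you already have a compiled updatable model and a training batch:

import CoreML

// Sketch of on-device model updating with MLUpdateTask.
// `updatableModelURL` must point to a compiled, *updatable* model, and
// `trainingBatch` must provide MLFeatureProvider instances matching its training inputs.
func updateModel(at updatableModelURL: URL, with trainingBatch: MLBatchProvider) throws {
    let task = try MLUpdateTask(forModelAt: updatableModelURL,
                                trainingData: trainingBatch,
                                configuration: nil,
                                completionHandler: { context in
        // Persist the freshly updated model so future predictions use it.
        try? context.model.write(to: updatableModelURL)
    })
    task.resume()
}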
What are the alternatives?
So, what if we can’t use CoreML because of its limitations or our special use case?
We can choose one of the options below based on our needs.
- [On-Device] TensorFlow Lite: a lightweight version of TensorFlow designed for mobile and embedded devices.
- [On-Device] ONNX Runtime: (Open Neural Network Exchange) Runtime is a cross-platform, high-performance scoring engine for ONNX models.
- [On-Server] Firebase ML Kit: provides ready-to-use APIs for common machine learning tasks and supports custom model deployment.
How to convert a third-party model to a CoreML model (.mlmodel)
The steps below will quickly onboard your third-party model with CoreML.
Step#1: Install the CoreML Python tools provided by Apple here.
# https://apple.github.io/coremltools/docs-guides/source/installing-coremltools.html
pip install coremltools
Step#2: Download and convert third-party models. Please refer to the links provided in the comments in case you get any errors.
Example#1: Converting a TensorFlow model to a CoreML model
# https://apple.github.io/coremltools/docs-guides/source/convert-tensorflow.html
import coremltools as ct

# `tensor_flow_model` is your already-loaded TensorFlow/Keras model
# (or a path to a SavedModel / .h5 file).
coreml_model = ct.convert(tensor_flow_model)
coreml_model.save("core_ml_model.mlmodel")
Example#2: Converting a Caffe model to a CoreML model
# https://www.wwt.com/article/convert-a-caffe-model-to-core-ml-format
import coremltools as ct
# You may need the .prototxt and class_labels.py depending upon the model
# you are converting from. Note: the Caffe converter is only available in
# older coremltools releases.
coreml_model = ct.converters.caffe.convert('<path_to_.caffemodel>', '<path_to_.prototxt>')
coreml_model.save("core_ml_model.mlmodel")
Where to go from here?
As an extra topping on this article, you can try to create and train models from scratch using the CreateML tool provided by Apple.
Pre-requisites:
1. macOS Mojave & Xcode 10 (or later), which I think most of us already have nowadays.
2. Data Set: to train and evaluate the image classification model. For an image data set you can use ImageNet.
3. Use the CreateML tool provided by Apple to create and train your model (see the sketch below).
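To give you a head start, here is a minimal, hedged CreateML sketch (macOS only, e.g. in a Swift playground); the folder paths are placeholders and the expected layout is one sub-folder per class label:

import CreateML
import Foundation

// Train an image classifier from a folder of labelled images, where each
// sub-folder name is the class label (e.g. Training/cat, Training/dog).
let trainingDir = URL(fileURLWithPath: "/path/to/Training")
let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingDir))

// Check the training metrics and export an .mlmodel you can drop into Xcode.
print(classifier.trainingMetrics)
try classifier.write(to: URL(fileURLWithPath: "/path/to/MyClassifier.mlmodel"))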
I hope this article has given you a good taste of integrating Machine Learning models into iOS apps and helps you kick-start your journey in creating innovative ML-based iOS apps.
Give me a 👏🏼 if you like this article.
Attaching the complete source code for your reference.
Thank You! 🙏
See you in the next learning!