In a previous article on machine learning (ML) for mobile, I introduced the topic, explained why you should care, and offered a few tips for getting started. If you missed it, I recommend reading that one first, as it sets the stage for what we’ll address today: Core ML.
When we talk about machine learning on a mobile device, we mean inference: using a pre-trained machine learning model to make predictions from user data.
iOS developers have three ways to access a trained model for inference:
1. Use Core ML to access a local on-device pre-trained model. This is today’s topic!
2. Host a machine learning model in the cloud and send data from the device to the hosted endpoint, which returns predictions.
3. Call a third-party, API-driven managed cloud service that hosts and manages a pre-defined trained model. User data is passed from the device through an API call and the service returns the predicted values.
What is Core ML?
Core ML is Apple’s machine learning framework, used across its platforms (macOS, iOS, watchOS, and tvOS). It makes it easy to integrate pre-trained machine learning models into an app and run fast inference on the edge, that is, on the device itself, which enables real-time predictions on live images or video.
[Advantages of ML on the edge]
Low Latency and Near Real-Time Results: You don’t need to make a network API call by sending the data and then waiting for a response. This can be critical for applications such as video processing of successive frames from the on-device camera.
Availability (Offline), Privacy, and Compelling Cost: The application runs without a network connection, makes no API calls, and the data never leaves the device. Imagine using your mobile device to identify historic tiles while in the subway, catalog private vacation photos while in airplane mode, or detect poisonous plants while in the wilderness.
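To make the latency point concrete, here is a minimal Python sketch of the time budget available per frame when processing live video. The 30 frames-per-second rate, the 100 ms network round trip, and the 20 ms on-device inference time are illustrative assumptions, not measurements.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame before the next one arrives."""
    return 1000.0 / fps


def keeps_up(inference_ms: float, fps: float) -> bool:
    """True if a single prediction fits within the per-frame budget."""
    return inference_ms <= frame_budget_ms(fps)


# Assumed, illustrative numbers (not measurements).
FPS = 30
NETWORK_ROUND_TRIP_MS = 100.0  # hypothetical cloud API call
ON_DEVICE_MS = 20.0            # hypothetical local inference

print(f"budget per frame: {frame_budget_ms(FPS):.1f} ms")
print("cloud call keeps up:", keeps_up(NETWORK_ROUND_TRIP_MS, FPS))
print("on-device keeps up:", keeps_up(ON_DEVICE_MS, FPS))
```

Under these assumptions, the network call alone uses up roughly three frames' worth of time, which is why successive camera frames are usually processed on the device.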
[Disadvantages of ML on the edge]
Application Size: Adding the model to the app increases its size, and some accurate models can be quite large.
System Utilization: Prediction and inference on the mobile device involve lots of computation, which increases battery drain. Older devices may struggle to provide real-time predictions.
Model Training: In most cases, the model on the device must be continually retrained off the device with new user data. Once the model is retrained, the app needs to be updated with the new model, and depending on the size of the model, this could strain network transfer for the user. Combined with the application size challenge listed above, this can become a user experience problem.
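As a rough illustration of the application size concern, the stored weights of a model take approximately the number of parameters times the bytes per weight. The sketch below is a back-of-the-envelope estimate in Python. The 25 million parameter count is a hypothetical figure, and the calculation ignores file metadata and any compression.

```python
def model_size_mb(num_parameters: int, bits_per_weight: int) -> float:
    """Approximate size of the stored weights in megabytes (1 MB = 10**6 bytes)."""
    return num_parameters * bits_per_weight / 8 / 1_000_000


PARAMS = 25_000_000  # hypothetical model, not a specific published network

for bits in (32, 16, 8):
    print(f"{bits}-bit weights: ~{model_size_mb(PARAMS, bits):.0f} MB")
```

Every retrained model shipped in an app update adds about this much to the download, which is where the size and training concerns meet.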
What Can You Do with Core ML?
Real-Time Image Recognition, Face Detection, Text Prediction, and Speaker Identification represent some of the many innovations made possible with Machine Learning using Core ML.
How Does it Work?
Core ML uses a machine learning model that is pre-trained off the device (for example, in the cloud), then converted to the Core ML model format and added directly to your Xcode project.
What is Model Training?
Machine learning model training involves providing an ML algorithm with training data to learn from. The machine learning model is an artifact of the training, and the type of model depends on what you want to predict. In the image below, labeled images of flowers are sent to the algorithm(s) to learn and classify those flowers. The model is trained on that data so it can later predict the type of flower in an image presented by the app user on the device.
Besides training your own model, you can use the ready-to-use Core ML models Apple provides, each described by what it is capable of predicting: https://developer.apple.com/machine-learning/
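The idea that a model is an artifact produced from labeled data can be sketched in a few lines of Python. This is a toy stand-in, not how image classifiers are actually trained: it uses a simple nearest-centroid classifier over two made-up numeric features (petal length and width) instead of a neural network over pixels, and the function names and data are hypothetical.

```python
from collections import defaultdict
from math import dist


def train(samples):
    """Learn one centroid (mean feature vector) per label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical labeled training data: (petal length cm, petal width cm), label
training_data = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.7, 1.4), "versicolor"),
    ((4.5, 1.5), "versicolor"),
]

model = train(training_data)         # the trained artifact
print(predict(model, (1.3, 0.25)))   # classify a new, unlabeled flower
```

Here `model` is just a dictionary of centroids. In a real workflow, the trained artifact would be a much larger model that is converted to Core ML format and shipped with the app.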
Frameworks/Tools Supporting Core ML
Machine learning on iOS is really a system of tools and frameworks with Core ML as the “core.” Core ML is tightly integrated with, and supports, the Vision (image analysis), Natural Language Processing, and GameplayKit frameworks.
The Vision framework performs face and landmark detection, text detection, barcode recognition, image registration, and feature tracking. When using Core ML to solve computer vision problems like object classification and object detection, use the Vision framework: Apple has made it super simple to use Vision as the pipeline from the device camera to Core ML. Vision does the heavy lifting of converting AVFoundation camera output into the CVPixelBuffer format that Core ML expects. Take advantage of it.
The NLP API offers Natural Language Processing and Speech Recognition. It uses machine learning to deeply understand text using features such as language identification, tokenization, and name detection.
For game developers, GameplayKit is a framework for architecting and organizing your game logic. It incorporates common gameplay behaviors such as random number generation, artificial intelligence, pathfinding, and agent behavior.
Core ML is a very powerful tool and integrates nicely with the other machine learning frameworks from Apple. I predict this is just the beginning for machine learning on the device and we’ll soon see more announcements coming. Next time, we’ll dive deeper into using Core ML to solve a common supervised machine learning problem called object classification.