Intro to Machine Learning on iOS with CreateML & TuriCreate — Part 2

In the first part of this series, we talked about CreateML & built a demo app using it. In this second and final part, we will talk about another ML framework by Apple, called TuriCreate, and build another demo app using it.

TuriCreate

Apple acquired Turi, a company specializing in Machine Learning and AI, in August 2016, and in December 2017 it open-sourced TuriCreate, a framework that allows developers to train custom ML models using Python. These models can then be used in iOS, macOS, watchOS and tvOS apps.

The key differences between CreateML and TuriCreate are:

  • CreateML only works on macOS 10.14+, whereas TuriCreate is cross-platform and works on macOS 10.12+, Linux & Windows 10.
  • CreateML requires you to write Swift code in an Xcode playground, whereas TuriCreate works using Python and can be written in any environment/IDE.
  • CreateML only allows you to create custom models to solve a limited set of tasks (such as image classification, text classification and regression), whereas TuriCreate allows you to create custom models to solve a variety of tasks not supported by CreateML, such as Style Transfer (similar to how Prisma works), Object Detection and Activity Recognition.
  • CreateML does not allow you to choose the model used for transfer learning (it uses VisionFeaturePrint_Screen), whereas TuriCreate allows you to switch between multiple models (VisionFeaturePrint_Screen, Resnet-50 & MobileNet). Unfortunately, neither framework allows you to supply your own model for transfer learning.
  • CreateML allows you to tune an extremely limited set of parameters for a particular ML task, whereas TuriCreate gives you slightly more flexibility in terms of tunable parameters.

To learn more about TuriCreate, watch the WWDC 2018 talk titled “A Guide to Turi Create”.

Demo

We will be training an image recognition classifier to re-create Apple’s Face ID.

Note: It won’t be as secure as Apple’s implementation because of the difference in the type of neural network architecture that is used & because developers cannot access the infrared sensors for 3D depth data.

Before we begin, you need to have Python 2.7, 3.5 or 3.6 installed on your operating system.

In order to start using TuriCreate, the following tools need to be installed (the installation procedure may differ depending on your operating system):

  1. virtualenv
  2. TuriCreate

virtualenv

To install virtualenv, run the following command in Terminal:

pip install virtualenv

Once it’s installed, create and activate a virtual environment by running the following commands in order:

  • cd ~
  • virtualenv turienv
  • source ~/turienv/bin/activate

TuriCreate

Run the following command within your virtual environment to install TuriCreate:

pip install turicreate==5.0b3

That’s it!

For training our classifier, we will be using the same approach as we took in Part 1 of this series. We need some samples of our face and some samples of things that aren’t our face (like faces of other people or pets, physical objects, etc.).

To collect samples of our face, we can just capture lots of photos in different orientations and angles using the front-facing camera of the iPhone (or whichever phone you have).

To collect samples of other things (like photos of other people’s faces or pets, photos of food, etc.), we can get them from ImageNet like we did in Part 1 and then download the images using curl.

Once the photos have been captured & downloaded, they need to be split into two categories/labels: suyash & notsuyash.

We first need to create a folder named Training Data in our virtual environment, and inside the Training Data folder, we need to create one folder per category/label: one folder named suyash & one folder named notsuyash.

  • The folder suyash would contain samples of our face photos
  • The folder notsuyash would contain samples of photos of other things (like photos of other people’s faces, etc).
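
The resulting folder layout looks like this:

Training Data/
    suyash/        (photos of our face)
    notsuyash/     (photos of other people’s faces, pets, objects, etc.)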

As discussed in Part 1 of this series, data augmentation is helpful because the photo taken by the front-facing camera during scanning may differ in lighting, exposure, orientation, cropping, etc., and we want to account for all of these scenarios.

In order to augment our data, we will be using Augmentor, a free Python image augmentation library available on GitHub.

We can use Augmentor to apply random augmentations to our dataset, like rotate, zoom, shear and crop. In order to do that, we can create a Python script called augment-face-dataset.py in our virtual environment and type the following code:
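
A minimal version of the script, built around Augmentor’s Pipeline API, might look like this (the probabilities and parameter values below are illustrative assumptions; by default Augmentor writes the generated images to an “output” subfolder inside the source folder, so move them back into the class folder once the script has finished):

# augment-face-dataset.py
# A sketch of the augmentation script using the Augmentor library.
import Augmentor

# Point the pipeline at the folder containing our face photos.
pipeline = Augmentor.Pipeline("Training Data/suyash")

# Random rotations, zooms, skews, shears and crops.
pipeline.rotate(probability=0.7, max_left_rotation=15, max_right_rotation=15)
pipeline.zoom(probability=0.5, min_factor=1.1, max_factor=1.5)
pipeline.skew(probability=0.5, magnitude=0.3)
pipeline.shear(probability=0.5, max_shear_left=10, max_shear_right=10)
pipeline.crop_random(probability=0.5, percentage_area=0.9)

# Generate 2,500 augmented samples.
pipeline.sample(2500)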

To start the augmentation process, run the following command:

python augment-face-dataset.py

It will augment our target class called suyash with 2500 additional samples containing random augmentations (rotate, zoom, skew, shear and random crop).

We can repeat the process for other classes in our Training Data folder, by changing the path.

To begin training, we need to create a Python file in our virtual environment, called create-face-recognition-model.py and type in the following code:
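
A minimal sketch of this script, using TuriCreate’s image classifier API, could look like the following (the label-from-path logic assumes the Training Data folder layout described earlier):

# create-face-recognition-model.py
# A sketch of the training script using TuriCreate's image classifier.
import turicreate as tc

# 1. Load the images from the Training Data folder.
data = tc.image_analysis.load_images("Training Data", with_path=True)

# 2. Create the target labels/classes from the folder names.
data["label"] = data["path"].apply(lambda path: "suyash" if "/suyash/" in path else "notsuyash")

# 3. Train the model using resnet-50 for transfer learning and 100 iterations.
model = tc.image_classifier.create(data, target="label", model="resnet-50", max_iterations=100)

# 4. Export the trained model to a .mlmodel (CoreML) file.
model.export_coreml("FaceRecognition.mlmodel")

In short, the script will: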

  1. Load the images from the Training Data folder.
  2. Create target labels/classes from the folder names i.e suyash & notsuyash.
  3. Train the model using the resnet-50 model (for transfer learning) and 100 iterations.
  4. Export the trained model to a .mlmodel (CoreML) file.

To start training, run the following command in your virtual environment:

python create-face-recognition-model.py

Starting the training process

Your model will now begin training. Once finished, the created model will be saved with the file name FaceRecognition.mlmodel.

Training the model

I have created an iOS app called “TabID” that uses ARKit to detect blinking, to ensure our app cannot be fooled by a 2D photograph. Once a blink has been detected, it captures a screenshot of the live ARKit view (hidden from the user) and passes it to our model for classification.
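
As a rough illustration, blink detection with ARKit’s face tracking might look something like this (the sceneView property, the 0.9 threshold and the recognizeFace(in:) call are assumptions; the classification method itself is shown later, and de-bouncing repeated blinks is omitted):

// Sketch of blink detection inside the ARSCNViewDelegate callback.
// sceneView is assumed to be the (hidden) ARSCNView running the face-tracking session.
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let faceAnchor = anchor as? ARFaceAnchor else { return }

    let blinkLeft = faceAnchor.blendShapes[.eyeBlinkLeft]?.floatValue ?? 0
    let blinkRight = faceAnchor.blendShapes[.eyeBlinkRight]?.floatValue ?? 0

    // Both eyes are (almost) fully closed, so treat it as a blink:
    // grab a screenshot of the ARKit view and classify it.
    if blinkLeft > 0.9 && blinkRight > 0.9 {
        let screenshot = sceneView.snapshot()
        recognizeFace(in: screenshot)
    }
}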

The app also uses the LocalAuthenticationPrivateUI private framework to extract the LAUIPearlGlyphView view, which is what FaceID uses to display those cool animations when authenticating.

To begin, we need to first import the model into the Xcode project, which can be done by simply dragging & dropping the FaceRecognition.mlmodel file into the project.

Xcode automatically generates a Swift class for the model (named after it). In our case, the class is called FaceRecognition.

We will be using the Vision framework along with CoreML rather than using CoreML directly, as it lets us pass in a CGImage and takes care of resizing it and converting it to the CVPixelBuffer format that the model expects. Vision then hands the prepared image to CoreML, which classifies it using our model.

In order to perform face recognition, we need some methods. First, we need a method to create a request for the Vision framework:
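
A sketch of what this could look like, assuming a hypothetical RequestType enum and the FaceRecognition class Xcode generated from our model:

// Inside our view controller (which imports Vision and CoreML).

// The kinds of requests our app knows how to build.
enum RequestType {
    case classification
}

func createClassificationRequest() -> VNCoreMLRequest? {
    // Wrap our CoreML model so that Vision can drive it.
    guard let model = try? VNCoreMLModel(for: FaceRecognition().model) else {
        return nil
    }

    // Vision will call handleClassification(request:error:) with the results.
    let request = VNCoreMLRequest(model: model, completionHandler: handleClassification)
    request.imageCropAndScaleOption = .centerCrop
    return request
}

func createRequest(for type: RequestType) -> VNRequest? {
    switch type {
    case .classification:
        return createClassificationRequest()
    }
}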

Here, we define two methods:

  1. createClassificationRequest, which creates a VNCoreMLRequest that we can pass to the Vision framework for classification using CoreML. Once finished, it triggers our handler method (called handleClassification) with the results.
  2. createRequest, which is a method we can extend to support various types of requests. For example, in the future we may use Image Similarity instead of Image Classification for facial recognition, but would like to retain the ability to switch between the two depending on some configuration.

Now, we can use the request to perform the actual classification:
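
A sketch of this step (the method name recognizeFace(in:) and the use of a background queue are assumptions):

// Called with the screenshot captured after a blink was detected.
func recognizeFace(in image: UIImage) {
    guard let cgImage = image.cgImage,
          let request = createRequest(for: .classification) else {
        return
    }

    // Vision takes care of scaling/converting the image for the model.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])

    DispatchQueue.global(qos: .userInitiated).async {
        do {
            try handler.perform([request])
        } catch {
            print("Failed to perform classification: \(error)")
        }
    }
}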

Here, we create a request based on the type (i.e. the app only supports .classification for now) & pass it to a VNImageRequestHandler along with the image, which performs the classification using CoreML and triggers our handler method, handleClassification (as described in the createClassificationRequest method).

Once the request has been performed, the handler method (called handleClassification) will be called with the original request, now containing the results (if any) and an error value (if any).

We can now retrieve the classification label for the image and check whether it is suyash. If it is, we perform the successful authentication animation; otherwise we transition to a failed state and let the user re-try the scan (currently as many times as they like, though we could limit this to 3 attempts and then ask for a password, similar to how FaceID works).
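
A sketch of the handler (the 0.8 confidence threshold and the showSuccess()/showFailure() animation helpers are assumptions):

func handleClassification(request: VNRequest, error: Error?) {
    // Grab the highest-confidence classification, if any.
    guard error == nil,
          let results = request.results as? [VNClassificationObservation],
          let topResult = results.first else {
        DispatchQueue.main.async { self.showFailure() }
        return
    }

    DispatchQueue.main.async {
        if topResult.identifier == "suyash" && topResult.confidence > 0.8 {
            // Play the successful authentication animation.
            self.showSuccess()
        } else {
            // Transition to the failed state and let the user re-try.
            self.showFailure()
        }
    }
}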

Result

Unsuccessful Authentication (left) and Successful Authentication (right)

TuriCreate is a fantastic tool that allows developers across all major platforms (not just macOS) to create their own custom ML models. One major difference between TuriCreate and CreateML is that it allows you to create custom models for a variety of ML tasks which aren’t supported by CreateML.

However, it has some of the same limitations as CreateML, such as:

  • You cannot train a model from scratch for image classification. TuriCreate uses “transfer learning” to retrain an existing image classification model. However, you do get some flexibility in terms of which pre-trained model the learning is transferred from: Apple’s default model (VisionFeaturePrint_Screen), Resnet-50 or MobileNet.
  • You cannot design your own CNN (Convolutional Neural Network) architecture using it or modify any of the hyper-parameters of the CNN architecture used by the base models, such as learning rate, batch size, activation function and so on.
  • You have limited flexibility in terms of designing the architecture of any kind of model or fine-tuning its parameters.

If you’re looking to incorporate ML features into your app with the fewest steps possible, then CreateML and TuriCreate are for you. If you want to do something more sophisticated or fancy, then you would need to use a more advanced library, such as Keras or TensorFlow.
