How to create a machine learning model from scratch with CreateML

Published in

Apple Developer Academy | Federico II

5 min read · Apr 3, 2020

What is Machine Learning?

Machine Learning is very exciting. It allows you to do amazing things like convert speech to text or text to speech, recognize language or grammatical structure, detect faces in photos and track moving objects in videos. These are only a few of the many things that are possible today, thanks to the “magic” of machine learning.

Many developers who have never used machine learning think that, without a strong background in mathematics, it is too complicated to understand and implement in apps.
They may not know how powerful and easy to use CreateML, CoreML, CoreMotion and Vision can be.
With this set of tools and APIs you can easily add the magic of machine learning into your apps!

CreateML

This tutorial will show you how to use the CreateML app to build an image classifier model that can be used to differentiate between cars, motorbikes and bikes.

Before WWDC 2018, iOS developers could only use CoreML with pre-trained models. If they wanted to create their own model, they had to rely on third-party services and an understanding of other programming languages like Python.
When Apple introduced CreateML in 2018, it made training your own models possible inside the Apple ecosystem.
In CreateML you can select different templates: Image Classifier, Object Detector, Sound Classifier, Activity Classifier and more.
We will use the first one, which produces a model that assigns a label to an image.

Creating a model

1. Prepare your image dataset.

To create a model, you need many images for each class and a simple folder structure. A common rule of thumb is to use 80% of the images as training data and 20% as testing data. In our case, we used this dataset with some images of cars, motorbikes and bikes.

You must structure your training folder so that it contains one folder for each class you want to recognize. Each folder must be named after its class. In our case we have three folders, one for each of our classes. This structure is needed because the CreateML image classifier is trained with supervised learning, and the folder names serve as the labels.
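To make the link between folders and labels concrete, here is a minimal Python sketch that walks a training folder and pairs every image with the name of the folder it sits in. It only illustrates the convention and is not how CreateML reads the data internally. The function name `labeled_examples` and the list of file extensions are assumptions made for this example.

```python
from pathlib import Path

# Assumption: the dataset uses these extensions; adjust to your own files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def labeled_examples(training_dir):
    """Return (image_path, label) pairs; the label is the parent folder's name."""
    examples = []
    for class_dir in sorted(Path(training_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files such as .DS_Store
        for image in sorted(class_dir.iterdir()):
            if image.suffix.lower() in IMAGE_EXTENSIONS:
                examples.append((image, class_dir.name))
    return examples
```

Calling `labeled_examples("train")` on a folder containing `cars`, `motorbikes` and `bikes` subfolders would return one pair per image, which is a quick way to check that every class folder is named correctly and is not empty.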

Supervised learning is a machine learning approach that learns a function mapping an input to an output from example input-output pairs. It infers that function from labeled training data, which consists of a set of training examples.

If you want to create your own dataset, the key is quantity and variety: the more varied the images you provide, the more accurate your model is likely to be. You should follow the same training-test proportion (80/20).
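If your images are in a single folder per class, the 80/20 split can be automated. The following is a minimal Python sketch, not part of CreateML. The function name `split_dataset`, the `train` and `test` output folder names, and the fixed random seed are all assumptions chosen for this example.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, output_dir, train_fraction=0.8, seed=42):
    """Copy each class folder's files into output_dir/train and output_dir/test."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(Path(source_dir).iterdir()):
        if not class_dir.is_dir():
            continue
        # Ignore hidden files such as .DS_Store
        images = sorted(
            p for p in class_dir.iterdir()
            if p.is_file() and not p.name.startswith(".")
        )
        rng.shuffle(images)
        cut = round(len(images) * train_fraction)
        for subset, files in (("train", images[:cut]), ("test", images[cut:])):
            target = Path(output_dir) / subset / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for image in files:
                shutil.copy2(image, target / image.name)
        counts[class_dir.name] = (cut, len(images) - cut)
    return counts  # {class_name: (train_count, test_count)}
```

The split is done per class so that each class keeps roughly the same 80/20 proportion, and the files are copied rather than moved so the original dataset stays untouched.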

2. Training and validating.

Open Spotlight, search for Create ML and open it. CreateML is a tool that ships with Xcode.

Now create a New Document with a title.

Select the Image Classifier template and click Next.

You only need to drag and drop your training folder into the Training section of Data input.

In Validation Data choose Automatic. When you select this option, CreateML will set aside a portion of the training examples (5%) to evaluate how well the model performs during training, so that the model can be adjusted based on the result. For better validation you can also provide a folder of images that differ from the training ones.

You will end up with 3 classes, one for each object: cars, motorbikes and bikes. For the Maximum Iterations parameter you should use the default value (25). The more iterations you choose, the longer it takes to create the model and, up to a point, the more accurate the model will be. If you do 25 iterations, the model will have seen each training image 25 times.

The Augmentations parameters are very useful for your model. With them, the model also learns from altered versions of your images, e.g. a rotated image. We chose only crop, so that the images are seen at different sizes.

Click Train and wait (with a cup of coffee ☕️).

When training is finished, you can see a graph with your accuracy during all of the iterations.

At the bottom you will see the precision and recall for each class.

100% precision for the bicycle class means that all the images the model predicted to be “bicycle” really were bicycles. 100% recall for the bicycle class means that the model correctly labeled every bicycle image in the set.
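To see how the two metrics are computed, here is a small Python sketch using made-up labels (they are not results from our model). The function name `precision_recall` is an assumption for this example. CreateML calculates these values for you.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Compute precision and recall for one class from paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positives = sum(1 for t, p in pairs if t == target and p == target)
    predicted_as_target = sum(1 for _, p in pairs if p == target)
    actually_target = sum(1 for t, _ in pairs if t == target)
    # Guard against division by zero when a class never appears
    precision = true_positives / predicted_as_target if predicted_as_target else 0.0
    recall = true_positives / actually_target if actually_target else 0.0
    return precision, recall


# Made-up labels, for illustration only
truth = ["bicycle", "bicycle", "car", "motorbike"]
predicted = ["bicycle", "car", "car", "motorbike"]

precision, recall = precision_recall(truth, predicted, "bicycle")
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=1.00 recall=0.50
```

In this toy example every image predicted as “bicycle” is a bicycle (100% precision), but one of the two bicycles was labeled as a car, so recall is only 50%.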

3. Test your model inside CreateML

Now you have your model. If you click on it, a new folder appears and you can drag images into it to test the model. Here we drag in a motorbike image, and the model classifies it correctly with 100% confidence.