Activity Classification with Create ML, CoreML3, and Skafos: Part 1

Tyler Hutcherson
7 min read · Sep 27, 2019


Data Preparation and Model Training

This tutorial leverages Create ML, now available as part of Xcode on macOS 10.15, Catalina.

Your iPhone is more powerful than you probably realize. How is that even possible?

Fig 1. Apple’s iPhone.

When Apple announced improvements to the Core Motion Framework at the 2016 WWDC, it sparked a wave of data-driven applications centered around health, fitness, and other motion-related tasks. Core Motion reports accelerometer, gyroscope, and other sensor readings from the on-device hardware buried in iPhones, iPads, and Apple Watches.

With a means to obtain rich motion data on-device, Apple’s release of Create ML, including the anticipated activity classification toolkit, brings new capabilities to developers. For example, building an app that tracks your distinct exercises during a workout or counts the number of frisbee throws you make in a game is now easier than ever.

This is part 1 of a two-part series on activity classification at the edge. In this post, we will prepare sensor data obtained from Apple Watches and train an activity classifier model in Create ML.

Gathering Training Data

Building an activity classifier model requires labeled sensor data. On a mobile device, two commonly used sensors are the accelerometer, which tracks (x, y, z) acceleration, and the gyroscope, which tracks rotation around the (x, y, z) axes. Below is an example of this kind of data, captured while walking for 3 seconds:

Fig 2. Accelerometer and gyroscope readings while walking (source: https://github.com/apple/turicreate/raw/master/userguide/activity_classifier/images/walking.png).
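To make this concrete, here is a small sketch (with made-up values, and assuming the 10 Hz sample rate used for this dataset) of the shape such a 3-second capture takes once loaded into a table:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
sample_rate = 10           # Hz (assumed; matches the collection rate described below)
seconds = 3
n = sample_rate * seconds  # 30 rows for 3 seconds of data

# Synthetic readings: acceleration in G, rotation rate in rad/s
df = pd.DataFrame({
    "acceleration_x": rng.normal(0, 0.1, n),
    "acceleration_y": rng.normal(0, 0.1, n),
    "acceleration_z": rng.normal(0, 0.1, n),
    "rotation_x": rng.normal(0, 0.5, n),
    "rotation_y": rng.normal(0, 0.5, n),
    "rotation_z": rng.normal(0, 0.5, n),
})
print(df.shape)  # (30, 6)
```

Each row is a single sensor observation; each column is one of the six features we will train on later.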

Depending on the use case, gathering this type of data is manageable thanks to apps like SensorLog, which let you collect, label, and export sensor readings. In a previous blog post, we walked through the process of collecting data through the app with a few volunteers wearing Apple Watches. The data contains three basic activities: sitting, standing, and moving.

For this example, our goal is to use the previously collected data to build a model in Create ML that can recognize and differentiate these activities/motions from one another.

Data Download

Download the raw CSV dataset here.

If you want to skip over the data cleaning section, download a zip archive of the fully-prepared dataset here. You will be able to use the data instantly in Create ML.

If you want to see how to prepare the raw data for Create ML, keep reading!

Data Cleaning and Prep for Create ML

First, we need to split the large CSV file into training and testing sets. We also need to organize the data into subfolders, where each folder name corresponds to one of the activities mentioned above. Inside each activity-specific folder, we will write out chunks of data broken up by session_id. A single session is one continuous data-collection period by a single volunteer.

Open a Python 3.6+ console in the same directory as the watch_activity_data.csv file and run the following code:

import os

import turicreate as tc

# Map raw SensorLog column names to simpler feature names
cols = {
    "motionRotationRateX(rad/s)": "rotation_x",
    "motionRotationRateY(rad/s)": "rotation_y",
    "motionRotationRateZ(rad/s)": "rotation_z",
    "motionUserAccelerationX(G)": "acceleration_x",
    "motionUserAccelerationY(G)": "acceleration_y",
    "motionUserAccelerationZ(G)": "acceleration_z",
    "sessionId": "session_id",
    "activity": "activity"
}
csv_cols = ["rotation_x", "rotation_y", "rotation_z",
            "acceleration_x", "acceleration_y", "acceleration_z"]

# Load the CSV data, keep only the columns we need, and rename them
sf = tc.SFrame.read_csv("watch_activity_data.csv")[list(cols.keys())].rename(cols)

# Remove rows with missing activity labels
sf = sf[sf['activity'] != '']
acts = sf['activity'].unique()

# Split data into training and testing sets by session
train, test = tc.activity_classifier.util.random_split_by_session(
    sf, session_id='session_id', fraction=0.8)

# Write out training data
path = "train/"
os.mkdir(path)
for a in acts:
    # Create an activity-specific folder if it doesn't exist yet
    cls_path = path + a + "/"
    if not os.path.exists(cls_path):
        os.mkdir(cls_path)
    # Split data by activity & session and write each session to its own file
    sf_act = train[train['activity'] == a]
    for s in sf_act['session_id'].unique():
        sf_act[sf_act['session_id'] == s][csv_cols].save(cls_path + str(s) + ".csv")

# Write out testing data
path = "test/"
os.mkdir(path)
for a in acts:
    # Create an activity-specific folder if it doesn't exist yet
    cls_path = path + a + "/"
    if not os.path.exists(cls_path):
        os.mkdir(cls_path)
    # Split data by activity & session and write each session to its own file
    sf_act = test[test['activity'] == a]
    for s in sf_act['session_id'].unique():
        sf_act[sf_act['session_id'] == s][csv_cols].save(cls_path + str(s) + ".csv")

Afterward, you should have a file structure that looks like this:

Fig 3. Data file structure for Create ML.

The file train/sitting/9.csv is sensor data in the training set, recorded while sitting, from one volunteer during a single session. You would also have this structure if you skipped ahead and downloaded/unzipped the pre-processed dataset. Now we are ready for model training in Create ML.

Training in Create ML

Fig 4. Create ML.

Open the Create ML application and select the Activity Classifier from the templates menu. Fill in a Project Name, Author, License, and Description as you see fit.

Fig 5. Create ML Activity Classifier.

After hitting “Next”, you will be shown an interface with several panels and options. First, select the data inputs: choose folders for both the training data and the testing data. This will launch a Finder window where you can select the folders created earlier: train/ and test/. Create ML auto-detects all 6 feature columns: acceleration_x, acceleration_y, acceleration_z, rotation_x, rotation_y, rotation_z.

Fig 6. Activity Classifier data inputs.

Before we can start training, we have to define a few parameters:

Fig 7. Activity Classifier parameters.

Maximum Iterations and Batch Size (30 and 64, respectively) were determined after some trial and error. If you are unsure, you can start with these values and adjust as needed. Sample Rate refers to the number of observations recorded per second by the on-device hardware. The training data was collected at 10 Hz, so we set the sample rate to 10. Prediction Window Size is the number of samples the model uses to make a single classification. With a value of 30 as shown above, the model uses 30 samples (3 seconds of data) to classify an activity.
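The relationship between these two parameters is worth internalizing; a quick sketch of the arithmetic:

```python
sample_rate = 10        # observations per second (Hz), matching the collection rate
prediction_window = 30  # samples consumed per classification

# Each prediction covers prediction_window / sample_rate seconds of sensor data
seconds_per_prediction = prediction_window / sample_rate
print(seconds_per_prediction)  # 3.0
```

If your own data was collected at a different rate, adjust both values so the window still covers a duration long enough to capture the activities you care about.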

Now it’s time to train the Activity Classifier model! Hit the Train “play button” in the top left corner of the app. Be prepared to wait several minutes while this model trains. You should see progress displayed under the “Activity” panel like this:

Fig 8. Activity Classifier training accuracy progress.

What’s Happening?

Create ML abstracts all training details away from the user, so it’s tough to know exactly which algorithm is used under the hood. Because Turi Create and Create ML are both flagship machine learning tools from Apple, and the model parameters exposed by Create ML match those supported by the Turi Create library, Turi Create is a good place to look for details. The Turi Create team wrote an excellent user guide on the time-series classification problems handled by these deep learning models. Read the guide to learn more about how activity classifier models are constructed and to get a better sense of what Create ML is doing under the hood.
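While Create ML hides the details, the core preprocessing step in this family of models, slicing each session into fixed-length prediction windows, can be sketched in a few lines. This is a simplified illustration of the idea, not Create ML’s actual implementation:

```python
import numpy as np

def make_windows(session: np.ndarray, window_size: int) -> np.ndarray:
    """Slice a (n_samples, n_features) session into non-overlapping
    (window_size, n_features) windows, dropping the incomplete tail."""
    n_windows = len(session) // window_size
    return session[: n_windows * window_size].reshape(n_windows, window_size, -1)

# A fake 95-sample session with 6 sensor features, as if collected at 10 Hz
session = np.random.rand(95, 6)
windows = make_windows(session, window_size=30)
print(windows.shape)  # (3, 30, 6) -> three 3-second windows; 5 samples dropped
```

The classifier then learns to map each window of raw sensor values to an activity label, which is why Prediction Window Size matters so much: it defines the unit of data the model actually sees.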

Testing the Model

Create ML provides a few ways for us to evaluate the quality of the model we just trained.

As displayed in Fig 8 above, the report of training accuracy over time indicates that the model steadily improved on the training data from start to finish. However, this is only somewhat useful; what we really care about is how the model performs on holdout data.

Fig 9. Activity Classifier testing metrics.

The true test of model performance is on holdout testing data! After training, Create ML automatically computes precision and recall metrics on the testing dataset we selected (shown in Fig 9).

In Figure 9, we observe that the model is pretty good at differentiating movement from the stationary activities, but struggles to tell sitting and standing apart. This seems reasonable: both activities are relatively still, so the model wasn’t able to pick up enough information to distinguish them. Further tuning of model parameters like Prediction Window Size, or higher volumes of training data, could alleviate this gap in performance.
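As a refresher, per-class precision and recall fall straight out of a confusion matrix. Here is a minimal sketch using invented counts (not the actual numbers behind Fig 9) that mimic the pattern described above:

```python
import numpy as np

# Rows = true class, columns = predicted class (invented counts)
labels = ["moving", "sitting", "standing"]
cm = np.array([
    [90,  5,  5],   # true moving: rarely confused with the stationary classes
    [ 4, 70, 26],   # true sitting: often mistaken for standing
    [ 6, 24, 70],   # true standing: often mistaken for sitting
])

for i, label in enumerate(labels):
    tp = cm[i, i]
    precision = tp / cm[:, i].sum()  # share of predictions for this class that were right
    recall = tp / cm[i, :].sum()     # share of true examples of this class that were found
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

With counts like these, "moving" scores around 0.90 on both metrics while the two stationary classes hover around 0.70, mirroring the sitting/standing confusion discussed above.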

Evaluating on External Data

Create ML also provides an interface to live-test the model. Give it a shot by clicking on the CoreML artifact in the upper right corner. It will take you to a screen that looks like this, where you can drag in some sensor data with the same features used during training.

Fig 10. Create ML model testing interface.

In Figure 10 above, the model classified the first 3 seconds of sensor observations (highlighted portion) as “moving” with a confidence of 57%.

The Output

With training and testing complete, Create ML is ready to export the model to Core ML format, enabling it to run on any iOS device.

Fig 11. Core ML artifact.

In the next post, we will integrate this artifact in an iOS app capable of determining your current activity from Core Motion data.
