Leverage Scikit-Learn Models with Core ML


This post shows how to use Apple’s new Core ML platform, announced a few days ago at WWDC 2017, within IBM Data Science Experience (DSX). Core ML lets you integrate powerful pre-trained models into iOS and macOS applications.

Core ML comes with two main benefits: efficiency and privacy. Core ML has been specifically engineered for on-device performance. Because the pre-trained model runs on the device itself, no network connection is required, which also helps protect users' privacy.

But the best thing about Core ML is that you can continue to use your favorite machine learning libraries in Python, and easily convert your pre-trained models to Core ML objects that can be used in your iOS and macOS applications.

The conversion to Core ML objects from libraries such as Keras, sklearn, LibSVM, and others is supported out-of-the-box in Data Science Experience.


You can install coremltools via pip, which can be called from within a notebook in DSX. It's important to note that, at the time of writing, coremltools supports Python 2.7 only.

!pip install -U coremltools

Create a linear model with scikit-learn

First, create some data using NumPy, a library for numerical computing in Python. We’ll create a very simple model because the focus of this short guide is converting a scikit-learn object to a Core ML model.

import numpy as np
x_values = np.linspace(-2.25,2.25,300)
y_values = np.array([np.sin(x) + np.random.randn()*.25 for x in x_values])

Now that we’ve got our data, we’ll perform a linear regression.

from sklearn.linear_model import LinearRegression
lm = LinearRegression().fit(x_values.reshape(-1,1), y_values)
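
As a quick sanity check on what the fit above computes, the slope and intercept of a one-feature least-squares line can be reproduced in closed form with NumPy alone. This is only a sketch: the fixed seed and the names rng, slope, and intercept are illustrative assumptions and are not part of the original notebook.

```python
import numpy as np

# Fixed seed so the sketch is reproducible (an assumption; the post does not seed).
rng = np.random.RandomState(0)

# Same shape of data as above: a noisy sine curve on [-2.25, 2.25].
x = np.linspace(-2.25, 2.25, 300)
y = np.sin(x) + 0.25 * rng.randn(x.size)

# Closed-form ordinary least squares for a single feature:
#   slope     = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))**2)
#   intercept = mean(y) - slope * mean(x)
x_centered = x - x.mean()
slope = np.dot(x_centered, y - y.mean()) / np.dot(x_centered, x_centered)
intercept = y.mean() - slope * x.mean()

print("slope: %.3f, intercept: %.3f" % (slope, intercept))
```

When run on the same data as the scikit-learn fit, these two numbers should agree with lm.coef_[0] and lm.intercept_ up to floating-point error.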

Create a Core ML model

Core ML supports many kinds of machine learning models in addition to linear models, including neural networks, tree-based models, and more. Core ML models are stored in files with the .mlmodel extension, and later in this post we'll show how to instantiate an MLModel from such a file. The aim is to painlessly transition from an sklearn object to a Core ML model.

from coremltools.converters import sklearn
coreml_model = sklearn.convert(lm)
print(type(coreml_model))
<class 'coremltools.models.model.MLModel'>

Now coreml_model is our Core ML object.

The MLModel class has a few attributes and methods. Metadata contains information about the origin, author, inputs and outputs, among other things. Let's see how this works.

coreml_model.author = "DSX"
print(coreml_model.author)

We can add other metadata as we please. The list of attributes includes:

  • author: The author of the model.
  • input_description: The descriptions of the inputs. This can include information about the data types, number of features, and more. In our example, we have a single input, a real-valued number.
  • output_description: A description of the output.
  • short_description: A comment on the purpose of the model.
  • user_defined_metadata: Anything you like!

coreml_model.short_description = "I approximate a sine curve with a linear model!"
coreml_model.input_description["input"] = "a real number"
coreml_model.output_description["prediction"] = "a real number"
print(coreml_model.short_description)
I approximate a sine curve with a linear model!

At this point you have a converted and labeled Core ML object. The goal is to seamlessly integrate this into the existing workflow of an iOS/macOS application developer who needs your machine learning models. Saving the model to local storage is very easy using coremltools:

coreml_model.save('linear_model.mlmodel')

We can also create an MLModel object using a .mlmodel file.

from coremltools.models import MLModel
loaded_model = MLModel('linear_model.mlmodel')
print(loaded_model.short_description)
I approximate a sine curve with a linear model!

Save Your Model

An application developer can access your trained model through Object Storage on IBM Bluemix. You will need your Bluemix credentials to link to Object Storage, which can be generated from the data assets tab in your notebook:

You need to have some files in your data assets for this screen to be visible!

The cell below shows the code generated from this process:

credentials_1 = {
    'auth_url': 'https://identity.open.softlayer.com',
    'project': 'object_storage_9-----3',
    'project_id': '7babac2********e0',
    'region': 'dallas',
    'user_id': '9603b8************70f',
    'domain_id': '2c66d***********b9d26',
    'domain_name': '1026***',
    'username': 'member_******************',
    'password': """***************""",
    'container': 'TemplateNotebooks',
    'tenantId': 'undefined',
    'filename': '2001.csv'
}

Don’t worry about the filename in this credentials dictionary. We will define a function, put_file, that uses the security credentials generated above, together with the local .mlmodel file, to send the model to Object Storage.

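import json
import requests

def put_file(credentials, local_file_name):
    """Upload a local file to Bluemix Object Storage V3."""
    # Read in binary mode: a .mlmodel file is not plain text.
    with open(local_file_name, 'rb') as f:
        my_data = f.read()

    # Request an auth token. Several lines of the original listing were cut
    # off; the '/v3/auth/tokens' path, the headers argument, the region check,
    # and the 'accept' value below are reconstructed and should be treated as
    # assumptions.
    url1 = ''.join(['https://identity.open.softlayer.com', '/v3/auth/tokens'])
    data = {'auth': {'identity': {'methods': ['password'],
            'password': {'user': {'name': credentials['username'],
                                  'domain': {'id': credentials['domain_id']},
                                  'password': credentials['password']}}}}}
    headers1 = {'Content-Type': 'application/json'}
    resp1 = requests.post(url=url1, data=json.dumps(data), headers=headers1)
    resp1_body = resp1.json()

    # Find the public Object Storage endpoint and build the upload URL.
    # The service-type filter is an added safeguard, not in the original.
    for e1 in resp1_body['token']['catalog']:
        if e1['type'] != 'object-store':
            continue
        for e2 in e1['endpoints']:
            if (e2['interface'] == 'public' and
                    e2['region'] == credentials['region']):
                url2 = ''.join([e2['url'], '/',
                                credentials['container'], '/', local_file_name])

    # Upload the file, authenticating with the token from the response headers.
    s_subject_token = resp1.headers['x-subject-token']
    headers2 = {'X-Auth-Token': s_subject_token, 'accept': 'application/json'}
    resp2 = requests.put(url=url2, headers=headers2, data=my_data)
    print(resp2)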

Calling put_file with your credentials and linear_model.mlmodel as the local filename will send your Core ML model into Object Storage. It is now available for the iOS/macOS application developer to access through Bluemix. You can find documentation on retrieving assets from Object Storage here.
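
The endpoint lookup at the heart of put_file can be pulled into a small pure function, which makes it easy to check without a network connection. The sketch below is illustrative only: find_object_store_url is a hypothetical helper, the catalog is made-up sample data in the general shape of a Keystone v3 token response, and the filter on the 'object-store' service type is an assumption about that layout.

```python
def find_object_store_url(catalog, region):
    """Return the public object-store endpoint URL for a region, or None.

    Hypothetical helper; `catalog` is assumed to look like the
    resp1_body['token']['catalog'] list used in put_file.
    """
    for service in catalog:
        if service.get('type') != 'object-store':
            continue
        for endpoint in service.get('endpoints', []):
            if (endpoint.get('interface') == 'public' and
                    endpoint.get('region') == region):
                return endpoint['url']
    return None


# Made-up sample catalog (not real Bluemix output).
sample_catalog = [
    {'type': 'identity',
     'endpoints': [{'interface': 'public', 'region': 'dallas',
                    'url': 'https://identity.example.com'}]},
    {'type': 'object-store',
     'endpoints': [{'interface': 'internal', 'region': 'dallas',
                    'url': 'https://internal.example.com/v1/AUTH_x'},
                   {'interface': 'public', 'region': 'london',
                    'url': 'https://lon.example.com/v1/AUTH_x'},
                   {'interface': 'public', 'region': 'dallas',
                    'url': 'https://dal.example.com/v1/AUTH_x'}]},
]

# Build the upload URL the same way put_file does: endpoint/container/filename.
base_url = find_object_store_url(sample_catalog, 'dallas')
upload_url = '/'.join([base_url, 'TemplateNotebooks', 'linear_model.mlmodel'])
print(upload_url)
```

Returning None when no endpoint matches lets the caller fail with a clear message instead of hitting an undefined URL. With real credentials, the upload itself would be a single call along the lines of put_file(credentials_1, 'linear_model.mlmodel').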

Now you can convert pre-trained machine learning models that you made in DSX and provide them to a software developer for use in iOS and macOS applications.

Here is a link to the notebook in DSX where we ran this code. Please don’t hesitate to contact me or Adam Massachi if you have any questions!

Originally published at datascience.ibm.com on June 9, 2017.