Model format of MPSCNN on iOS (Metal Performance Shaders)

In iOS 10, APIs for implementing and running convolutional neural networks (CNNs) on iOS devices were added to the MetalPerformanceShaders (MPS) framework. We can now take advantage of the GPU on iOS devices to run CNN computations quickly. In other words, we can use the results of cutting-edge deep learning technologies on our devices, even while offline.

Apple provides samples for MPSCNN (MPSCNNHelloWorld / MetalImageRecognition), and I integrated them into my sample code collection repo:

Realtime Image Recognition using MPSCNN

In these sample apps, pre-trained model parameters (weights / biases) are used, and the files that hold them are named xxxx.dat.

What kind of file is this? Is it something specific to Apple? Is this the only format MPSCNN can use? How is the content laid out?

In this article, I describe the model format of MPSCNN.

File format

The `xxxx.dat` files included in Apple’s samples such as MPSCNNHelloWorld or MetalImageRecognition are ordinary binary files. They are NOT an MPSCNN-specific or Apple-proprietary format.

Such a file can be loaded into memory without any dependency on Metal or MetalPerformanceShaders, as follows:

let fd = open(filePath, O_RDONLY, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH)
assert(fd != -1, "Error: failed to open file at \"" + filePath + "\" errno = \(errno)\n")
// `size` is the length of the file in bytes.
guard let hdr = mmap(nil, size, PROT_READ, MAP_FILE | MAP_SHARED, fd, 0) else {
    close(fd)
    return nil
}

Then, to use this with MPSCNN, cast the UnsafeMutableRawPointer to UnsafePointer<Float>:

let p = UnsafePointer<Float>(hdr.assumingMemoryBound(to: Float.self))
assert(p != UnsafePointer<Float>(bitPattern: -1), "mmap failed with errno = \(errno)")

Then pass this pointer to the initializer of MPSCNNConvolution or MPSCNNFullyConnected:

public init(device: MTLDevice, convolutionDescriptor: MPSCNNConvolutionDescriptor, kernelWeights: UnsafePointer<Float>, biasTerms: UnsafePointer<Float>?, flags: MPSCNNConvolutionFlags)

MPSCNNConvolution and MPSCNNFullyConnected copy the parameters when they are initialized, so afterwards you can unmap the loaded data and close the file:

munmap(hdr, Int(size))
close(fd)

These calls are also independent of Metal and MetalPerformanceShaders; they are generic functions for working with files and memory.

In conclusion, any file format can be used, as long as the iOS app can read it.
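To make the point concrete, here is a minimal Python sketch (standard library only) of what such a `.dat` file amounts to: a headerless run of 32-bit floats. The helper names `write_raw_floats` and `read_raw_floats` and the file name are hypothetical, and the little-endian float32 layout is an assumption chosen to match what `UnsafePointer<Float>` reads on iOS devices.

```python
import mmap
import os
import struct
import tempfile


def write_raw_floats(path, values):
    """Write values as consecutive little-endian float32, with no header."""
    with open(path, "wb") as f:
        f.write(struct.pack("<%df" % len(values), *values))


def read_raw_floats(path):
    """Memory-map the file read-only and decode it as float32 values."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            return list(struct.unpack("<%df" % (size // 4), m[:size]))


# Hypothetical example: four parameters round-trip through a .dat file.
demo_path = os.path.join(tempfile.mkdtemp(), "weights_demo.dat")
write_raw_floats(demo_path, [0.5, -1.25, 2.0, 0.0])
demo_values = read_raw_floats(demo_path)
demo_size = os.path.getsize(demo_path)
```

The file size is simply the number of parameters times 4 bytes, which is also the `size` that the mmap call on the iOS side needs to be given.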

Which tools can be used to train the models?

In Apple’s MPSCNNHelloWorld and MetalImageRecognition, Apple says the parameters were trained using Google’s TensorFlow.

Another article says its authors used Chainer or Keras for training, and ported the parameters to iOS using a file format called HDF5.

Weight Ordering

Although any file format can be used for the parameter files, there is a rule for the content.

Let’s look at the initializer of MPSCNNConvolution again:

public init(device: MTLDevice, convolutionDescriptor: MPSCNNConvolutionDescriptor, kernelWeights: UnsafePointer<Float>, biasTerms: UnsafePointer<Float>?, flags: MPSCNNConvolutionFlags)

The weights of a CNN convolution layer form a 4D tensor (input channels, kernel width, kernel height, output channels), but the initializer has no argument for specifying the order of those dimensions.

So we have to pass the 4D weight tensor in the following fixed order:

weight[ outputChannels ][ kernelHeight ][ kernelWidth ][ inputChannels ]

For example, reading it with nested for loops looks like this:

for o in 0..<outputChannels {
    for ky in 0..<kernelHeight {
        for kx in 0..<kernelWidth {
            for i in 0..<inputChannels {
                let index = Int(((o * kernelHeight + ky) * kernelWidth + kx) * inputChannels + i)
                print(String(format: "\(index): %.3f", kernelWeights[index]))
            }
        }
    }
}

(reference: MPSCNN Weight Ordering — Stack Overflow)
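The same index arithmetic can be sketched in Python, together with a reordering from TensorFlow's conv2d filter layout, [height][width][in][out], into the order MPSCNN expects. This is only an illustration on flat Python lists; the function names `mps_index` and `hwio_to_ohwi` are hypothetical and not part of any library.

```python
def mps_index(o, ky, kx, i, kernel_height, kernel_width, input_channels):
    """Flat offset of weight[o][ky][kx][i] in the order MPSCNN expects."""
    return ((o * kernel_height + ky) * kernel_width + kx) * input_channels + i


def hwio_to_ohwi(flat, kernel_height, kernel_width, input_channels, output_channels):
    """Reorder a flat [height][width][in][out] list into [out][height][width][in]."""
    out = [0.0] * len(flat)
    for ky in range(kernel_height):
        for kx in range(kernel_width):
            for i in range(input_channels):
                for o in range(output_channels):
                    # Offset of the same element in the source (TensorFlow) layout.
                    src = ((ky * kernel_width + kx) * input_channels + i) * output_channels + o
                    dst = mps_index(o, ky, kx, i, kernel_height, kernel_width, input_channels)
                    out[dst] = flat[src]
    return out


# Tiny example: a 1x2 kernel with 1 input channel and 2 output channels.
# Source order is (kx=0,o=0), (kx=0,o=1), (kx=1,o=0), (kx=1,o=1).
reordered = hwio_to_ohwi([10.0, 20.0, 11.0, 21.0], 1, 2, 1, 2)
# reordered == [10.0, 11.0, 20.0, 21.0]: all weights of output channel 0 come first.
```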

Exporting from TensorFlow

import tensorflow as tf
# …
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
# … (train)

Assume that you have obtained trained weight/bias parameters using TensorFlow with the code above.

To export them to files named “weights_xxx.dat” and “bias_xxx.dat” respectively for use in MPSCNN, you can do the following:

# 'wb': tobytes() returns bytes, which require binary mode in Python 3.
with open('weights_conv1.dat', 'wb') as f:
    w_conv1_p = tf.transpose(w_conv1, perm=[3, 0, 1, 2])
    f.write(session.run(w_conv1_p).tobytes())
with open('bias_conv1.dat', 'wb') as f:
    f.write(session.run(b_conv1).tobytes())

As for the weights of a fully connected layer, the code is as follows:

with open('weights_fc1.dat', 'wb') as f:
    w_fc1_shp = tf.reshape(w_fc1, [7, 7, 64, 1024])
    w_fc1_p = tf.transpose(w_fc1_shp, perm=[3, 0, 1, 2])
    f.write(session.run(w_fc1_p).tobytes())

The points are:
  • Open the file using `open()` in binary write mode (`'wb'`).
  • Re-order the 4D weight tensor.
  • For a fully connected layer, reshape the 2D tensor to 4D, then transpose it.
  • Convert the data to bytes with `tobytes()` and write them to the file with `write()`.
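For the fully connected case, the reshape-then-transpose step can be spelled out as plain index arithmetic. The sketch below assumes the 2D weight matrix is stored row-major as [height*width*channels][outputs], with each row index encoding (y, x, channel) in that order, as the reshape above implies. The function name `fc_to_mps` is hypothetical.

```python
def fc_to_mps(flat, height, width, channels, outputs):
    """Reorder a row-major [height*width*channels][outputs] matrix into
    [outputs][height][width][channels], i.e. reshape followed by transpose."""
    inputs = height * width * channels
    out = [0.0] * (inputs * outputs)
    for row in range(inputs):        # row encodes (y, x, channel) in that order
        for o in range(outputs):
            # (y, x, channel) keep their relative order, so the destination
            # offset ((o*height + y)*width + x)*channels + c equals o*inputs + row.
            out[o * inputs + row] = flat[row * outputs + o]
    return out


# Tiny example: 1x1x2 inputs and 3 outputs.
# Rows of the source matrix are [1, 2, 3] and [4, 5, 6].
fc_reordered = fc_to_mps([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 1, 1, 2, 3)
```

Under these assumptions the whole operation reduces to a 2D matrix transpose, which is a handy sanity check when comparing exported files.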

Wrap up

  • Any file format can be used, as long as the iOS app can read it.
  • The order of the trained parameters’ 4D tensor should be:
weight[ outputChannels ][ kernelHeight ][ kernelWidth ][ inputChannels ]
  • As long as the conditions above are satisfied, any tool or library can be used, such as TensorFlow, Chainer, etc.