The challenges of machine learning on iOS

hartator
4 min read · Jun 8, 2016


I’ve just published an iOS app that relies heavily on machine learning. It tries, with more or less success, to estimate the number of calories in a dish from just a picture of the food. It looks something like this:

Final version of the app, sorry for the poor gif quality.

In this post, I’ll try to explain the challenges I ran into while developing this app.

First, I wanted to use TensorFlow. I had already learned it for another project, it seemed to be the go-to solution people were using, and it was growing fast. Indeed, Google’s backing, for better or worse, together with its poaching of the best people from the open source machine learning community, had made TensorFlow a no-brainer for anyone starting a new machine learning project. Unfortunately, TensorFlow still has shortcomings. Because of its distributed design, TensorFlow is about 3x to 4x slower than other frameworks and is a bit hungrier for RAM. Both CPU and RAM are scarce resources on an iPhone. Finally, no one seems to have managed to make TensorFlow run on iOS yet.

I then decided to learn Caffe. Caffe is an awesome machine learning framework developed by a team of researchers at the Berkeley Vision and Learning Center. It specializes in computer vision using convolutional neural networks, which seemed a good fit for this project. Some members of the Caffe community also seemed to have already managed to get parts of Caffe running in Xcode. It was a promising start. The less promising part was that one of the developers of Caffe had been hired by Google to work on TensorFlow, which will likely make TensorFlow more prominent in the future. However, Caffe models and TensorFlow models are pretty similar. I’ve actually already found tools that convert Caffe models to TensorFlow models with little work involved, so a potential transition in the future should be easy.

Unfortunately, Caffe wasn’t playing well with Xcode 7.x, and some parts needed to be tweaked. Moreover, on iPhones I have only 1 GB of RAM to play with, and some Caffe models inflate to more than 1 GB of RAM when they are running. Memory usage also seems to depend on the computation: the computation differs from picture to picture, so memory usage differs as well. That makes some bugs difficult to catch. Plus, when you are right at the limit of the iPhone’s RAM, sometimes it will work and sometimes it won’t, for the exact same picture. Memory management is a big issue when you want to run convolutional neural networks on iOS.
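To make the memory constraint more concrete, here is a back-of-the-envelope check written in Python (not the app’s actual code). It assumes a model stores its weights and intermediate activations as 32-bit floats. The function names, the example counts and the `usable_fraction` parameter are all hypothetical; the only figure taken from this post is the 1 GB of device RAM.

```python
BYTES_PER_FLOAT32 = 4


def estimated_footprint_bytes(num_weights: int, num_activations: int) -> int:
    """Rough float32 footprint: weights plus intermediate activations."""
    return (num_weights + num_activations) * BYTES_PER_FLOAT32


def fits_in_budget(footprint_bytes: int, device_ram_bytes: int,
                   usable_fraction: float = 0.5) -> bool:
    """Check the footprint against the share of RAM the app may use.

    usable_fraction is an assumption: the OS and other apps need memory
    too, so an app cannot count on the whole device RAM.
    """
    return footprint_bytes <= device_ram_bytes * usable_fraction


# Hypothetical model: 60M weights and 20M activation values.
DEVICE_RAM = 1024 ** 3  # 1 GB, as on the iPhones mentioned above
footprint = estimated_footprint_bytes(60_000_000, 20_000_000)
print(footprint, fits_in_budget(footprint, DEVICE_RAM))
```

A static estimate like this will not catch the per-picture variation described above, so it is only a first filter before measuring on a real device.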

The app’s technical architecture is still pretty minimal. I’m running only 2 models: one identifies the food, and one rates the food on a scale from greasy to lean. Coupled with a database of average calories, a small algorithm is then able to do basic calorie estimation. The food identification model can distinguish between 97 different foods. The greasiness/leanness model is still pretty raw: it has been trained on only around 500 images.
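One way such a small algorithm could combine the two model outputs with the calorie database is sketched below in Python. This is an illustration under assumptions, not the app’s real logic. The table values are placeholders, and the greasiness score in [0, 1], the linear adjustment and the `max_adjustment` parameter are all hypothetical.

```python
# Hypothetical average-calorie table (kcal per typical serving).
# The values are placeholders for illustration, not data from the app.
AVERAGE_KCAL = {
    "pizza": 285.0,
    "salad": 150.0,
}


def estimate_calories(label: str, greasiness: float,
                      max_adjustment: float = 0.3) -> float:
    """Scale the average calories for a food by a greasiness score.

    label is the output of the food-identification model.
    greasiness is assumed to lie in [0, 1]: 0 = very lean, 1 = very greasy.
    A score of 0.5 leaves the average unchanged; the extremes shift it by
    +/- max_adjustment (30% by default, an arbitrary choice).
    """
    if not 0.0 <= greasiness <= 1.0:
        raise ValueError("greasiness must be in [0, 1]")
    base = AVERAGE_KCAL[label]  # raises KeyError for an unknown food
    factor = 1.0 + max_adjustment * (2.0 * greasiness - 1.0)
    return base * factor


# A neutral greasiness score returns the table average unchanged.
print(estimate_calories("pizza", 0.5))
```

Keeping the lookup table separate from the adjustment means new foods can be added without touching the estimation code.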

Caffe model being evaluated natively on iOS

Xcode compilation times were an issue. I managed to cut them down by forcing some optimizations and caching. Nevertheless, and oddly, one time out of ten Xcode recompiles everything, which usually takes a few minutes. It can be extremely frustrating.

There are obvious paths for improvement.

First, I need to expand our database so the app can identify more foods. Covering 300–400 types of food would make the app more precise. Having more rated images of greasy and lean food would be awesome as well. Adding a model to measure precise quantity seems necessary too, but it’s harder to build: food quantity needs to be trained against each different kind of food, whereas greasiness is easier to evaluate. Last but not least, right now the app can only work with one food at a time. Caffe on iOS is pretty slow because it uses only the CPU. It doesn’t use the GPU or the Metal API, which would be faster. Having it run faster would make it possible to identify multiple food items per picture without making the user wait forever. These changes will add more precision to the app. Probably in the next version.


hartator

Passion for beautiful code, lunatic enterprises and ludicrous dreams