Build Your Own AI

React Native Image Recognition with CoreML

Trevor Hardy
Your Virtual Self
5 min read · Nov 30, 2018

Photo by Joel Filipe on Unsplash

Ever wanted to build image recognition into a React Native project, but had no clue where to start? Been there! I’m going to show you the steps I took to create an image recognition feature in our app that sees the user’s face and determines their emotion.

The end result? Here’s a preview, which is by no means a production-ready experience, but you get the idea.

So how do you build this? Let’s go into it step-by-step:

  1. Gather a massive amount of images that will be used to train the CoreML model.
  2. Clean the images and organize them by “classification”.
  3. Use TuriCreate to build the CoreML model.
  4. Use the model in the app.

Quick note: This article is written specifically for use with CoreML, which is an iOS-only solution.

🧰 Necessary tools and resources

Without the help of these libraries, scripts, and tutorials, we would still be trying to figure this stuff out. Huge shout out to the authors for giving us some very helpful tools to make this work.

🐿 STEP ONE: Gather images for training the CoreML model

For those of you who don’t know what CoreML is, head on over here to get caught up. In order to have an accurate machine-learning model, the model needs to “learn” what we want it to correctly classify. In our case, we wanted the CoreML model to look at a streaming or static image of a user’s face and recognize their emotion from their facial expression. To accomplish this, we need thousands of images depicting each emotion we want the model to recognize. Don’t worry, there’s a way to do this fairly quickly!

By using this Python script, we can download images from Google Images straight from the command line. Follow the instructions there for installation and configuration; it will only take a minute. Keep in mind that Google does not own these images: they belong to their respective owners, so be mindful of how you use them.

When you have it installed, you should be able to run something similar to this:

➜  maslo git:(coreML) googleimagesdownload --keywords="joy smile person" --limit=2000 --color_type=black-and-white --format=jpg --size=medium --type=face --chromedriver=/Users/me/Desktop/chromedriver

You can see that by putting in keywords such as “joy smile person”, the script searches Google and quickly downloads 2,000 images to a folder on your computer. For each emotion you want to recognize, run the script in a different folder. It’s a good idea to keep your pictures separated by emotion, because in a couple of steps we’re going to need them that way.
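If you’d rather script the whole batch than run the command once per emotion, the same library also exposes a Python API. Here’s a rough sketch of that approach; the emotion/keyword pairs and the chromedriver path are placeholders you’d swap for your own:

```python
# Hedged sketch: batch-download one folder of images per emotion using
# google_images_download's Python API (same options as the CLI above).
from google_images_download import google_images_download

# Placeholder emotion/keyword pairs; use your own search terms.
EMOTIONS = {
    "joy": "joy smile person",
    "sadness": "sad frown person",
    "anger": "angry scowl person",
}

downloader = google_images_download.googleimagesdownload()
for emotion, keywords in EMOTIONS.items():
    downloader.download({
        "keywords": keywords,
        "limit": 2000,
        "color_type": "black-and-white",
        "format": "jpg",
        "size": "medium",
        "type": "face",
        "image_directory": emotion,  # lands in downloads/<emotion>/
        "chromedriver": "/Users/me/Desktop/chromedriver",
    })
```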

Another tip: I’ve found it best to go with medium-sized, black-and-white images with the “type” set to face. Black-and-white images make it clear to the model that we don’t care about the color of whatever the camera sees; we care about the user’s face. Which brings us to step two.

🧽 STEP TWO: Clean the images

There’s a VERY high chance that when you download pictures of happy people, or people smiling, you’re going to get a few images that show neither. I got a lot of animated emoji images, which obviously look nothing like an actual person showing emotion. Make sure to remove these from your newly downloaded collection. The more consistent the images are with each other, the more accurate your model will be.

In order to reduce confusion for the model on what to focus on, we need to crop all the images to show the face as the primary focus area. This step is important. Without doing this, we run the risk of the model training itself to recognize something else, such as clothing or surroundings. Remember, all we want to train on is facial expressions.

There’s a handy Python script found here that can do this for us programmatically. Huge shout out to the creators of the script; it will save you hours of time.
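For intuition, here’s a minimal sketch of what a face cropper like this typically does, using OpenCV’s bundled Haar cascade (the linked script’s actual implementation may differ):

```python
# Rough sketch of a face cropper using OpenCV's bundled Haar cascade.
# The linked script's details may differ; this just shows the idea.
import sys

import cv2

def crop_faces(path):
    image = cv2.imread(path)
    if image is None:
        return  # not a readable image; skip it
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    # Detect faces; tweak scaleFactor/minNeighbors if you get false positives.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for i, (x, y, w, h) in enumerate(faces):
        # path[:-4] strips ".jpg"; output names end in "crop.jpg" so they
        # match the *crop.jpg glob used later in this article.
        cv2.imwrite(f"{path[:-4]}_{i}_crop.jpg", image[y:y + h, x:x + w])

if __name__ == "__main__":
    crop_faces(sys.argv[1])
```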

Once you’ve downloaded the script, make sure you’re in the directory of the particular emotion whose images you want to crop. The script saves all the cropped images in the same directory you run it in. You’ll end up doing something like this:

➜  maslo git:(coreML) for f in *.jpg; do python3 face_cropper.py "$f"; done

If everything executed correctly, you’ll notice that the number of images in your working folder has doubled. The cropped images have “crop.jpg” at the end of their file names. Let’s get those images into their own directory. Here’s one way to do that from the command line; otherwise, just “cut and paste” into a new folder:

cp *crop.jpg /Users/me/Documents/maslo_images/google-images-download/downloads/joy

Perfect. We now have lots of faces to use to train our model.

🏋️‍♂️ STEP THREE: Use TuriCreate to generate a CoreML Model

For this step, I followed a very well-written tutorial, found here. The author, Khoa Pham, does an awesome job explaining how everything works, so I’ll defer to him.

Make sure to read the section about SqueezeNet!!! YOU MUST USE SQUEEZENET. It essentially compresses your model from a would-be ~95MB file size to ~5MB! If we want to use this model in a mobile app, the size must be small.
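For reference, the heart of a TuriCreate training script looks roughly like this. The downloads/ folder layout (one sub-folder per emotion) and the output file name are assumptions based on our setup:

```python
# Condensed sketch of the TuriCreate training step, per its documented API.
# Assumes one sub-folder per emotion under downloads/ (e.g. downloads/joy/).
import os

import turicreate as tc

# Load every image, keeping its path so we can derive a label from it.
data = tc.image_analysis.load_images("downloads", with_path=True)

# downloads/joy/123.jpg -> "joy"
data["label"] = data["path"].apply(lambda p: os.path.basename(os.path.dirname(p)))

train, test = data.random_split(0.8)

# squeezenet_v1.1 is what keeps the exported model at ~5MB.
model = tc.image_classifier.create(train, target="label", model="squeezenet_v1.1")

print(model.evaluate(test)["accuracy"])
model.export_coreml("Emotions.mlmodel")
```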

You should end up with a ~5MB .mlmodel file if you follow his instructions correctly. Next step!

⛵️ STEP FOUR: Use the model in your app

If it ain’t broke, don’t fix it! We found this tutorial to be absolutely perfect for implementing the new CoreML model in our app. Follow their instructions, utilizing your newly created .mlmodel file. Tweak the code in the app to your liking and needs, and you’ll have successfully integrated an image recognition feature into your iOS React Native app! 🚀
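One last tip: before you spend time debugging the app side, you can sanity-check the exported .mlmodel straight from Python with coremltools (predictions only run on macOS). The input name “image” and the 227×227 size below follow TuriCreate’s SqueezeNet defaults; treat them as assumptions and verify against your model’s printed spec:

```python
# Optional sanity check of the exported model before wiring it into the app.
# Runs predictions on macOS only. Input name and size are assumptions based
# on TuriCreate's SqueezeNet defaults; confirm them via print(model).
import coremltools as ct
from PIL import Image

model = ct.models.MLModel("Emotions.mlmodel")
print(model)  # shows input/output names and shapes

# Hypothetical test image path; use any cropped face from your dataset.
img = Image.open("downloads/joy/example_crop.jpg").resize((227, 227))
print(model.predict({"image": img}))
```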

Photo by Debby Hudson on Unsplash

So what are you going to build with these resources? You aren’t limited to just facial recognition — what about determining the name of a fruit you see in an image? Or recognizing sports teams by their jerseys? The possibilities are endless. Again, a huge shout out to those cited above! Together, their resources make this process much easier. Till next time!

Maslo is currently in beta. Get it on the App Store.
