Are you Chinese, Japanese or Korean?

Bibek Chaudhary
4 min read · Oct 31, 2018

Image Classifier based on fast.ai lecture

One of the challenges I face living in South Korea is telling Chinese, Japanese and Korean people apart. The similarities in their appearance have led to many awkward moments during my stay.

I wish to avoid those awkward moments by building a classifier that will differentiate between them for me.

First, we need a dataset to train an image classifier. Since I did not find any public dataset for this task, I created my own: images of Chinese, Japanese and Korean people of both genders (and of all ages) were scraped from the internet. I ended up with a dataset that contained 171 images of Chinese, 168 images of Japanese, and 167 images of Koreans. Samples from the dataset are shown below.

female samples from dataset
male samples from dataset

Now that we have the dataset, we can build a model to train the image classifier. The architecture used to train on this dataset was Resnet50, pre-trained on the ImageNet dataset. You can learn more about this architecture in this post.

learner for image classifier

At first, only the last layer of the Resnet50 was trained, with the weights of all other layers frozen. The accuracy after 20 epochs was around 70%.

Training details for frozen Resnet50

70%? Not bad, huh?
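To put 70% in context, it helps to compare it with a trivial baseline. The sketch below uses the class counts quoted earlier (171, 168 and 167 images). It assumes, since the post does not say, that the validation set has roughly the same class proportions as the full dataset.

```python
# Class counts quoted earlier in the post.
class_counts = {"Chinese": 171, "Japanese": 168, "Korean": 167}

total = sum(class_counts.values())

# Baseline 1: always predict the most frequent class.
majority_class = max(class_counts, key=class_counts.get)
majority_baseline = class_counts[majority_class] / total

# Baseline 2: guess uniformly at random among the three classes.
random_baseline = 1 / len(class_counts)

print(f"total images:      {total}")
print(f"majority baseline: {majority_baseline:.1%} (always '{majority_class}')")
print(f"random baseline:   {random_baseline:.1%}")
```

Both baselines land near one third, so under that assumption 70% is roughly twice as good as guessing.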

Now let’s fine-tune to see if it makes the model better.

We unfreeze and train all the layers. We will use a learning rate of 1e-6 for the first layer and 1e-5 for the last layer. All the other layers will be trained with learning rates that fall in the range [1e-6, 1e-5]. The first and other early layers are trained with smaller learning rates because they learn generic features, whereas the later, deeper layers learn task-specific features. To learn more about it, read this.
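One way to picture how learning rates can be spread over the range [1e-6, 1e-5] is sketched below. It is only an illustration: the helper `spread_learning_rates`, the geometric spacing, and the split into three groups of layers are all assumptions, since the post does not say how the intermediate rates are chosen.

```python
def spread_learning_rates(lowest, highest, n_groups):
    """Return n_groups learning rates from lowest to highest, spaced geometrically.

    Hypothetical helper for illustration; not taken from the post's code.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if n_groups == 1:
        return [highest]
    # Constant multiplicative step between consecutive groups.
    ratio = (highest / lowest) ** (1 / (n_groups - 1))
    return [lowest * ratio ** i for i in range(n_groups)]


# Assumption: the network is split into 3 groups of layers.
group_names = ["early layers", "middle layers", "last layers"]
rates = spread_learning_rates(1e-6, 1e-5, n_groups=len(group_names))

for name, lr in zip(group_names, rates):
    print(f"{name:>13}: {lr:.2e}")
```

The early layers get the smallest step, so the generic features learned on ImageNet change the least, while the last layers are free to adapt more to the new task.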

So after unfreezing and training for 20 epochs, the accuracy was still around 70%. The performance did not improve with fine-tuning. One reason could be that our dataset is similar to ImageNet, which has human images as well.

Training details for unfrozen Resnet50

Now, let’s interpret the results. We will start by plotting a confusion matrix, which compares the predictions of the image classifier with the actual labels.

confusion matrix

Let’s take the first column of the matrix and interpret its meaning.

17 images of Chinese people were correctly predicted as Chinese by the classifier, but 5 images of Japanese people were incorrectly predicted as Chinese, and 8 Korean images were mistaken for Chinese.
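The same column can be summarised as a single number, the precision for the "Chinese" prediction. The sketch below uses only the three counts quoted above and assumes that the column collects every image the classifier labelled as Chinese.

```python
# First column of the confusion matrix, as described in the post:
# images the classifier predicted as "Chinese", keyed by their actual label.
predicted_chinese = {"Chinese": 17, "Japanese": 5, "Korean": 8}

total_predicted = sum(predicted_chinese.values())
correct = predicted_chinese["Chinese"]

# Precision: of everything predicted "Chinese", how much really was Chinese?
precision = correct / total_predicted

print(f"predicted as Chinese:  {total_predicted}")
print(f"precision for Chinese: {precision:.1%}")
```

So when the classifier says "Chinese", it is right a little more than half of the time (17 out of 30).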

This becomes clearer if we look at samples of the cases that were confused.

samples of confused cases

We can also see the cases in which the classifier was most confused.

most confused cases
The classifier was most confused between Korean and Chinese. It misclassified Korean as Chinese 8 times.
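For readers curious how such a "most confused" ranking can be derived, here is a minimal sketch in plain Python. The function `most_confused` below is a hypothetical helper written for this illustration, and the labels are made up; they are not the validation results from this post.

```python
from collections import Counter


def most_confused(actual, predicted, min_count=1):
    """Return (actual, predicted, count) for misclassified pairs, most frequent first."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Count only the pairs where the prediction disagrees with the true label.
    errors = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return [(a, p, n) for (a, p), n in errors.most_common() if n >= min_count]


# Made-up labels for illustration only; NOT the post's validation results.
actual = ["Korean", "Korean", "Korean", "Japanese", "Chinese", "Japanese"]
predicted = ["Chinese", "Chinese", "Korean", "Chinese", "Chinese", "Japanese"]

for a, p, n in most_confused(actual, predicted):
    print(f"{a} predicted as {p}: {n}")
```

Raising `min_count` hides the rare mistakes and keeps only the pairs the classifier mixes up repeatedly.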

Some Thoughts:

  1. Differentiating between Chinese, Japanese and Koreans is not an easy task, even for a trained classifier.
  2. The accuracy of the classifier might improve with longer training.
  3. fastai and @jermey will help me create things that make my life (and hopefully others' lives) easier.

Code and data for this post can be found in this repo.
