Project Introduction

Lucrece (Jahyun) Shin
4 min read · Aug 17, 2021


In my previous post, I wrote about my journey from being a fresh mechanical engineering graduate to starting a master's degree in machine learning at the University of Toronto. In this post, I would like to share my process of getting started with my deep learning/computer vision research project in an academic research setting.

Late August - Early September 2020

Getting Started

During the first round of emails with my research supervising professor, he gave me the following brief description of the project:

Image classification. Transfer learning. Did these terms sound familiar to me? Kind of. While completing Udacity's Deep Learning Nanodegree program I talked about in my previous post, I had done several transfer learning projects for image classification, where I downloaded a model with weights pre-trained on the 1000-class ImageNet dataset (from PyTorch libraries) and fine-tuned it using my own image dataset with fewer classes. Those were hello-world-type projects for deep learning applications in computer vision.
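For context, here is a minimal sketch of what those hello-world transfer learning projects looked like, assuming torchvision's ImageNet-pre-trained ResNet50; the class count and learning rate below are placeholders, not values from this project:

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet50 with weights pre-trained on the 1000-class ImageNet dataset
model = models.resnet50(pretrained=True)

# Freeze the pre-trained backbone so only the new head gets trained
for param in model.parameters():
    param.requires_grad = False

# Replace the 1000-class ImageNet head with one sized for our own labels
num_classes = 5  # placeholder: however many classes the custom dataset has
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Fine-tune only the parameters of the new classification head
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
```

Freezing the backbone and training only the new head is the simplest variant; unfreezing some of the later layers for fine-tuning is a common next step.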

During our first Skype call (due to COVID-19), he added that the image dataset for the project was composed of X-ray images of bags from an international airport's X-ray security scanner. My task would be to develop a deep learning model that can identify bags containing dangerous items, a task currently performed by human operators at the airport.

He shared with me a dataset folder containing the X-ray baggage images. A sample image looked like this:

An X-ray baggage scan image containing a knife

He explained that the international airport associated with this project provided only a small number of X-ray baggage scan images per class of dangerous object, which may not be sufficient to train a deep learning model, since such models usually need a large amount of data to perform robustly. For that reason, he wanted me to try a technique called Domain Adaptation: first collecting a large number of publicly available normal (non-X-ray) images of dangerous objects (e.g. from Google image search), then using only those normal images (no X-ray images) to train a model to detect the dangerous objects. I would then find a way to adapt the trained model to perform the same job well on the X-ray images.

From my brief research about Domain Adaptation after the Skype call, I found the following definition on Wikipedia:

"Domain Adaptation is the ability to apply an algorithm trained with a source domain to a different target domain." (Wikipedia)

A normal image containing a knife
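To make the source/target distinction concrete, here is a minimal sketch of how the domain gap shows up in evaluation. The loader names are hypothetical placeholders, not the project's actual pipeline:

```python
import torch

def evaluate(model, loader, device="cuda"):
    """Compute classification accuracy of a model on one domain's data."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

# source_loader: normal (non-X-ray) images the model was trained on
# target_loader: X-ray scans the model has never seen
# Without adaptation, accuracy typically drops sharply on the target domain:
# evaluate(model, source_loader)  # e.g. high
# evaluate(model, target_loader)  # e.g. much lower
```

Domain adaptation methods aim to close that gap without needing many labeled samples from the target domain.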

Some previous students who had worked on this project gave me their Google Colaboratory notebook. They had worked on the same X-ray dataset but used transfer learning without domain adaptation, meaning that they used the small set of X-ray images as training images. In their notebook, I found a table showing an astonishing performance of their model:

Recall Table: The left column lists the classes of dangerous items and the right column shows the model’s recall for detecting each item.

When I saw this, I doubted whether there was anything I could do to improve such high recalls for each class. After taking a closer look at the X-ray image dataset given by the airport, however, I found that the dataset contained many duplicates of the same image that were rotations of each other. So if the training/validation/test sets were split among the duplicates, there could have been data leakage between the three partitions. For example, if the test set contained rotations of images from the training set, it is not surprising that the model performed well on the test set, since it had already seen those test images during training. Considering that there was only a small number of X-ray images to begin with (which could easily overfit the model), data leakage on top of that would have made the results look good while preventing the model from generalizing well to unseen data. So I dived right into inspecting the dataset more closely, which I will talk about in my next post.
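One way to guard against this kind of leakage is to group rotated copies of the same scan before splitting. Here is a minimal sketch, assuming the duplicates are exact 90-degree rotations; near-duplicates with noise or arbitrary angles would need perceptual hashing instead:

```python
import hashlib
import numpy as np
from PIL import Image

def rotation_invariant_hash(path):
    """Hash an image so that exact 90-degree rotations share the same key."""
    img = np.asarray(Image.open(path).convert("L"))
    # Hash all four 90-degree rotations and keep the smallest digest;
    # every rotated copy of the same scan then maps to an identical key.
    digests = [
        hashlib.md5(np.ascontiguousarray(np.rot90(img, k)).tobytes()).hexdigest()
        for k in range(4)
    ]
    return min(digests)

# Group files by this hash, then split train/val/test by GROUP rather than
# by file, so rotated copies of one scan never end up in different partitions.
```

Splitting by group rather than by individual file guarantees that all rotations of one scan land in the same partition.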

At this point, I gave the following title to the project:

Automatic Threat Detection in Airport X-ray Security Imaging using Transfer Learning and Deep Domain Adaptation

In my future posts, I will share my 12-month research journey for the Automatic Threat Detection project, divided into the following 9 chapters:

  1. Data Inspection/Pre-processing for X-ray Images
  2. Iterative Data Collection for Source Domain
  3. Transfer Learning with ResNet50 I: Dataloaders to Training
  4. Transfer Learning with ResNet50 II: Performance Analysis to Unexpected Riddle
  5. t-SNE plots as human-machine translator
  6. Optimizing Data for Flexible Image Recognition
  7. Debugging black box of CNNs using feature visualizations
  8. Adversarial Discriminative Domain Adaptation (ADDA)
  9. Transfer Learning with Vision Transformer (ViT)

Thanks for reading! ♥️
