Image Classifier using Deep Learning

Learn how to classify bears into grizzly, teddy and black

Malika Arora
6 min read · Sep 25, 2020

Here, we will walk through building a simple image classification model using deep learning.

Import Packages

from utils import *                   # helpers shipped with the course notebooks (e.g. search_images_bing)
from fastai.vision.widgets import *   # notebook widgets such as ImageClassifierCleaner

Gathering Data

To download images with Bing Image Search, sign up at Microsoft for a free account. You will be given a key, which you can copy and enter in a cell as follows, replacing 'XXX' with your key and executing it:

key = os.environ.get('AZURE_SEARCH_KEY', 'XXX')

Once you’ve set key, you can use search_images_bing. This function is provided by the small utils module included with the notebooks online. If you're not sure where a function is defined, you can just type its name in a cell and execute it to find out:

search_images_bing 

Output:

<function utils.search_images_bing(key, term, min_sz=128)>

search_images_bing: This function returns a list of URLs matching the search term.

results = search_images_bing(key, 'grizzly bear')
ims = results.attrgot('content_url')
len(ims)

Output:

150

This returns 150, indicating that we now have a list of 150 grizzly bear image URLs that Bing Image Search found for us.

Let’s look at one:

dest = 'images/grizzly.jpg'
download_url(ims[0], dest)
dest

Output:

'images/grizzly.jpg'

im = Image.open(dest)
im.to_thumb(128,128)

This opens the image we just downloaded and displays it as a 128x128 thumbnail.

Output: (a thumbnail of the downloaded grizzly bear image)

This seems to have worked nicely, so let’s use fastai’s download_images to download all the URLs for each of our search terms. We'll put each in a separate folder:

bear_types = 'grizzly','black','teddy'
path = Path('bears')

We’ve defined the types of our bears.

if not path.exists():
    path.mkdir()
    for o in bear_types:
        dest = (path/o)
        dest.mkdir(exist_ok=True)
        results = search_images_bing(key, f'{o} bear')
        download_images(dest, urls=results.attrgot('content_url'))

Here, we make a subdirectory for each type of bear, run a Bing search for each category name, and download the images at the returned URLs into the corresponding folder.

Our folder has image files, as we’d expect:

fns = get_image_files(path)
fns

Output:

(#417) [Path('bears/black/00000000.jpg'),Path('bears/black/00000001.jpg'),Path('bears/black/00000002.jpg'),Path('bears/black/00000003.jpeg'),Path('bears/black/00000004.jpg'),Path('bears/black/00000005.jpg'),Path('bears/black/00000006.jpg'),Path('bears/black/00000007.jpg'),Path('bears/black/00000009.jpg'),Path('bears/black/00000010.jpg')...]

Often when we download files from the internet, there are a few that are corrupt. Let’s check:

failed = verify_images(fns)
failed

To remove all the failed images, you can use unlink on each of them. Note that, like most fastai functions that return a collection, verify_images returns an object of type L, which includes the map method. This calls the passed function on each element of the collection:

failed.map(Path.unlink);
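
As a quick illustration of L and map (a minimal sketch using fastcore, the utility library that fastai is built on):

from fastcore.foundation import L

xs = L(1, 2, 3)                   # an L behaves like a supercharged list
print(xs.map(lambda x: x * 10))   # applies the function to each element: (#3) [10,20,30]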

NOTE: Jupyter has a lot of functionality to help you figure out how to use different functions, or even directly look at their source code. For instance,

In a cell, typing ?func_name and executing will open a window with the signature of the function and a short description.

In a cell, typing ??func_name and executing will open a window with the signature of the function, a short description, and the source code.
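
For example, running either of these in a cell (verify_images is already in scope from the imports above):

?verify_images    # window with the signature and a short description
??verify_images   # window with the signature, description, and source code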

From Data to DataLoaders

DataLoaders is a thin class that just stores whatever DataLoader objects you pass to it, and makes them available as train and valid.
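
A simplified sketch (hypothetical plain Python, not fastai's exact source) of what such a class looks like:

# A sketch of the DataLoaders idea: store some loaders, expose train/valid.
class DataLoadersSketch:
    def __init__(self, *loaders):
        self.loaders = loaders         # e.g. (train_dl, valid_dl)

    def __getitem__(self, i):
        return self.loaders[i]

    @property
    def train(self): return self[0]    # first loader: training set

    @property
    def valid(self): return self[1]    # second loader: validation set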

First we provide a tuple specifying what types we want for the independent and dependent variables. The independent variable is the thing we are using to make predictions from, and the dependent variable is our target.

bears = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))
  • ImageBlock is the independent variable, i.e. the input images, and CategoryBlock is the dependent variable, i.e. the labels (grizzly/black/teddy).
  • get_items gets a list of the image files.
  • RandomSplitter splits the data into a validation set (20%) and a training set. Setting seed=42 gives the same split every time.
  • Conventionally, x is the independent variable and y the dependent variable, so get_y tells the DataBlock how to create the labels. parent_label gets the name of the folder a file is in; because we put each bear image into a folder named after its type, this gives exactly the labels we need (see the sketch after this list).
  • item_tfms specifies item transforms; here, Resize(128) resizes every image to 128x128.
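
For instance, parent_label simply returns the name of a file's parent folder (the path below is hypothetical but follows our folder layout):

parent_label(Path('bears/grizzly/00000000.jpg'))   # -> 'grizzly'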

The following command returns a DataLoaders object, dls, which feeds the model batches of a few items at a time:

dls = bears.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)

By default Resize crops the images to fit a square shape of the size requested, using the full width or height. This can result in losing some important details. Alternatively, you can ask fastai to pad the images with zeros (black), or squish/stretch them:

bears = bears.new(item_tfms=Resize(128, ResizeMethod.Squish))
dls = bears.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)

bears = bears.new(item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros'))
dls = bears.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)

All of these approaches are somewhat wasteful or problematic: squished or padded images don't look like real photos, so the model is learning from images that differ from what they actually are. Instead, what we normally do in practice is randomly select part of the image and crop to just that part. On each epoch (which is one complete pass through all of the images in the dataset) we randomly select a different part of each image. This means that our model can learn to focus on, and recognize, different features in our images. It also reflects how images work in the real world: different photos of the same thing may be framed in slightly different ways. For this, we replace Resize with RandomResizedCrop.

Over-fitting: if we train a model on the same data for too long, it begins to memorize the training set instead of learning general features, and its accuracy on new data degrades.

Note: min_scale=0.5 means each random crop covers at least 50% of the original image.

bears = bears.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms())
dls = bears.dataloaders(path)

RandomResizedCrop is the most common approach since it gives the model a different version of each image in every epoch. Together with batch_tfms=aug_transforms(), which applies further random transforms (such as rotation, flipping, and lighting changes) to whole batches, this is called data augmentation. It helps prevent over-fitting, because the model never sees exactly the same picture twice.
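
One quick way to see the augmentation at work is to show the same image several times with different random transforms applied (show_batch supports this via unique=True):

dls.train.show_batch(max_n=8, nrows=2, unique=True)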

We can now create our Learner and fine-tune it in the usual way:

Fine-tuning: A transfer learning technique where the parameters of a pretrained model are updated by training for additional epochs using a different task to that used for pretraining.

learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)

Here,

  • dls is your data.
  • resnet18 is the architecture, i.e. the mathematical function being optimized; 18 is the number of layers.
  • error_rate is the metric: the percentage of images the model classifies incorrectly.

The Learner brings the architecture and the data together and finds the parameter values that best fit the data.

Metric and loss are closely related, but they serve different purposes. The loss must change even with the slightest changes in the model's parameters, so the training procedure can use it to decide how to update them; a metric may not move at all for small changes. The metric is the measure you actually care about; the loss is the measure the computer uses to judge performance and update your parameters.
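
As a minimal sketch with made-up numbers (plain PyTorch, two images, three classes): both sets of predictions below pick the correct class, so the metric (error rate) is identical, but the loss still distinguishes a confident model from an unconfident one.

import torch
import torch.nn.functional as F

targets = torch.tensor([0, 0])                  # both images truly belong to class 0
confident   = torch.tensor([[4.0, 0.0, 0.0],    # strong scores for class 0
                            [4.0, 0.0, 0.0]])
unconfident = torch.tensor([[0.5, 0.3, 0.2],    # barely prefers class 0
                            [0.5, 0.3, 0.2]])

for logits in (confident, unconfident):
    loss = F.cross_entropy(logits, targets)                     # what the computer optimizes
    error = (logits.argmax(dim=1) != targets).float().mean()    # what we care about
    print(f'loss={loss:.3f}  error_rate={error:.3f}')
# The loss drops as confidence rises; the error rate is 0.000 in both cases.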

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()

The rows represent all the black, grizzly, and teddy bears in our data set, respectively. The columns represent the images which the model predicted as black, grizzly, and teddy bears, respectively. Therefore, the diagonal of the matrix shows the images which were classified correctly, and the off-diagonal cells represent those which were classified incorrectly.
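
If you want the raw counts behind the plot, ClassificationInterpretation also exposes them directly (the numbers in the comment are hypothetical; yours will depend on the downloaded data):

interp.confusion_matrix()
# e.g. array([[35,  2,  0],    # actual black:   2 mistaken for grizzly
#             [ 1, 40,  0],    # actual grizzly: 1 mistaken for black
#             [ 0,  0, 38]])   # actual teddy:   all correct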

interp.plot_top_losses(5, nrows=1)

This is a very useful way to inspect the model's errors: it shows the images with the highest loss, i.e. the ones the model got most wrong or was least confident about.

Each image is titled with Prediction/Actual/Loss/Probability.

Note: a title like black/black means the prediction and the actual label were both black, but the loss was still high because the model had little confidence in its (correct) answer.

cleaner = ImageClassifierCleaner(learn)
cleaner

This widget lets us review the data, showing the images with the highest loss in each category so we can spot problems. We can delete images that don't belong in the dataset, or re-label images that ended up in the wrong category.

The following loops apply those choices, unlinking deleted files and moving re-labelled ones into the right folder:

for idx in cleaner.delete(): cleaner.fns[idx].unlink()                          # remove files marked for deletion
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)   # move re-labelled files

Finally, to read the full documentation for fine_tune, you can use doc:

doc(learn.fine_tune)

Congratulations! You’ve successfully created a model :)

References

https://www.fast.ai
