Deep Learning for Bear Image Classification Using PyTorch, fastai, and the DuckDuckGo API
Let's set up the environment in Google Colab.
Check my ipynb file here!
#hide
! [ -e /content ] && pip install -Uqq fastbook
import fastbook
fastbook.setup_book()

#hide
from fastbook import *
from fastai.vision.widgets import *
Data Gathering through DuckDuckGo API
bear_types = 'grizzly','black','teddy' # the types of bear we'd like to download images of
path = Path('bears')

if not path.exists():
    path.mkdir() # create a directory to save the downloaded images
    for o in bear_types:
        dest = (path/o)
        dest.mkdir(exist_ok=True)
        urls = search_images_ddg(f'{o} bear') # search for images through DuckDuckGo
        download_images(dest, urls=urls) # download the URLs for each search term into its own folder

fns = get_image_files(path)
fns
(#568) [Path('bears/teddy/f97da50d-4f1a-4a62-ad93-0575d8fe92ae.jpeg'),Path('bears/teddy/669d7bba-02a8-411e-b70f-b8bd82628121.jpg'),Path('bears/teddy/af9a2a15-07e5-4a3b-a202-e89856d7d42a.jpg')...]
Often when we download files from the internet, a few are corrupt. Let's check:
failed = verify_images(fns)
failed.map(Path.unlink) # remove all the failed images
(#0) []
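To see what verify_images is guarding against, here is a much weaker stand-in sketched with only the standard library: a hypothetical `is_probably_valid_image` helper that just checks a file's magic bytes (verify_images goes further and actually decodes every image):

```python
from pathlib import Path

# Hypothetical helper, NOT fastai's verify_images: it only inspects the
# file header, so truncated-but-well-headed files would still slip through.
MAGIC = (
    b'\xff\xd8\xff',        # JPEG
    b'\x89PNG\r\n\x1a\n',   # PNG
    b'GIF87a', b'GIF89a',   # GIF
)

def is_probably_valid_image(path: Path) -> bool:
    try:
        head = path.read_bytes()[:8]
    except OSError:
        return False
    return any(head.startswith(sig) for sig in MAGIC)

# A common failure mode: the "image" is really an HTML error page
bad = Path('corrupt.jpg')
bad.write_bytes(b'<html>404 not found</html>')
print(is_probably_valid_image(bad))  # False
bad.unlink()
```

This is only an illustration of why the check is needed; in practice, stick with verify_images, which catches files that fail to decode at all.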
From Data to DataLoaders
bears = DataBlock(
# provide a tuple where we specify what types we want for the independent and dependent variables:
blocks=(ImageBlock, CategoryBlock), # What kinds of data we are working with
get_items=get_image_files, # How to get the list of items
splitter=RandomSplitter(valid_pct=0.2, seed=42), # How to create the validation set
get_y=parent_label, # How to label these items, gets the name of the folder a file is in
item_tfms=Resize(128)) # Transformation: picture resize
dls = bears.dataloaders(path) # the path where the images can be found

dls.train.show_batch(max_n=4, nrows=1) # check the training set
dls.valid.show_batch(max_n=4, nrows=1) # check validation set
dls.show_batch(max_n=4, nrows=1) # check data
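To make the splitter concrete, here is a pure-Python sketch (not fastai's actual implementation) of what RandomSplitter(valid_pct=0.2, seed=42) does conceptually: shuffle the item indices with a fixed seed, then hold out the first 20% as the validation set:

```python
import random

def random_splitter(n_items, valid_pct=0.2, seed=42):
    """Conceptual sketch of RandomSplitter: returns (train_idxs, valid_idxs)."""
    rng = random.Random(seed)       # fixed seed -> the same split on every run
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    cut = int(n_items * valid_pct)  # size of the validation set
    return idxs[cut:], idxs[:cut]

train_idxs, valid_idxs = random_splitter(100)
print(len(train_idxs), len(valid_idxs))  # 80 20
```

The fixed seed matters: it guarantees that the validation set stays the same across runs, so the model never gets to train on images it will later be evaluated on.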
Different types of image transformation
Resize
1. Resize default
By default, Resize crops the images to fit a square of the requested size, using the full width or height. This can result in losing important details.
2. Pad the images with zeros (black)
bears = bears.new(item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros'))
3. Squish/stretch the images:
bears = bears.new(item_tfms=Resize(128, ResizeMethod.Squish))
Instead, what we normally do in practice is to randomly select part of the image, and crop to just that part. On each epoch (which is one complete pass through all of our images in the dataset) we randomly select a different part of each image. This means that our model can learn to focus on, and recognize, different features in our images. It also reflects how images work in the real world: different photos of the same thing may be framed in slightly different ways.
RandomResizedCrop
item_tfms=RandomResizedCrop(128, min_scale=0.3)
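Roughly, on each epoch RandomResizedCrop picks a random crop box covering at least min_scale of the image area, then resizes it to the target size. A simplified pure-Python sketch of the idea (not fastai's implementation, which also randomizes the aspect ratio):

```python
import random

def random_crop_box(w, h, min_scale=0.3, rng=random):
    """Pick a random crop covering at least min_scale of the image area."""
    scale = rng.uniform(min_scale, 1.0)           # fraction of the area to keep
    cw, ch = int(w * scale**0.5), int(h * scale**0.5)
    x = rng.randint(0, w - cw)                    # random top-left corner
    y = rng.randint(0, h - ch)
    return x, y, cw, ch                           # this crop is then resized to 128x128

# A different crop is drawn on every epoch, so the model sees varied framings
for epoch in range(3):
    print(random_crop_box(640, 480))
```

Because the crop changes each epoch, the model effectively sees a slightly different photo every time, which is what makes this a cheap form of augmentation.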
Data augmentation
batch_tfms=aug_transforms(mult=2)
# Default
dls.valid.show_batch(max_n=4, nrows=1)

# Pad
bears_pad = bears.new(item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros'))
dls_pad = bears_pad.dataloaders(path)
dls_pad.valid.show_batch(max_n=4, nrows=1)

# Squish
bears_squish = bears.new(item_tfms=Resize(128, ResizeMethod.Squish))
dls_squish = bears_squish.dataloaders(path)
dls_squish.valid.show_batch(max_n=4, nrows=1)

# RandomResizedCrop
# min_scale: determines how much of the image to select at minimum each time:
bears_random = bears.new(item_tfms=RandomResizedCrop(128, min_scale=0.3))
dls_random = bears_random.dataloaders(path)
dls_random.train.show_batch(max_n=4, nrows=1, unique=True)
# unique=True repeats the same image with different versions of the RandomResizedCrop transform

# Data Augmentation
bears_aug = bears.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2))
dls_aug = bears_aug.dataloaders(path)
dls_aug.train.show_batch(max_n=8, nrows=2, unique=True)
Train the model and use it to clean your data
bears = bears.new(
item_tfms=RandomResizedCrop(224, min_scale=0.5),
batch_tfms=aug_transforms())
dls = bears.dataloaders(path)

learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
Error check
1. Confusion matrix
Let's see whether the model's mistakes are mainly predicting teddies for grizzlies (which would be bad for safety!), confusing grizzlies with black bears, or something else.
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
2. plot_top_losses
It's helpful to see where exactly our errors are occurring, to see whether they're due to a dataset problem (e.g., images that aren't bears at all, or are labeled incorrectly, etc.), or a model problem (perhaps it isn't handling images taken with unusual lighting, or from a different angle, etc.). To do this, we can sort our images by their loss.
The loss is a number that is higher if the model is incorrect (especially if it's also confident of its incorrect answer), or if it's correct, but not confident of its correct answer.
interp.plot_top_losses(5, nrows=1)
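The behavior of the loss can be made concrete with a small cross-entropy sketch (the loss used for classification here): a confident wrong prediction is penalized far more than a hesitant correct one. The probabilities below are illustrative, not taken from the trained model:

```python
import math

def cross_entropy(probs, target_idx):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(probs[target_idx])

# The correct class is index 1 ('grizzly') in all three cases:
confident_right = [0.05, 0.94, 0.01]   # low loss
unsure_right    = [0.30, 0.40, 0.30]   # moderate loss, despite being correct
confident_wrong = [0.94, 0.05, 0.01]   # very high loss

for p in (confident_right, unsure_right, confident_wrong):
    print(round(cross_entropy(p, 1), 3))
```

Sorting by this value is exactly why plot_top_losses surfaces both confidently wrong predictions and correct-but-unsure ones.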
Clean data after model error check
The intuitive approach to doing data cleaning is to do it before you train a model. But as you've seen in this case, a model can actually help you find data issues more quickly and easily. So, we normally prefer to train a quick and simple model first, and then use it to help us with data cleaning.
fastai includes a handy GUI for data cleaning called ImageClassifierCleaner that allows you to choose a category and the training versus validation set and view the highest-loss images (in order), along with menus to allow images to be selected for removal or relabeling:
#hide_output
cleaner = ImageClassifierCleaner(learn)
cleaner
- ImageClassifierCleaner doesn't actually do the deleting or changing of labels for you; it just returns the indices of items to change.
- To delete (unlink) all images selected for deletion, we would run:
for idx in cleaner.delete(): cleaner.fns[idx].unlink()
- To move images for which we've selected a different category, we would run:
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)
Model application
Using the Model for Inference
a model consists of two parts:
- the architecture
- the trained parameters
Prediction outcome:
('grizzly', TensorBase(1), TensorBase([0.0546, 0.9432, 0.0022]))
- the predicted category in the same format you originally provided (in this case that's a string)
- the index of the predicted category
- the probabilities of each category.
- The last two are based on the order of categories in the vocab of the DataLoaders; that is, the stored list of all possible categories. At inference time, you can access the DataLoaders as an attribute of the Learner:
learn_inf.dls.vocab
Outcome: (#3) ['black','grizzly','teddy']
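Decoding the raw outputs is just an argmax over the probabilities, mapped through the vocab. A pure-Python sketch using the numbers from the example prediction above:

```python
vocab = ['black', 'grizzly', 'teddy']  # order matters: same as dls.vocab
probs = [0.0546, 0.9432, 0.0022]       # probabilities from the example prediction

idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
print(vocab[idx], idx, probs[idx])  # grizzly 1 0.9432
```

This is why the vocab order must be identical at training and inference time: the index into the probability tensor only means something relative to that stored list.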
learn.export() # fastai will save a file called "export.pkl"
# check if the file exists
path = Path()
path.ls(file_exts='.pkl')
(#1) [Path('export.pkl')]
Deploy your app within notebook
# create our inference learner from the exported file
learn_inf = load_learner(path/'export.pkl')

# get predictions for one image at a time
learn_inf.predict('images/grizzly.jpg')
('grizzly', TensorBase(1), TensorBase([0.0546, 0.9432, 0.0022]))