“Thanks Radek” 7th place solution to the HWI 2019 competition: Metric Learning Story

Anastasiia Mishchuk
Mar 1, 2019 · 4 min read


We recently finished 7th in the Humpback Whale Identification competition: https://www.kaggle.com/c/humpback-whale-identification.

The goal is to identify humpback whales by their tails (flukes): each whale is either one of the 5004 known individuals or a “new_whale” that has not been seen before.

The dataset is very challenging because it comes from real life: some whales have just one image, some have many (in other words, the dataset is heavily unbalanced), and some are “new whales” that are unknown. Images come at any scale and from any viewpoint, and span years, so in old photos the whales were young and beautiful, while in newer ones they have grown up and changed (though stayed beautiful :D).

Moreover, the dataset contains all kinds of labeling errors: the same whale under different ids, different whales under the same id, and the same whale labeled both “new_whale” and another id at the same time.

Sounds like fun, huh?)

Igor invited me to participate in this challenge, and the journey began :) We created a team, and later we teamed up with Dmytro.

Our part of the work was focused on metric learning.

The approach was simple and straightforward at the beginning; here are the major parts:

  1. We sampled images uniformly from each class: two images per id, an anchor and a positive, on every epoch. We started with LAP (linear assignment problem) pairing, but moved to a classical hard negative mining strategy (collecting all features during training and selecting the hardest negative for each anchor after every epoch) to speed up the process; see the triplet loss sketch after this list.
  2. Validation set: at the beginning we had a really small one, built from ids with more than 3 images: one image went to the query set, another to the gallery, plus new_whales as distractors. For training we used one-shots (an augmented copy of the positive image), and evaluation followed a query/gallery protocol. Later we moved to the data from the playground competition and evaluated our models on it.
  3. Igor labeled data and created a detector, whose bounding boxes we used for most of the competition.
  4. Once we had good enough models, we created and verified a list of bad images and removed them from training.
  5. We tried various image sizes, training on RGB images: 256x256, 384x384, 448x448, 360x720.
  6. We used augmentations: random erasing, affine transformations (scale, translation, shear), brightness/contrast.
  7. The backbone architectures we tried: resnet34, resnet50, resnet101, densenet121, densenet162, seresnext50, each followed by a GeM pooling layer + L2 normalization + multiplier (see the embedding head sketch after this list).
  8. Loss: hard triplet loss (a sketch follows the list).
  9. We also used a cyclic learning rate and combined features (max + average pooling) from 3–4 different checkpoints. It helped, at the beginning…)
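
To make point 7 concrete, here is a minimal sketch of such an embedding head in PyTorch: a backbone, GeM pooling, L2 normalization and a fixed multiplier. The backbone choice, the initial p and the multiplier value below are illustrative assumptions, not the exact values we used.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


class GeM(nn.Module):
    """Generalized-mean pooling: learnable exponent p interpolates between average (p=1) and max (p -> inf) pooling."""
    def __init__(self, p=3.0, eps=1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.ones(1) * p)
        self.eps = eps

    def forward(self, x):
        # x: (B, C, H, W) feature map from the backbone
        x = x.clamp(min=self.eps).pow(self.p)
        x = F.adaptive_avg_pool2d(x, 1).pow(1.0 / self.p)
        return x.flatten(1)  # (B, C)


class EmbeddingNet(nn.Module):
    def __init__(self, multiplier=16.0):  # multiplier value is an assumption
        super().__init__()
        backbone = torchvision.models.resnet50(pretrained=True)  # backbone chosen for illustration
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool + fc
        self.pool = GeM()
        self.multiplier = multiplier

    def forward(self, x):
        x = self.pool(self.features(x))
        # L2-normalize the embedding, then scale it by a fixed multiplier
        return self.multiplier * F.normalize(x, p=2, dim=1)
```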
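
And a sketch of the hard triplet loss from points 1 and 8. For brevity it mines the hardest positive and negative within the batch; in our pipeline the hardest negatives were selected from features collected over the whole epoch. The margin value is an assumption.

```python
import torch
import torch.nn.functional as F


def hard_triplet_loss(emb, labels, margin=0.3):
    # emb: (B, D) L2-normalized embeddings, labels: (B,) whale ids
    dist = torch.cdist(emb, emb)                        # pairwise Euclidean distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # (B, B) mask of same-id pairs
    eye = torch.eye(len(labels), dtype=torch.bool, device=emb.device)

    # hardest (farthest) positive and hardest (closest) negative per anchor
    pos_dist = dist.masked_fill(~same | eye, float('-inf')).max(dim=1).values
    neg_dist = dist.masked_fill(same, float('inf')).min(dim=1).values

    return F.relu(pos_dist - neg_dist + margin).mean()
```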

Before we teamed up with Dmytro, we had an ensemble of our 10–12 best models. Then we moved to a new pipeline, where our metric learning models served as initialization for classification models, and that actually worked really well for us.

Read more about this part of the journey and our final solution here: https://medium.com/@ducha.aiki/thanks-radek-7th-place-solution-to-hwi-2019-competition-738624e4c885

Also, we tried other things…:

  • Segmentation model. We labeled data and trained a U-Net, extracted segmented images with a small dilation of the segmentation masks, and tried training on those segmented images, which did not improve performance for us (see the sketch after this list).
  • Keypoint detector. We additionally labeled data and trained a keypoint detector for image alignment. We did the alignment. Finally. It helped, but not much.
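
For the segmentation experiment above, here is a rough sketch of how a slightly dilated mask can be applied before training on segmented images, assuming OpenCV; the kernel size and number of iterations are illustrative.

```python
import cv2
import numpy as np


def apply_dilated_mask(image, mask, kernel_size=15, iterations=1):
    # image: HxWx3 uint8, mask: HxW uint8 with values in {0, 255} from the segmentation model
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    dilated = cv2.dilate(mask, kernel, iterations=iterations)
    # keep the whale plus a small border, zero out the rest of the background
    return cv2.bitwise_and(image, image, mask=dilated)
```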

And some more…)

  • We labeled data and created a half-whale classifier. Then we replaced the images that showed only half of a whale with aligned, cropped and mirrored versions, to get a “whole” image of the whale (a sketch follows this list).
  • Class balancing of one-shots: getting the top-1 frequencies of classes and replacing the “new_whale” class with one-shot classes that are not present at top-1.
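
A possible reading of the half-whale step, as a sketch: mirror the visible half of the fluke and stitch it to the original to synthesize a “whole” image. Which side is visible is assumed to come from the half-whale classifier, and the alignment and cropping details are omitted.

```python
import numpy as np


def make_whole(half_image, visible_side='left'):
    # half_image: HxWx3 array already aligned and cropped to the visible half of the fluke
    mirrored = half_image[:, ::-1]  # horizontal flip
    if visible_side == 'left':
        return np.concatenate([half_image, mirrored], axis=1)
    return np.concatenate([mirrored, half_image], axis=1)
```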

Quite late we found out that there were many more duplicate classes (the same whale under different ids) than we had thought, which affected our training with classical hard mining, so we cleaned the dataset (merged duplicates below a selected distance threshold) and re-trained two days before the end of the competition.
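
A sketch of that cleanup step: merge ids whose embeddings are closer than a chosen distance threshold, here with a small union-find over per-id mean embeddings. The threshold value and the use of per-id means are assumptions.

```python
import numpy as np


def merge_classes(class_embeddings, threshold=0.5):
    # class_embeddings: dict {whale_id: mean L2-normalized embedding of that id's images}
    ids = list(class_embeddings)
    feats = np.stack([class_embeddings[i] for i in ids])
    parent = {i: i for i in ids}  # union-find over whale ids

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    dists = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            if dists[a, b] < threshold:
                parent[find(ids[a])] = find(ids[b])  # merge the two ids into one class

    return {i: find(i) for i in ids}  # map each original id to its merged representative
```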

And we tried some more things… that did not work:

  • PCA+Whitening
  • DBA
  • Pre-trained from classifier VGG16
  • Contrastive loss + hard mining under our basic setup
  • Layer fusion (worked for resnet34, did not work later for other architectures)
  • Probability classes balancing
  • Re-ranking
  • Pseudo labels
  • TTA worked only for flipped images, adding a really small gain, and did not work for anything else (a tiny sketch follows).
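
For completeness, a tiny sketch of the only TTA that gave a (small) gain: combining the embeddings of the original and horizontally flipped image.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def embed_with_flip_tta(model, batch):
    # batch: (B, 3, H, W) image tensor
    emb = model(batch)
    emb_flip = model(torch.flip(batch, dims=[3]))  # flip along the width axis
    return F.normalize(emb + emb_flip, p=2, dim=1)  # re-normalize the combined embedding
```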

So, that is the story)

I want to thank my wonderful teammates, the organizers, Family, the Universe and, of course, Radek! :)
