XMAS-Project — Part 4: Creating your dataset with the help of Anomaly Detection

Daniel Manzke
Dec 26, 2021


Update 29.12.2021 — Increase Input Size

XMAS time is an amazing time (with the family), but also a time where you can read and, in my case, try things out once everyone settles. Inspired by yoona.ai, a startup that is going to change the way fashion companies design in the future, my XMAS project is focused on CNNs and GANs. (Part 1, Part 2 & Part 3).

I’m trying to generate a dataset based on images I’ve crawled. Each product I’ve crawled has roughly 5 images, and we can only use one of them.

The image we are looking for.

These are example images from an eCommerce store. Only the dress shown in full, without a person, is what we are looking for.

I’ve implemented a Binary Classifier based on EfficientNet & Keras. I wrote about it in Part 3, and you can find example code in this Gist.

When I did it, I didn’t know what to search for. While implementing it, I thought: why is there no way to train a ONE-class classifier? Once you know what to search for, you will find several ways it can be done.

Implement a (Variational) AutoEncoder, which learns how to reconstruct an image, so that you can calculate the reconstruction loss. This way, you can train a network with only positive examples.
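The idea can be sketched in a few lines of Keras. The architecture and sizes below are made up for illustration, not the exact network from my code: train on positive examples only, then use the per-image reconstruction error as an anomaly score.

```python
# Minimal sketch of reconstruction-based anomaly scoring with a Keras
# autoencoder. Layer sizes and the 32x32 input are illustrative only.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

IMG_SIZE = 32

def build_autoencoder():
    inp = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

ae = build_autoencoder()
# Random arrays stand in for the real "dress only" training images.
x_train = np.random.rand(8, IMG_SIZE, IMG_SIZE, 3).astype("float32")
ae.fit(x_train, x_train, epochs=1, verbose=0)  # input == target

recon = ae.predict(x_train, verbose=0)
# Per-image reconstruction error: high error = candidate anomaly.
scores = np.mean((x_train - recon) ** 2, axis=(1, 2, 3))
```

Because the network only ever sees positive examples, anything it cannot reconstruct well gets a high score, which is exactly the one-class behaviour I was looking for.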

To be able to use outlier detection with AEs and VAEs, I started using alibi-detect (Link), which seemed like a good idea. Sadly, most of the examples you can find are based on MNIST-sized toy images of around 32x32 pixels.

I tried to change the inputs to 128x128, but alibi-detect expects numpy arrays in its fit(..) function, so I had to load all images into memory, and for the first time I saw my Mac kill a process (after eating up storage with a swap file and exhausting the memory).
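A back-of-the-envelope calculation shows why loading everything at once hurts (the image count is a made-up example, not my actual dataset size):

```python
# Rough memory footprint of holding all images in one float32 numpy
# array, which is what passing everything to fit(..) at once implies.
n_images = 50_000                       # hypothetical dataset size
bytes_per_image = 128 * 128 * 3 * 4     # H * W * channels * float32
total_gb = n_images * bytes_per_image / 1024**3
print(round(total_gb, 1))               # ~9.2 GB before any training buffers
```

And that is just the raw array; gradients, activations and any intermediate copies come on top, so swapping and killed processes are not surprising.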

The 32x32 input doesn’t have enough information to differentiate between a close shot with a person + shirt and a shirt only.

Update: In the meantime, I was able to increase it to 64 and 128 pixels by adjusting the encoder and decoder. The mistake was in the reshape layer, which I didn’t adjust when I tried it with more Conv layers.
The memory problem is still the biggest issue: not being able to load the images iteratively.
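For reference, a sketch of how the encoder and decoder have to stay in sync when the input grows. The layer sizes are illustrative, but the Reshape is the kind of layer that is easy to forget when changing the input size:

```python
# With three stride-2 convs, a 128x128 input is downsampled to a 16x16
# grid, so the decoder's Dense/Reshape pair must produce (16, 16, C).
# Forgetting to adjust this Reshape breaks the model when the input
# size changes. Filter counts and latent size are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

IMG, LATENT = 128, 256
grid = IMG // 2**3  # three stride-2 conv layers -> 16

encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(IMG, IMG, 3)),
    layers.Conv2D(32, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2D(64, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2D(128, 4, strides=2, padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(LATENT),
])

decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(LATENT,)),
    layers.Dense(grid * grid * 128, activation="relu"),
    layers.Reshape((grid, grid, 128)),  # must track the encoder's grid size
    layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(3, 4, strides=2, padding="same", activation="sigmoid"),
])
```

Networks like these can then be handed to alibi-detect as the encoder/decoder nets of its AE-based outlier detector.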

Disclaimer: I’m still getting into Python, the Keras APIs, numpy etc., so it could also just be me doing it wrong.

Validation Set (left) & Anomaly Set (right)

To avoid flagging untrained / new images as anomalies, I would have to increase the threshold to at least 0.015 or 0.02. When you look at the Anomaly Set, this would let too many anomalies pass as valid.
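To make the threshold trade-off concrete, here is a tiny sketch with made-up scores; the cut-off values mirror the 0.015/0.02 range discussed above:

```python
# The threshold turns per-image reconstruction scores into anomaly
# decisions. Scores below are invented for illustration: raising the
# threshold accepts more valid images but also lets anomalies through.
import numpy as np

scores = np.array([0.004, 0.009, 0.018, 0.031])  # per-image reconstruction error
threshold = 0.015
is_anomaly = scores > threshold
print(is_anomaly)  # [False False  True  True]
```

Raising the threshold to 0.02 would additionally accept the 0.018 image, which is exactly the "too many matches" problem on the Anomaly Set.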

You can find the AutoEncoder example here and the Variational AutoEncoder with alibi-detect here.

Alternatives are scikit-learn’s IsolationForest, OneClassSVM or GaussianMixture, combined with a pre-trained CNN whose feature-extraction layer is used to leverage what has already been learned.
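A minimal sketch of that route, with random vectors standing in for the embeddings a pre-trained CNN (e.g. EfficientNet with global pooling, not shown here) would produce:

```python
# One-class detection on CNN features: fit IsolationForest on embeddings
# of positive examples only, then score new images. The random vectors
# below are stand-ins for real feature-extractor outputs.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
train_feats = rng.normal(0, 1, size=(200, 64))  # "dress only" embeddings
test_feats = rng.normal(5, 1, size=(5, 64))     # far-off points = anomalies

clf = IsolationForest(random_state=0).fit(train_feats)
pred = clf.predict(test_feats)  # -1 = anomaly, 1 = inlier
```

The appeal of this approach is that the CNN does the hard perceptual work and the one-class model only has to separate points in feature space, so no reconstruction network has to be trained at all.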

Some Jupyter notebooks with different approaches. (Link)

As AEs and VAEs already touch the GAN space, I’ll give them another try. A good read can be found here.

Stay tuned to see whether I figure it out or go back to training binary CNNs, which have the highest accuracy right now.
