How we handled
a Computer Vision Hackathon

CBTW
L’Actualité Tech — Blog CBTW
15 min read · Dec 6, 2017

The goal of this post is to share some thoughts we gathered on how to handle a Computer Vision Hackathon, based on our experience of the Pierre Fabre competition hosted in Deauville on November 24th & 25th, 2017.

Team Linkvalue at Deauville during the Hackathon (© BeMyApp)

This post is not a tutorial on ConvNets nor on Keras. We are not claiming that what we tried or what we describe here is the only path to follow; we simply want to share. Just share. That’s all! And we’ll be glad to receive your feedback.

Here are the different (quite independent) topics we deal with in this post:

  • Get your environment ready
  • Code, deploy, monitor
  • About Keras 2.0 & GPUs
  • The “ConvNets Battle Plan”
  • Images ROI & Augmentation
  • Crisis Management
  • Last piece of advice: have fun!

Get your environment ready

Microsoft was one of the partners of the hackathon we competed in, so we decided to host our environment on Microsoft Azure. It’s a good idea to have someone in your team who has at least a little experience with cloud environments, virtual machines, GPUs and disks. Whether you go with Amazon Web Services, Google Cloud Platform, Microsoft Azure or another provider, you will be able to launch VMs with GPUs optimized for computation (Nvidia Tesla K80, P100…)*.

On our side, here is what we used on Azure:

  • 1 NC-6 (1 K80 GPU, 6 vCPUs, 55 GB RAM)
  • 2 NC-12 (2 K80 GPUs, 12 vCPUs, 110 GB RAM)
  • 1 File Storage disk of 500 GB

The first thing to do is to put all the scripts you use to prepare your VMs in your repository. That enabled us to create new VMs and make them ready to use easily. For example, if you want to attach and mount a File Storage disk on a Linux VM on Microsoft Azure, you will need to run the script described here:

Source: https://docs.microsoft.com/fr-fr/azure/storage/files/storage-how-to-use-files-linux

I don’t think it is a good idea, for a short challenge like this one, to try to install everything from scratch: all the packages, the drivers for your GPU, your environment, etc. Cloud platforms like Azure are sure to offer images with your favorite framework already installed. On AWS, look for AMIs in the marketplace (e.g. this one: http://vict0rsch.github.io/2017/02/22/p2-aws-tensorflow-1/). On Azure, look for virtual machine images such as the “Deep Learning Virtual Machine”, where you will find TensorFlow, Keras 2, Caffe2… This is the one we used.

Two other points we wanted to share with you on that topic :

  • Give your virtual machines (funny) names. First, it will keep you from mixing up what is running on which VM, because using IPs or names like VM1 and VM2 is really confusing. Secondly, you’ll see that tense moments during the challenge are easier to handle when you tell your colleague “CholletCacao is dead” rather than “vm-gpu-1 is down”!
  • Be aware of the cost of your VMs. We all know that GPUs are not cheap, and when launching two, three or even more instances, your bill can quickly explode. To keep this under control, try to start your VMs only during computation time. That said, if the challenge is short enough, as the Pierre Fabre challenge was, you won’t have time to think: all your VMs will have to run non-stop on different tasks.

Deploy properly & quickly

You have to run the training of different models (in parallel or not) without losing time, because it takes time for a ResNet50 or an InceptionV3 to converge. That’s why we tried to have deployment scripts ready to update the code on the servers cleanly. We also tried to structure our repository so that we could launch one model or another just by specifying arguments pointing to configuration files where the model details are defined.

Actually, that was the theory. The first point did not go as expected. The time pressure sadly pushed us to handle this deployment strategy really badly. First, we never set up any devops process to automatically deploy a specific branch on the VMs. We were just connecting to the machines via ssh, pulling code from the remote repository and launching our application from there. Secondly, we messed things up a bit by changing branches directly on the instances or making quick fixes with good old vim.

In our opinion, if we had to start this competition over, we would ban editing code on the VMs (we know it sounds obvious now, but we can assure you that obviousness looks very different when the competition ends in 4 hours and you haven’t slept for the last 20). And we would prepare a simple script fetching and pulling on the machines and then starting a training inside a screen session (useful for monitoring afterwards).

Regarding the second point, we prepared our code to be as modular as possible, in order to easily launch one VGG with these pre-trained weights on BigHinton and, a second later, launch a prediction with Inception on CholletCacao.

For that, we designed a config file containing the details we didn’t want to repeat for each launch, and a main.py handling the arguments to pass to our different functions.

Extract of our config file to easily deploy trainings for different convnets

And then, for instance, to run a training with the VGG architecture initialized with weights dumped in a file, we just have to execute the command:

python main.py train vgg /sharedfiles/vgg_pretrained.h5
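
The original config extract is not embedded in this version of the post, so here is a minimal sketch of what such a pair of files could look like. The keys, values and the train/predict helpers are purely illustrative, not our actual code:

# config.py : one entry per convnet we may want to launch (hypothetical values)
MODELS = {
    "vgg": {"input_shape": (224, 224, 3), "batch_size": 32, "learning_rate": 1e-4},
    "resnet": {"input_shape": (224, 224, 3), "batch_size": 32, "learning_rate": 1e-4},
    "inception": {"input_shape": (299, 299, 3), "batch_size": 32, "learning_rate": 1e-4},
}

# main.py : dispatches an action and a model name coming from the command line
import sys
from config import MODELS

def main():
    action, model_name = sys.argv[1], sys.argv[2]
    weights_path = sys.argv[3] if len(sys.argv) > 3 else None
    params = MODELS[model_name]
    if action == "train":
        train(model_name, params, weights_path)      # placeholder for our training routine
    elif action == "predict":
        predict(model_name, params, weights_path)    # placeholder for our prediction routine

if __name__ == "__main__":
    main()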

Monitor your models

To manage your time and be able to take decisions such as stopping the training of one model, you need to monitor your models. And if you are using TensorFlow (either natively or as the Keras backend), you really should use TensorBoard to see what is going on during your training phase.

How do you use it with Keras? Just add a callback that will write logs to a specific directory, as you can see here:

Example of a train function that launches the method fit with TensorBoard in callbacks
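
The gist itself is not embedded in this version of the post; a minimal sketch of such a train function with Keras 2 could look like this (model, x_train and y_train are assumed to be built elsewhere):

from keras.callbacks import TensorBoard

def train(model, x_train, y_train, epochs=10, batch_size=32):
    # Logs are written to the directory we will point tensorboard at
    tensorboard = TensorBoard(log_dir='/tmp/tensorflow')
    model.fit(x_train, y_train,
              epochs=epochs,
              batch_size=batch_size,
              validation_split=0.1,
              callbacks=[tensorboard])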

Then, inside a screen session for example, you execute the tensorboard command, specifying the log directory you set just above.

tensorboard --logdir /tmp/tensorflow

If your virtual machine has the correct inbound/outbound port rules and a running server such as nginx or apache, you will normally be able to see:

screenshot of tensorboard

The whole code is available on our GitHub repo and can be an interesting starting point if you are entering the online competition or if you want to play with the ISIC dataset.

Know your framework

As we explained previously, we deployed the Deep Learning Virtual Machine (DLVM) on our different NC-6 and NC-12 instances. In our team, we had some experience with TensorFlow 1.0, Keras 1.6 and Caffe 1, so the choice of framework was mostly driven by the requirements of our battle plan. As you’ll see in its details, we wanted to use well-known convolutional neural nets such as ResNet or Inception and to initialize them with weights computed on ImageNet. Fancy custom neural networks were not part of our plan, which is why we decided to use Keras on top of a TensorFlow backend.

You can see that on the DLVM, Keras is installed in version 2. That did not bother us at the beginning, as we were convinced that our experience with Keras 1.6 would be solid enough to master version 2. That is a mistake we paid for dearly later during the challenge.

The data provider generator of Keras 2.0

What we first implemented for the data provider of our neural nets was based on the ImageDataGenerator of Keras, whose role is to walk through a directory structured with one subdirectory per class (here malignant and benign), read images by batch, augment them with classical image transformations (rotation, flip, zoom, etc.) and finally feed the network with those batches.

The Keras data generator we were used to using
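
For reference, a minimal sketch of that kind of pipeline with Keras 2, where the directory layout, target size and model are assumptions:

from keras.preprocessing.image import ImageDataGenerator

# Augment images on the fly while reading them from one subdirectory per class
datagen = ImageDataGenerator(rescale=1. / 255,
                             rotation_range=30,
                             zoom_range=0.2,
                             horizontal_flip=True,
                             vertical_flip=True)

train_generator = datagen.flow_from_directory('/sharedfiles/train',   # contains benign/ and malignant/
                                              target_size=(224, 224),
                                              batch_size=32,
                                              class_mode='binary')

model.fit_generator(train_generator,
                    steps_per_epoch=train_generator.samples // 32,
                    epochs=10)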

When we launched our first trainings, we were surprised by the time taken to complete the different epochs of our models. The issue was there: the GPUs were just loading data into memory but not doing parallel computations. A look at the output of nvidia-smi gave us the clue:

screenshot of the command `nvidia-smi`

We knew the issue was not coming from the installation of the libraries or the drivers, since Keras 2.0 was correctly using the GPUs when we ran the code of the CapsNet repo for Keras.

We first tried to implement a custom data generator, but the issue was still there. So we chose to load as many images as possible into memory before feeding them to the network by batch. You can see in the code that we used the multiprocessing lib to avoid losing too much time during the loading step of our process.

Extract of our custom data provider file
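
The extract is not embedded in this version of the post; a minimal sketch of the idea, with hypothetical paths and target size, could be:

import glob
import numpy as np
from multiprocessing import Pool
from PIL import Image

def load_image(path, size=(224, 224)):
    # Read and resize one image, return it as a float array in [0, 1]
    img = Image.open(path).convert('RGB').resize(size)
    return np.asarray(img, dtype=np.float32) / 255.

def load_directory(directory, processes=6):
    paths = sorted(glob.glob(directory + '/*.jpg'))
    # Load the images in parallel instead of one by one
    with Pool(processes=processes) as pool:
        images = pool.map(load_image, paths)
    return np.stack(images)

x_benign = load_directory('/sharedfiles/train/benign')
x_malignant = load_directory('/sharedfiles/train/malignant')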

The images we loaded into memory were already augmented by our image processor method (see below), so we couldn’t load the whole 50k images at once. That’s why we had to train a network with only some kinds of transformed images, dump the weights, flush the memory, then reload other kinds of transformed images and train the network initialized with the dumped weights, to continue the training where we had stopped.
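
Schematically, that loop looked like the following sketch, where build_model and load_chunk stand in for our own helpers (they are placeholders, not actual code from our repo):

from keras import backend as K

weights_path = '/sharedfiles/resnet_checkpoint.h5'

for chunk in ('rotated', 'flipped', 'zoomed'):      # one subset of augmented images at a time
    model = build_model()                           # placeholder: rebuilds the architecture
    try:
        model.load_weights(weights_path)            # continue from the previous chunk
    except (IOError, OSError):
        pass                                        # first chunk: keep the initial weights
    x_train, y_train = load_chunk(chunk)            # placeholder: loads one augmented subset
    model.fit(x_train, y_train, epochs=5, batch_size=32)
    model.save_weights(weights_path)                # dump the weights before freeing memory
    del model, x_train, y_train
    K.clear_session()                               # flush the TensorFlow session and memory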

Handling multiple GPUs with Keras 2.0 and TensorFlow

Another issue we should have dealt with earlier: handling multiple GPUs. We figured out during the competition that Keras 2.0 does not use the two GPUs of our virtual machines by itself. You have to prepare a trick to force both GPU 1 and GPU 2 to execute the computation tasks.

As you can see, you need to copy the network onto each GPU, slice the batches to dispatch them to the different GPUs and, at the end, merge the outputs on the CPU. Not so obvious, is it? But we managed to use this enriched version of our code for the last trainings and it saved us some precious minutes.
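
The exact snippet is not embedded here; a sketch of this kind of trick, close to the make_parallel approach that was circulating at the time (it assumes a TensorFlow backend, a single-input model and a batch size divisible by the number of GPUs):

import tensorflow as tf
from keras.layers import Lambda, concatenate
from keras.models import Model

def make_parallel(model, gpu_count=2):
    outputs = []
    for i in range(gpu_count):
        with tf.device('/gpu:%d' % i):
            # Each GPU receives its own slice of the incoming batch
            def get_slice(x, idx=i, parts=gpu_count):
                size = tf.shape(x)[0] // parts
                return x[idx * size:(idx + 1) * size]
            sliced = Lambda(get_slice)(model.inputs[0])
            outputs.append(model(sliced))
    with tf.device('/cpu:0'):
        # Merge the per-GPU outputs back into a single batch on the CPU
        merged = concatenate(outputs, axis=0)
    return Model(inputs=model.inputs, outputs=merged)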

Prepare your battle plan

It’s not a good idea to start a hackathon without knowing in advance which models you are going to try, which networks you should use, etc. First because you will lose time deciding, and secondly because you do have time beforehand to read what others have tried on similar problems.

On our side, this is the architecture we reached in the end:

It is not far from what we had planned: use two or three different convnets, then apply an ensembling approach with an xgboost classifier or just a weighted average.

Two sources inspired this plan: the video and paper by Brett Kuprel and the Google developers about skin cancer image classification, and the different methods explained by the winners of Kaggle image competitions on the Kaggle blog:

  • TensorFlow at the Dev Summit 2017 (video)
  • Dogs vs. Cats Redux Playground Competition, Winner’s Interview: Bojan Tunguz (blog.kaggle.com)

We also decided to add two fully connected layers on top of our networks, in order to increase the level of abstraction without un-freezing the previous layers and losing the knowledge they gained from the ImageNet training. That enabled us to retrain only the last fully connected layers of each network and still get models suited to this very specific task of classifying melanoma.

As the total number of images in the training set was not that big, and the images were very specific, we did not want our models to overfit on them. That’s why we added some regularization to the backpropagation and a dropout layer between our two custom fully connected layers.

Extract where we add 2 dense layers and 1 dropout layer
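
The extract is not embedded in this version of the post; a minimal sketch of that kind of head on top of a pre-trained ResNet50, where the layer sizes, dropout rate and regularization factor are assumptions:

from keras.applications import ResNet50
from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Model
from keras.regularizers import l2

base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False                      # keep the ImageNet knowledge frozen

x = GlobalAveragePooling2D()(base.output)
x = Dense(512, activation='relu', kernel_regularizer=l2(1e-4))(x)
x = Dropout(0.5)(x)                              # dropout between the two custom layers
predictions = Dense(2, activation='softmax')(x)

model = Model(inputs=base.input, outputs=predictions)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])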

Since we had to handle the different GPU crises, we didn’t manage to train as many models as expected. We preferred to focus on ResNet50 and InceptionV3, trying to optimize the different parameters we had (number of custom layers, number of layers to retrain, shape of our layers…).

Finally, what worked best for the ensembling method was to apply a weighted average to the outputs of the different softmax layers (with a weight of 0.7 on ResNet). We also tried to feed the outputs of the fully connected layers as inputs to an xgboost forest, but ended up with a lower precision on our validation set. It may be because we did not have time to optimize the hyperparameters of the xgboost and just used the default ones. In hindsight, I guess it would not have changed the precision of our model that much.
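
The weighted average itself is a one-liner; as a sketch, assuming the two trained models (resnet_model, inception_model) and a validation set x_val are available:

import numpy as np

resnet_probs = resnet_model.predict(x_val)        # softmax outputs, shape (n_images, 2)
inception_probs = inception_model.predict(x_val)

# Weighted average of the softmax outputs, with a weight of 0.7 on ResNet
ensemble_probs = 0.7 * resnet_probs + 0.3 * inception_probs
predicted_classes = np.argmax(ensemble_probs, axis=1)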

We could have had a more precise model either by adding more networks to our ensemble or by spending more time on the visual pre-processing of the images.

Process images smartly

All deep learning architectures and machine learning algorithms are meaningless if your data is crap. As discussed before, convnet architectures are by far the most efficient and the most used for image processing and classification. After a quick look at our image dataset, we quickly saw that images weren’t equal in terms of position (e.g. ROIs of unequal sizes), shape (e.g. the object sitting inside a circle with black corners due to the device used), occlusions (e.g. markers, pubic hairs…) or even colors (e.g. blue light due to the flash). Another massive constraint was the diversity of image sizes and the possible issues of a uniform resizing.

We decided to preprocess our images according to these conditions:

  • High contrast (smallest value is 0 and highest 1)
  • Color channels (red, green, blue) normalized (mean=0 and std=1 for each channel)
  • Extract only the interesting region by removing other factors (skin, markers,…)
Dataset image (left) and the same image after contrast increase and color channel normalization (right)

Increasing the contrast and normalizing the color channels can easily be done with matrix and value manipulations. Due to time constraints, we decided to use a simple but effective method to extract our ROI:

  1. Resize the images and crop them to keep the aspect ratio and obtain a fixed 500x500 image size
  2. Create an image mask separating the biggest interesting region from the rest, by simply blurring the image to remove small details and applying a threshold at 0.5
  3. Extract a box around the mask and use it as the cropping information
ROI extraction steps
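
As an illustration only (our actual code differs), here is a rough sketch of these three steps with numpy and Pillow, where the blur radius, the threshold direction and the “lesion darker than skin” assumption are ours:

import numpy as np
from PIL import Image, ImageFilter, ImageOps

def extract_roi(path, size=500, blur_radius=15, threshold=0.5):
    # 1. Resize and crop to a fixed 500x500 square while keeping the aspect ratio
    img = ImageOps.fit(Image.open(path).convert('RGB'), (size, size))

    # 2. Blur a grayscale copy to remove small details, then threshold at 0.5
    blurred = np.asarray(img.convert('L').filter(ImageFilter.GaussianBlur(blur_radius)),
                         dtype=np.float32) / 255.
    mask = blurred < threshold                  # assumes the lesion is darker than the skin

    # 3. Crop the box around the mask
    ys, xs = np.where(mask)
    if len(ys) == 0:                            # nothing detected: keep the whole image
        return img
    return img.crop((int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())))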

We ended up with some pretty cool ROIs in the end, with images being square but possibly of different sizes. We accepted small deformations in the Keras input pipeline, since shape wasn’t totally discriminant for malignancy classification.

ROI extraction examples (left: before; right: after ROI extraction).
As mentioned, we were positively surprised by some resulting images, but others were quite disappointing.

A simple solution can be lifesaving in a critical moment; however, the results weren’t overwhelming compared to what we expected. The winners of Pierre Fabre’s hackathon did some pretty smart image processing by carefully removing useless or harmful images, and decided to use an already trained deep learning architecture to define and extract the ROI.

We strongly believe that the careful processing and selection of images from the dataset had a major role in the competition’s outcome.

On the other hand, image preprocessing was only the first step to increase image quality: a rough selection would have reduced our dataset and hence our models’ ability to generalize. Classic data augmentation methods are commonly used with CNNs, and we decided to dig deeper into how they had been applied specifically to skin cancer classification. We chose to follow a procedure similar to the one in https://arxiv.org/pdf/1702.07025.pdf, but only for the geometric augmentation, instead of reproducing it in Keras.

Geometrical augmentation with mirroring completion to avoid black borders
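
A minimal sketch of this kind of geometric augmentation, using scipy’s reflect mode so that rotations are completed with mirrored content instead of black borders (the set of angles is an assumption):

import numpy as np
from scipy.ndimage import rotate

def augment(image):
    # image: numpy array of shape (height, width, 3)
    augmented = [image, np.fliplr(image), np.flipud(image)]
    for angle in (45, 90, 180, 270):
        # mode='reflect' mirrors the image into the corners left empty by the rotation
        augmented.append(rotate(image, angle, reshape=False, mode='reflect'))
    return augmented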

A major advantage was the reduction of training time, since images only needed to be read and resized. Unfortunately, as previously mentioned, our reimplementation of the data provider to properly use our GPUs led us to fully load the images into memory. We therefore ran into memory issues and were forced to use a sample of our augmented dataset and to train our models in several passes.

Crisis Management

Like many teams competing in Pierre Fabre’s hackathon, we came with only a few features to implement and thought we’d have plenty of time to let the models train. Unfortunately, as always, unexpected issues and behaviors arose, and with them a key element we wish to share: crisis management.

Panic, frustration and time loss start to appear with unforeseen events, and some key questions need to be asked before spending too much time on them:

  • Is the feature causing trouble mandatory for our problem?
  • Do we see any solution that can fix it rapidly?
  • What are our other objectives?
  • Is there another option, even if it is simpler and worse in terms of results?
  • Can it be put on hold for now?

To avoid ending up without any proper solution, and to cope with the growing tiredness, we decided to follow this checklist:

  • Put on hold everything that wasn’t mandatory,
  • Define a time limit for fixing the issue,
  • At the same time, look for a possible backup plan.

We faced one major issue that could not be avoided, and another, minor one whose resolution could bring us salvation:

  • The GPUs were not used during training and testing.
  • It was impossible to properly use multiple GPUs and to save our trained model at the end.

Our backup plan for the first issue was a complete reimplementation with Caffe, which one of our teammates knew well. Proceeding that way would have been a nightmare, and that prospect gave us a surge of motivation to fix the problem.

As for the multiple GPUs, the issue was put on hold several times and, with a fresh mind, a solution was eventually found.

Those issues took time, great effort and multiple tests before everything worked. This loss impacted our overall plan to try more architectures and to add features that could have improved our score. However, unlike many teams, we ended up with a solution that was among the best.

Being able to step back when needed, to stay organised and to take decisions is a major advantage in a time-limited event like a hackathon. We realized that having a final working model was more important than having some half-assed killer features but nothing working in the end.

Have fun and share

Hackathons are exhausting and stressful. So if you come to a hackathon with the sole objective to code, code, code, you won’t fully enjoy this kind of event.

Hackathons gather lots of smart and inspirational participants, ready to stay up all night to tackle a problem. They also gather managers from companies who come to share data and experience with you to make their companies go further. You cannot come and ignore that. Don’t be shy about sharing with the people around you (even if they are your competitors for a few hours, they can become colleagues or friends afterwards!).

Don’t take everything too seriously: most of what you implement during a hackathon won’t be used as it is, even if you win. So you can definitely allow yourself to give your servers funny names (BigHinton, FatMostafa & CholletCacao for us…), to give your team project a name the jury members will remember, etc. But stay professional in your code, as this is the core of your product.

We did not win this competition (by a few hundredths of a bloody average_precision_score) but we learnt a lot, met plenty of interesting people and had a lot of fun! We look forward to the next Data Science hackathon and would be glad to see you there!

Our team at Deauville : Edvin, Arnaud & Christopher (© BeMyApp)

THE AUTHORS

Arnaud
Data Scientist & Data enthusiast for social good
Christopher
Data Scientist, Research enthusiast and Techno music adept

A special thanks to Julie, Thomas, Edvin & JB for their reviews of the post.

(*) At the beginning, we planned to use AWS but discovered we needed to ask for a quota increase in order to launch multiple g2.*large instances, an increase we received 2 days after the end of the challenge. If you use a pay-as-you-go account on Azure, you will also need to ask for a quota increase to launch more than one NC6 virtual machine (instance with 1 K80 and 6 vCPUs). We received a positive answer within 24 hours for this request.
