Applying Deep Learning to Novel Applications (or Just a Side Project): A Brief Summary

Time Bandit
Aug 27, 2017 · 5 min read

Or How to stir the pile just right 👌

From time to time I find a compelling deep learning paper and read it, so I thought: why not share my learning and thoughts with the internet? This is a summary, plus some examples and ideas on how to proceed with a new deep learning project.

The original paper, which can be found here, is titled “Best Practices for Applying Deep Learning to Novel Applications” by Leslie N. Smith. It discusses the engineering aspects of a deep learning project or application and gives some great advice on how to proceed.

Phase 1: Prepare yourself by asking some questions

Before jumping into the problem, take some time to put things in perspective: define your project goals first, and take into consideration the time and computational resources you have.

Here is an example checklist to answer:

  • Do I have enough data, or can I work with the small dataset I have?
  • What metrics and thresholds define success? Is it accuracy, or the number of false positives (for classification)?
  • Do I need to beat an existing benchmark, or is a good-enough solution fine? Am I looking to surpass human-level performance?
  • What’s the performance of classical statistical models (gradient boosted trees, SVM, logistic regression)? (A quick baseline sketch follows this list.)
  • Where am I going to deploy my model? (The engineering aspects of your solution: can my model run on a phone? How long does a prediction take? What’s the size of my model?)
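On the statistical-baseline question: it’s cheap to benchmark a couple of classical models before committing to deep learning. A minimal sketch with scikit-learn; the built-in breast-cancer dataset here is only a stand-in for your own features and labels:

```python
# Benchmark classical models before committing to deep learning.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # stand-in for your own data

baselines = {
    "logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "gradient boosting": GradientBoostingClassifier(),
}

# If a simple model already hits your success threshold,
# you may not need a deep net at all.
for name, model in baselines.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```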

These are merely examples of things to consider before starting; if you conclude that deep learning is the only way, move on to Phase 2.

Phase 2: Make the network’s training easier and make your data simple

This part appears in Phase 1 of the paper, but I think it’s much more related to data preprocessing than to the assumptions one must question.

Your neural network is a computer of sorts: instead of feeding it raw data and crossing your fingers, you can make its job easier, reducing training time and giving yourself a good chance of a better output.

You have your data; now it’s important to do some exploratory data analysis before shoving everything into your model.

How?

Here’s another checklist to answer:

  • Is my data related to the problem?
  • Is my data balanced? Classifying two separate objects isn’t the same as classifying one object against many (Cat vs. Dog & Cat vs. the Animal Kingdom 💭).
  • Is my data biased? (Data bias is real; don’t train a sexist or racist neural network, we got enough of that with humans.)
  • Don’t have enough data? Consider transfer and adaptive learning.
  • Can I synthesize and augment my data? (Works great in image recognition, for example.)
  • Do I have the Trinity Data Grail (train, validation, and test sets)? (A split sketch follows this list.)
  • Can this data be preprocessed? (With numerical data, most of the time yes! Not so simple with time series, for example.)
  • Starting with an initial dimensionality reduction or feature selection/extraction is always a good idea.
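For the trinity and dimensionality-reduction bullets above, here is a minimal sketch of a stratified train/validation/test split followed by a quick PCA pass, using scikit-learn (the random arrays are placeholders for your own data):

```python
# Stratified train / validation / test split, then a quick PCA pass.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 50)            # stand-in features
y = np.random.randint(0, 2, size=1000)  # stand-in labels

# Split off the test set first, then carve validation out of the remainder:
# 60% train / 20% validation / 20% test overall.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval, random_state=0)

# Fit PCA on the training split only, then apply it to the others.
pca = PCA(n_components=0.95)  # keep components explaining 95% of the variance
X_train_r = pca.fit_transform(X_train)
X_val_r, X_test_r = pca.transform(X_val), pca.transform(X_test)
print(f"kept {pca.n_components_} of {X.shape[1]} dimensions")
```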

Phase 3: Maybe I don’t need to do this after all

An expert can draw on previous work and knowledge to start from scratch; a novice will have trouble doing that, so why not build on the experts’ previous work?

You can find lots of literature and code on arXiv and GitHub; new papers come out every week. I like to browse arxiv-sanity to keep my sanity 😆.

Also, don’t forget about pretrained models: fine-tuning a state-of-the-art network like VGG16 can be enough to achieve results for image classification.
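For the image-classification case, a minimal fine-tuning sketch using the tf.keras API; `num_classes` and the commented-out data pipeline are placeholders:

```python
# Fine-tuning a pretrained VGG16 on a new classification task.
from tensorflow import keras

num_classes = 5  # placeholder: your own number of categories

# Load VGG16 trained on ImageNet, minus its original classifier head.
base = keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional features at first

model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])

# model.fit(train_ds, validation_data=val_ds, epochs=10)  # plug in your data
# Once the new head converges, unfreeze the top conv block and retrain
# with a much smaller learning rate.
```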

Phase 4: Engineering ways

Start small and easy and go from there

Start with a small or common architecture, a common objective function, and common hyperparameter settings (how to initialize weights, for example); split your training data and do some toying here. A minimal starting point is sketched below.
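To make “start small” concrete, here is the kind of first architecture I mean: a tiny convnet with stock settings (tf.keras; the input shape and class count are assumptions):

```python
# A deliberately small first architecture: get it training end to end
# before scaling anything up.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Conv2D(32, 3, activation="relu",
                        input_shape=(64, 64, 3)),  # assumed input size
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),  # assumed 10 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```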

Think of it as a Programming Project

  • Don’t repeat yourself; use a popular framework (Keras is my favorite).
  • Run tests on your data preprocessing and feature extraction parts; make sure you are feeding correct data to your network (a tiny example follows this list).
  • If you are running distributed jobs, make sure your setup is solid and that a node failure doesn’t take you down.
  • If you don’t have the computational resources, use one of the many cloud options that exist (Paperspace, Amazon, Google Cloud, FloydHub).
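On testing the preprocessing step: a tiny sketch of the kind of sanity check I mean, with a hypothetical `preprocess` function and plain asserts:

```python
# Sanity checks on the preprocessing step before any training run.
import numpy as np

def preprocess(images):
    """Hypothetical preprocessing: scale uint8 images to [0, 1] floats."""
    return images.astype("float32") / 255.0

def test_preprocess():
    batch = np.random.randint(0, 256, size=(8, 64, 64, 3), dtype=np.uint8)
    out = preprocess(batch)
    assert out.shape == batch.shape                # no silent reshaping
    assert out.dtype == np.float32                 # model expects float32
    assert 0.0 <= out.min() and out.max() <= 1.0   # values actually rescaled
    assert not np.isnan(out).any()                 # no NaNs fed to the network

test_preprocess()
print("preprocessing checks passed")
```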

Phase 5: Debugging and Visualization

Visualize your way to 99% accuracy: it’s far more convenient to look at graphical representations of what’s happening than at raw numbers. Tracking your training process is crucial to know whether you are converging at all or stuck. Bias vs. variance, for example, can be diagnosed from the curves: if training error stays high, your network is underfitting (high bias); if training error is low but validation error is high, it is overfitting (high variance).
Fixing these can mean a larger architecture (against high bias) or more data and regularization (against high variance; dropout works great against overfitting).

TensorBoard is your lord and savior; learn to use it effectively and understand what everything represents (weight monitoring, accuracy and errors, …).
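Hooking TensorBoard into a Keras run takes one callback. A minimal, self-contained sketch on random stand-in data; point `tensorboard --logdir logs` at the output to see the curves:

```python
# Minimal end-to-end run that writes TensorBoard logs: loss/accuracy curves
# and weight histograms. Inspect them with:  tensorboard --logdir logs
import numpy as np
from tensorflow import keras

x = np.random.rand(256, 10).astype("float32")  # stand-in data
y = np.random.randint(0, 2, size=(256,))

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(10,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

tb = keras.callbacks.TensorBoard(log_dir="logs/run1", histogram_freq=1)
model.fit(x, y, validation_split=0.2, epochs=5, callbacks=[tb], verbose=0)
```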

Phase 6: Science and Experiments!

Suppose you want to create a mobile application to classify, let’s say, vehicles (Car vs. Boat vs. Plane …), or Hot dog vs. Not hot dog. The first is a multi-class problem; the second is one vs. all: your network only needs to figure out what a hot dog actually is.

In that scenario the first five phases aren’t time consuming: you have a set of pictures and you want to start training.

  • Architecture design: number of hidden layers, depth of the layers, weight initialization, loss functions, activation functions… Experiment with different settings at different times: try ReLU vs. tanh, try adding dropout or maybe an extra conv layer, or use Nesterov momentum instead of Adam. The more you try and fail, the more experience you gain about what works and what doesn’t.
  • Regularization: data augmentation, dropout, weight decay, added noise, early stopping. Your goal is to prevent overfitting and get a network that generalizes well. (A combined sketch follows this list.)
  • Loss function: experiment with different ones, and know the difference between a loss function and an evaluation metric: the first is optimized on training data, the second is measured on test data.
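Here is the combined regularization sketch promised above: data augmentation layers, L2 weight decay, dropout, and early stopping in one small tf.keras model (shapes and class count are assumptions; the augmentation layers need a recent TensorFlow):

```python
# Stacking several regularizers: augmentation, weight decay,
# dropout, and early stopping.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.RandomFlip("horizontal",
                            input_shape=(64, 64, 3)),  # data augmentation
    keras.layers.RandomRotation(0.1),
    keras.layers.Conv2D(32, 3, activation="relu",
                        kernel_regularizer=keras.regularizers.l2(1e-4)),  # weight decay
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.5),  # dropout against overfitting
    keras.layers.Dense(10, activation="softmax"),  # assumed 10 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Stop when validation loss stops improving, and keep the best weights.
early = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                      restore_best_weights=True)
# model.fit(train_x, train_y, validation_split=0.2,
#           epochs=100, callbacks=[early])  # plug in your data
```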

Don’t be afraid to explore and try different things, and also look at the literature.

Phase 7: End to End, Boosting, Complexity

This part of the paper focuses on the state of the art and ensemble learning, but I’ll use it instead to point you to some fine papers you should absolutely read:

Conclusion:

Deep learning is a very deep field 😉, and it’s not hard: if you have an understanding of mathematics and programming, you can use it and apply the latest research yourself. It’s a field where practice matters most; if your goal is to achieve results on a side project, you can do it!
