5 things to do before getting started with Deep Learning

“A robot named Pepper holding an iPad” by Alex Knight on Unsplash
Disclaimer: Along with the content from the Searce team, you will also find guest columns where people outside Searce contribute. This post by Kartik Godawat is one such guest post. Watch this space for more :) — Editor

If you’re a web developer working in the industry, you must have had thoughts about trying your hand at Deep Learning. From generating Shakespeare-styled text and celebrity faces to beating Atari games, the field seems to be everywhere.

If you’re like me, you’ve probably also signed up for a popular AI course, found it difficult to complete the assignments, and shelved the course, probably never to look at it again. Or maybe you keep putting in the hours, working to the best of your abilities, but cannot shake the feeling that you’re not learning fast enough and maybe aren’t good enough.

You start thinking: I was never good at math. Maybe this isn’t my cup of tea.

Here’s where you might be wrong!

The 10,000 hour rule

The principle holds that 10,000 hours of “deliberate practice” are needed to become world-class in any field.

While most of us have heard of this principle, many of us apply it wrong. There’s a difference between practice and deliberate practice. Spending time installing dependencies, writing wrapper code to load data into memory, deploying code to production, etc. is part of the overall Deep Learning pipeline, but that time should not be counted as time spent on Deep Learning itself.

In order to spend more time actually learning, we must strive to increase our productivity by automating the routine stuff and by removing the need to search Stack Overflow for the same answers over and over again. To give you an example, “How do I train/val split a dataframe in Pandas?” is a question I had visited over 25 times in two months.
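For reference, here is one minimal way to do that split (a sketch using pandas’ sample; the 80/20 ratio, file name and random_state are just illustrative choices):

import pandas as pd

# assume the full dataset is already in a DataFrame (data.csv is a hypothetical file)
df = pd.read_csv("data.csv")

# hold out 20% of the rows for validation; random_state keeps the split reproducible
val_df = df.sample(frac=0.2, random_state=42)
train_df = df.drop(val_df.index)

print(len(train_df), len(val_df))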

Here’s a list of tools/techniques I use, which I feel help me focus on learning.


Prior Experience

If you haven’t worked with regression or classification problems before, it really helps to do a crash course on sklearn and understand traditional ML models like Random Forests, XGBoost, and Linear & Logistic Regression before jumping directly into Neural Networks. I personally found Coursera’s regression and classification courses by the University of Washington to be very helpful.
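As a taste of what such a crash course covers, here is a minimal sketch (the built-in iris dataset and the hyper-parameters are only illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# load a small toy dataset and split it into train/test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# fit a traditional ML model and check its accuracy on held-out data
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))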

I like to refer to this cheat-sheet as a starting point. While not 100% accurate, it gives a good-enough baseline. If you feel you understand over 50–60% of the terms in the diagram, you can probably skip this step (but it’s nice to do a quick revision crash-course anyway).

Data-processing: Jupyter, Pandas, Numpy, OpenCV

These are not as scary as they sound, if you give them enough time. Jupyter notebook is a web-based editor with a lot of extensions suited for working productively in data science. Apart from installing it locally, it’s good to use Google Colab, as Google provides a free GPU instance to execute the notebook (which is super-helpful for Deep Learning).

Pandas, Numpy & OpenCV are the most popular libraries used to handle data in ML pipelines. In my initial days, most of my time was spent figuring out how to shape a numpy array, or how to do some small operation in one of these libraries. It definitely helps to build a good base here; it has a compounding effect, because you stop re-figuring out how to do the exact same thing again and again, and you don’t need the community’s help every time your input or output shape doesn’t match the shape your Deep Learning model’s tensors expect. I personally found Chris Albon’s notes to be pure gold.
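A typical example of the shape-juggling I mean (the dimensions are made up for illustration):

import numpy as np

# a batch of 32 grayscale images, 28x28 pixels each
batch = np.random.rand(32, 28, 28)

# many models expect an explicit channel dimension: (batch, height, width, channels)
batch = batch.reshape(32, 28, 28, 1)   # or: batch[..., np.newaxis]

# flatten each image into a vector for a fully-connected layer
flat = batch.reshape(32, -1)           # shape (32, 784)
print(batch.shape, flat.shape)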

To avoid the risk of getting distracted while searching the browser for something trivial, I installed howdoi, a command-line tool that quickly fetches the syntax I need.

$ howdoi create tar archive
> tar -cf backup.tar --exclude "www/subf3" www
$ howdoi print stack trace python
> import traceback
>
> try:
>     1/0
> except:
>     print('>>> traceback <<<')
>     traceback.print_exc()
>     print('>>> end of traceback <<<')

Data visualization: Seaborn & matplotlib

Exploratory Data Analysis, or EDA, is a very important step in machine learning, which most people do not spend enough time on. I used to throw the input data directly into some well-established model, hoping to magically get a good accuracy/output. I learnt the importance of EDA the hard way, after spending many hours trying to train a model and never reaching a good-enough accuracy.

Seaborn visualization example

Seaborn is a high-level visualization library built on top of matplotlib, and it handles most cases, but it’s good to have a basic understanding of matplotlib as well. Visualizing data and its dimensions before training almost always reveals a lot of useful information, which saves a lot of time.
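A minimal EDA sketch of the kind I mean (using seaborn’s bundled titanic example dataset; the columns picked are only illustrative):

import seaborn as sns
import matplotlib.pyplot as plt

# load one of seaborn's bundled example datasets
df = sns.load_dataset("titanic")

# how is the target distributed across a categorical feature?
sns.countplot(x="class", hue="survived", data=df)
plt.show()

# pairwise relationships between a few numeric columns
sns.pairplot(df[["age", "fare", "survived"]].dropna(), hue="survived")
plt.show()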

Docker

Docker — Build, Ship, and Run Any App, Anywhere.

That’s all you need to learn. Since Deep Learning frameworks are evolving very rapidly, it is quite easy to fall into the trap of dependency issues on different machines. Logging your installation in a Dockerfile ensures that the exact environment is reproducible on a different machine, saving literally hours of time and stress over a minor version mismatch. It comes in very handy when I need to build tensorflow from source.
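As an illustration, a minimal Dockerfile sketch (the base image tag, pinned library versions and the train.py entry point are only examples, not a recommendation):

# start from an image that already bundles GPU-enabled TensorFlow (example tag)
FROM tensorflow/tensorflow:2.4.1-gpu

# pin the rest of the stack so every machine gets the exact same versions
RUN pip install pandas==1.2.3 scikit-learn==0.24.1 seaborn==0.11.1

WORKDIR /app
COPY . /app

# hypothetical training entry point
CMD ["python", "train.py"]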

Build your feed

I personally use Twitter and Medium as my news feed for excellent blogs and discussions. Freecodecamp, Hackernoon and TowardsDataScience articles are always a joy to read. Popular tech-company blogs like Slack, Netflix, Airbnb, etc. also publish regularly on Medium. Curated article lists like these can also be a good place to start. I’d advise, however, not to overdo it and to begin with what you feel comfortable with. Over time, a good feed builds up automatically. Also, make it a habit to follow the authors of the open-source libraries you end up using; staying updated on their latest developments is an added bonus.

Create a Kaggle account early on. There are a lot of public Jupyter notebooks to learn from, ranging from simple EDA or numpy tutorials to competition-winning models. The discussion section there is also a good place to ask questions and learn. Especially for generic doubts regarding ML, I personally feel more comfortable asking on a Kaggle discussion forum than on Stack Overflow, as sometimes the doubts are too abstract and generic to fit the Stack Overflow format. The key takeaway here is to never stop asking questions, no matter how trivial they might seem or how many downvotes they might get, provided you have done your own search to the best of your ability and are not looking for spoon-feeding.

The trouble with the world is that the stupid are so sure and the intelligent are full of doubt. — Bertrand Russell

In the beginning, most posts related to Deep Learning, on topics like LSTMs or GANs, won’t make much sense. I do a cursory read to get an idea of what is being done rather than how it is being done. Over time, as your core Deep Learning skills grow, these articles and the cryptic twitter feeds become meaningful.


Following these practices really helped me scale up towards my goal of being an ML/AI developer, and I hope it helps you too.

If you feel there are other prerequisites or nice-to-have things that ease the learning curve, please add them in the comments.

Thank you very much for your time. If you enjoyed reading, please give me some claps so more people see the article. Thank you! And, until next time, have a great day :)