Getting Started: Deep Learning Project

Rohan Hirekerur · Published in The Startup · 4 min read · Jun 28, 2020

Deep Learning (DL) has gained serious traction in the past couple of years, and many of us are fascinated by these technologies and want to start working with them.

Most of us are familiar with what DL is, so I won’t cover that in this article; rather, I’d like to focus on how to start a DL project.

Research, research and more research:

Bet you thought I’d start with data pre-processing, but no… And there’s a good reason for that. Before you start any project, you need to research what you’re about to undertake: the technologies involved and the domain knowledge required. DL projects revolve around data, and a proper dataset and its processing are the key to getting good results.

Points to consider while researching:

  1. Form of data:
    Identifying the form of data (CSV, image, audio, text, etc.) is very important, as the pre-processing to be done on the data may require a lot of domain knowledge.
    For example, a typical CSV dataset is easy to work with, but for audio data, one needs domain knowledge to select and extract attributes (features) from the audio to get the best results (see the librosa sketch below).
  2. Type of network:
    Great, now that you’ve identified the form of data, it’s time to select an approach. In DL, you can choose among different types of networks: Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), Long Short-Term Memory networks (LSTM), etc. Reading research papers (Google Scholar is a great place to search) and studying the approaches other developers have used will help you decide on an approach of your own. Study the approach you select and its different variants.
  3. Tools, libraries and environment:
    Select a deep learning library you’re familiar with (TensorFlow, PyTorch, Keras, etc.), keeping in mind how your model will be deployed. Personally, I love the simplicity of Keras, and I’d recommend it for beginners (a minimal sketch follows below).
    DL is computationally expensive, so if hardware is an issue, Google Colab is a great option: you can code your project from any device, and it runs on powerful hardware provided by Google (and it’s free!). Completely online coding environments like Gitpod are also useful and have Git integration built in.
Forms of data: text, audio, images and CSV.
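
To make the domain-knowledge point concrete, here is a minimal sketch of extracting MFCC features (a common compact audio representation) with the librosa library. The file name clip.wav and the choice of 13 coefficients are placeholders, not a recommendation:

# A minimal sketch of audio feature extraction with librosa (pip install librosa).
# "clip.wav" is a placeholder file name.
import librosa

# Load the clip; librosa resamples to 22,050 Hz by default
y, sr = librosa.load("clip.wav")

# MFCCs summarize the spectral shape of the audio over time and are a
# common input representation for DL models working on sound
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

print(mfcc.shape)  # (13, number_of_frames)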
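
And to give a taste of the Keras simplicity mentioned above, here is a minimal sketch of defining and compiling a tiny fully-connected network. The layer sizes and input shape are illustrative placeholders, not tuned for any particular dataset:

# A tiny ANN binary classifier in Keras; sizes and shapes are placeholders.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(4,)),  # hidden layer
    keras.layers.Dense(1, activation="sigmoid"),                  # output probability
])

# The loss measures the difference between predictions and the true labels
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()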

Acquiring the dataset:

Finding a dataset that matches your requirements can be tricky, and remember that you might not find exactly what you’re looking for, so be ready to do some data processing of your own. That said, some popular places to search for datasets are:

  1. Kaggle
  2. Google Dataset Search
  3. Public Datasets: GitHub
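
As one example, Kaggle provides an official Python package for downloading datasets programmatically. Here’s a sketch, assuming you have set up a Kaggle API token (~/.kaggle/kaggle.json) as described in Kaggle’s docs; the dataset name is a placeholder:

# Sketch: downloading a dataset with the official `kaggle` package
# (pip install kaggle). Assumes an API token at ~/.kaggle/kaggle.json.
# "owner/dataset-name" is a placeholder, not a real dataset.
import kaggle

kaggle.api.dataset_download_files("owner/dataset-name", path="data/", unzip=True)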

For a more comprehensive list of websites to search for datasets, see this article.

If your project requires a completely different kind of data, you can also have a personalized dataset created through services from Amazon or Google, but that will cost you.


Data Pre-processing:

Depending on the form of data, pre-processing can involve various steps (more about that in upcoming articles), but some steps remain the same irrespective of the form. They are:

  1. Separating dependent and independent variables:
    In simple terms, dependent variables are those you need to predict, while independent variables are the ones we use to predict them. These need to be separated because only the independent variables are passed to the network, while the dependent ones are needed to calculate the difference (loss) between them and the values the network predicts.
  2. Splitting the dataset into a training set and a test set:
    We train our network on one section of the data; testing on the rest shows how the model will behave in the real world. Consider an example where you have been taught a new word and your teacher uses it in a sentence as an example. To show that you have understood the word, you must use it in another sentence; repeating the same sentence proves nothing. Similarly, testing our model on the same data it was trained on is useless. Thus we split the dataset and reserve part of it for testing, which gives us a realistic estimate of how well the model works on previously unseen data (see the sketch after this list).
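
Here is a minimal sketch of both steps using pandas and scikit-learn. The file name data.csv and the column name target are placeholders for whatever your dataset actually contains:

# Sketch of the two common pre-processing steps with pandas and scikit-learn.
# "data.csv" and the "target" column are placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.csv")

# 1. Separate independent variables (X) from the dependent variable (y)
X = df.drop(columns=["target"])
y = df["target"]

# 2. Reserve 20% of the rows for testing; fix the seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)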
