Direction Among Chaos

Shivang Ganjoo
Published in ScienceforData
3 min read · Feb 6, 2019

Money-making online courses and false narratives are leaving you directionless!

Source: rajrajhans.com

Why is there so much buzz around data science at the moment?

Data science is not a new field; it has been around for decades. As technology and the internet spread, the amount of data being generated grew exponentially. This gave companies an opportunity to bring an academic discipline into industry, mainly to increase profits through optimization, a better understanding of their customers, and innovative software. People were, and still are, excited by the job title, the money, the applications, and the surprisingly easy-to-use off-the-shelf libraries. But remember the science in data science! To be a good data scientist, you must be good at mathematics and have domain knowledge.

What are the roles in the data domain?

There are two major roles in this domain: scientist and engineer. The names make the work fairly clear. A scientist performs analysis, creates algorithms, and runs various kinds of tests. An engineer brings the researched algorithm to production in an optimized way, trains and retrains models, and designs ML systems.

Source: LearnDataScienceOnline

Prerequisites

A data science aspirant should have basic knowledge of linear algebra and calculus, and a strong background in probability and statistics. These concepts are not hard to learn; see the probability and statistics section of our publication if you want to start learning or revising them. A data/ML engineer aspirant should have a strong coding background in C++, Python, and various ML frameworks. SQL is another must-have. One way to demonstrate your skills is to build cool applications and perform well in competitive coding.

Source: ParallelDots

Harsh Truths

  1. Data wrangling occupies 60% of the total project time. This includes data acquisition, cleaning, feature engineering, etc.
  2. You need a lot of patience. Sometimes the path you choose won’t produce the results you expect. Try different things, explore, and research.
  3. There is no clear problem statement. Most of the time you get a one-liner such as, “I want my user-website interaction time to increase.”
  4. Real-world data is super messy. No matter what you’ve worked on from Kaggle, KDnuggets, etc., you will not have seen truly messy real-world data until your internship.
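
To make the first and last points concrete, here is a minimal sketch of what wrangling can look like on a tiny, made-up export. Everything in it is an assumption for illustration: the column names, the sample rows, the median fill, and the 180-second threshold are hypothetical, and real projects usually need far more steps than this.

```python
import csv
import io
from statistics import median

# Hypothetical raw export: stray whitespace, inconsistent casing,
# missing values, a non-numeric placeholder, and a duplicate row.
RAW = """user_id,country,session_seconds
1, india ,120
2,USA,
3,usa,300
3,usa,300
4,India,n/a
"""


def clean_rows(raw_text):
    """Parse the CSV text, normalise fields, and drop exact duplicates."""
    seen = set()
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_text)):
        country = rec["country"].strip().upper()
        value = rec["session_seconds"].strip()
        # Treat anything that is not a plain non-negative number as missing.
        seconds = float(value) if value.replace(".", "", 1).isdigit() else None
        key = (rec["user_id"].strip(), country, seconds)
        if key in seen:
            continue
        seen.add(key)
        rows.append(
            {"user_id": int(key[0]), "country": country, "session_seconds": seconds}
        )
    return rows


def impute_median(rows, field):
    """Fill missing values of `field` with the median of the known values."""
    fill = median(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def add_features(rows, threshold=180):
    """Feature engineering: flag sessions at or above an assumed threshold."""
    return [{**r, "long_session": r["session_seconds"] >= threshold} for r in rows]


cleaned = add_features(impute_median(clean_rows(RAW), "session_seconds"))
for row in cleaned:
    print(row)
```

Even in this toy case, most of the code is cleaning rather than modelling, which is the point of the first item above.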

I’m writing this article so that you have a clear picture of what to expect. This will help you assess whether data science is the right choice for you. Time is precious; spend it elsewhere if you find you don’t love mathematics.
