What Is Data Science All About?

Nishant Rastogi
Published in Firing Neurons
3 min read · Jul 29, 2019

Data science is a buzzword these days. More and more people are making, or planning to make, a career switch to data science. Many institutes have mushroomed, each claiming to be the best and offering a fancy curriculum that covers a variety of algorithms. Some have gone a step further and added deep learning and Big Data as well. Some have tie-ups with online platforms, while others provide a classroom environment. In a nutshell, there are plenty of ways into a data science career. However, what many of these institutes don’t mention upfront is what data science is all about and which prerequisites one needs in order to build a successful career in the field. Here is a list of some must-have skills for a sound understanding of the concepts and a successful career in data science:

  • Inclination for mathematics: Whether it is a machine learning, AI, or deep learning algorithm, the underlying concepts are mathematical. Understanding any of these algorithms requires a mix of mathematical concepts such as probability, calculus, linear algebra, set theory, matrices, geometry, trigonometry, and vectors. While you don’t need to prove every formula that comes your way, knowledge of these areas will help you understand the concepts behind the algorithms and how they work.
  • Knowledge of statistics: There’s a saying: “A person with a watch knows the time. A person with two watches is never sure.” The statistical point of view is that it’s better not to be sure. However advanced the algorithms we create, we can never make 100% accurate predictions all the time. Hence predictions come with some statistical measure of how likely they are to be accurate. Data science therefore requires a sound understanding of statistical concepts such as central tendency, variability and spread, distributions, and hypothesis testing.
  • Intellectual curiosity and analytical thinking: Data has a story to tell. Data scientists spend much of their time exploring the data. This means slicing and dicing the data, looking at variable distributions, determining dependencies between variables, plotting the data, visualizing trends or patterns, identifying and handling outliers, and summarizing and understanding the data. Essentially, this is one of the most important steps in the data science life cycle, and it is about extracting the essence from the data. Without analytical thinking, this task can’t be performed.
  • Programming skills: While many programming languages can be used for data science, R and Python are the ones used most prominently in the industry. R is primarily a statistical programming language with many advanced features specific to machine learning, while Python is a full-fledged, object-oriented, open-source programming language that is catching up fast with R in machine learning capabilities. For AI, Python is already the preferred language. Along with R and Python, knowledge of SQL is a must. SQL helps in inserting, extracting, updating, and deleting data in structured (relational) databases. When dealing with extremely large data sets, a sound understanding of Big Data architecture, Hadoop, HDFS, and Apache Spark is the bare minimum.
  • Communication skills: The work of a data scientist involves a lot of interaction with people: talking to business users to understand the attributes of the data, working with IT to understand the technology landscape and the systems from which data must be extracted, working with executives to build strategies, and conveying the results of the analysis to business teams. The key job of a data scientist is to enable the business to make decisions by providing valuable insights. Hence a data scientist must have excellent communication skills and must master the art of storytelling.
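
To make a few of the skills above concrete, here is a minimal Python sketch that touches the SQL, statistics, and data exploration points in one small example. It is only an illustration under stated assumptions: the `orders` table, the region names, and the amounts are made up for this example, and the 1.5 × IQR rule is just one common way to flag outliers, not the only one. It uses only Python's standard library (`sqlite3` and `statistics`).

```python
import sqlite3
import statistics

# Hypothetical order amounts (made up for illustration);
# 990.0 is deliberately planted as an outlier.
ORDERS = [
    ("north", 120.0), ("north", 135.0), ("south", 128.0),
    ("south", 142.0), ("east", 131.0), ("east", 990.0),
    ("west", 125.0), ("west", 138.0),
]


def load_orders(rows):
    """Create an in-memory SQLite table and INSERT the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


def fetch_amounts(conn):
    """Extract (SELECT) the amount column as a list of floats."""
    cursor = conn.execute("SELECT amount FROM orders ORDER BY rowid")
    return [row[0] for row in cursor]


def summarize(values):
    """Central tendency (mean, median) and spread (sample stdev)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


conn = load_orders(ORDERS)

# UPDATE: normalize the region labels to upper case.
conn.execute("UPDATE orders SET region = UPPER(region)")

amounts = fetch_amounts(conn)
print("before:", summarize(amounts))

outliers = iqr_outliers(amounts)
print("outliers:", outliers)  # [990.0]

# DELETE the flagged rows, then summarize again.
conn.executemany("DELETE FROM orders WHERE amount = ?", [(v,) for v in outliers])
conn.commit()
cleaned = fetch_amounts(conn)
print("after:", summarize(cleaned))
```

Notice how a single extreme value pulls the mean far away from the median, while the median barely moves once the outlier is removed. Whether an outlier should be deleted, corrected, or kept is a judgment call that depends on the data and the business question.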

Please note that this is a concise list of a few important aspects of data science, not a complete one. For more articles on data science, ML, AI, Big Data, and related technologies, stay tuned.

Nishant Rastogi
Firing Neurons

Experienced Engineering Leader with nearly 20 years of expertise in developing and delivering data and analytics solutions for global organizations.