Milan Roy
Published in Edvicer
10 min read · Dec 10, 2019


Today, I am going to tell you everything you need to know about the field of data science. After you go through this article, you will understand the differences between the various roles in this field, the skill set required to get a job, salary statistics and job opportunities. So let’s get right into it!

Today, in this age of technology eclipsed by the hunt for information, the Data Scientist (oh sorry!), Data Analyst (this doesn’t sound right), Data Engineer (not this either), is king.

If you find yourself thinking “Dude, they are the same thing... Oh wait, what, they’re different?!”, read on, prepare to be more than just surprised, and hopefully a little inspired.

Data Engineer, Data Scientist, Data Analyst are terms thrown around as though they mean the same, but do they really?

Contrary to popular belief, these titles differ in more than just name. Together, let’s find out how they differ and how you can acquire the relevant skills to hold any of these job titles someday!

Data Scientist, Data Engineer and Data Analyst are different career paths. The confusion among people who aren’t familiar with these domains is understandable, though: the roles overlap in some of the skills they require, as well as in some of their responsibilities.

While Data Scientists develop and enhance current methodologies, approaches and algorithms, Data Engineers’ work involves a more practical approach. Data analysts’ work involves more of modelling and simpler representation of data to help make decisions that are of paramount value to various organisations.

Having had a glimpse of what these professionals are individually responsible for, let us now take a ‘deeper’ (very bad pun, I know) look at the array of skills that each of these job titles requires, and everything you need to do to find yourself working as one someday!

Data Scientist:

All businesses could use a garden where Data Scientists plant seeds of possibility and water them with collaboration.

Data Science is a field of study that combines programming skills, mathematics and statistics, and domain expertise in order to draw relevant insight from data.

A Data Scientist is a practitioner who applies Machine Learning algorithms to data such as images, videos or even plain numbers and text. The resulting insights help others reach tangible, logical conclusions that can, for example, add to business value.

Required skills for a Data Scientist:

Fancy names of tools and technologies that are often thrown around tend to intimidate potential Data Scientists leaving them with more questions than answers.

However, before moving on to the specific skill requirements of late, it is worth mentioning that a prevalent opinion amongst seasoned Data Scientists is that the mathematical fundamentals tower over everything else, and that it is of paramount importance to master this aspect of Data Science. These mathematical fundamentals include:

  • Matrices
  • Discrete Mathematics
  • Linear Algebra
  • Basic Statistics (Mean, Median, Mode, Standard Deviation etc.)
  • Random Variables
  • Probability theory
  • Probability distribution functions
  • Correlation and Regression
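
To make a few of these fundamentals concrete, here is a minimal sketch using only Python’s standard library. The sample numbers (hours studied and exam scores) are made up for illustration, and `pearson` is a helper written here, not a library function.

```python
import math
import statistics

# Hypothetical sample: hours studied vs. exam score for six students.
hours = [1, 2, 2, 3, 4, 5]
scores = [52, 55, 60, 61, 70, 74]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

summary = {
    "mean": statistics.mean(hours),
    "median": statistics.median(hours),
    "mode": statistics.mode(hours),
    "stdev": statistics.stdev(hours),  # sample standard deviation
    "correlation": pearson(hours, scores),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A correlation close to 1 suggests that the two quantities rise together, which is exactly the kind of relationship regression then tries to model.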

Once you have these concepts at your fingertips, these are the skills you will need to pick up to become a data scientist:

Programming

This is the second block of the pyramid, laid atop said mathematical fundamentals. The importance of knowing how to code is self-explanatory in this day and age for most professions, and especially so for Data Science. It is highly recommended to learn Python or R.

Machine Learning

Machine Learning algorithms are the block of the pyramid placed above the ability to code. Unequivocally important in data science, one must have some level of experience with Machine Learning. One is advised to be well versed with Supervised and Unsupervised learning models such as:

  • KNN Algorithm
  • Clustering
  • Decision Tree
  • Random Forest
  • Linear Regression
  • Logistic Regression
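
As a taste of the first item on this list, here is a rough sketch of the KNN idea in plain Python: classify a new point by a majority vote among its k closest labelled neighbours. The toy height/weight data and the function name `knn_predict` are invented for this example; in practice you would more likely reach for a library implementation.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: (height_cm, weight_kg) -> clothing size.
train = [
    ((150, 50), "small"),
    ((155, 54), "small"),
    ((160, 58), "small"),
    ((175, 75), "large"),
    ((180, 82), "large"),
    ((185, 88), "large"),
]

prediction = knn_predict(train, (178, 80), k=3)
print(prediction)
```

This is supervised learning in miniature: the labels in `train` are what the algorithm learns from.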

Data Visualization

Visualizing and communicating data is incredibly important, especially because data scientists are viewed as people who help others make data-driven decisions. Communicating here means describing conclusions, or the way techniques work, to audiences both technical and non-technical. Of late, tools such as Matplotlib, Tableau and Google Charts, to name a few, have gained popularity in this regard.

Data Intuition

As the name may slightly suggest, this involves developing a refined approach towards data-driven problem solving. As a data scientist, when presented with a large dataset, you are expected to make out what is important and what is not, what method may prove useful and so on.

Data Wrangling

Often, the data you’re working on is going to be messy and difficult to work with. Because of this, it’s extremely important to know how to deal with imperfections or anomalies in data such as missing values, or incorrect date formatting or inconsistent string formatting.
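
Here is a small sketch of what handling those three problems might look like with just the standard library. The records, the field names and the list of accepted date layouts are all assumptions made for this example; real data will need its own rules.

```python
from datetime import datetime

# Hypothetical raw records showing missing values, mixed date formats
# and inconsistent string formatting.
raw_rows = [
    {"name": "  alice ", "joined": "2019-12-10", "age": "31"},
    {"name": "BOB", "joined": "10/12/2019", "age": ""},
    {"name": "Carol", "joined": "Dec 10, 2019", "age": "27"},
]

# Date layouts assumed to occur; reading "10/12/2019" as day-first is an assumption.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%b %d, %Y")

def parse_date(text):
    """Try each known layout and return an ISO date string, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def clean(row):
    """Normalise one record: tidy the name, unify the date, mark missing age."""
    return {
        "name": row["name"].strip().title(),
        "joined": parse_date(row["joined"].strip()),
        "age": int(row["age"]) if row["age"].strip() else None,
    }

cleaned = [clean(row) for row in raw_rows]
for row in cleaned:
    print(row)
```

Note that the missing age is kept as `None` rather than guessed; deciding whether to drop, flag or fill such values is part of the wrangling judgement.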

Job Opportunities in Data science:

The number of jobs in the analytics and data science ecosystem has grown overall, with India contributing 6% of open positions worldwide. The total number of analytics and data science positions available is 97,000. Of these, 97% of the openings in India are full-time, while 3% are part-time or contractual.

Python continues to be the tool of choice among data analysts and data scientists, and this is reflected in the hiring market as well, with 17% of jobs listing the language as a core capability.

Hiring trends indicate that demand for junior-level talent has increased compared to the previous year. The trend has been especially favourable for young talent, with 21% of jobs posted for those fresh out of college.

Salary Statistics for Data Scientists:

A career in data science is another way to say that you’re surrounded by job offers with lucrative salary packages. For a data scientist in India with up to 5 years of experience, the average annual basic pay alone stands between INR 8.5 lakh and INR 11.5 lakh.

For a senior professional in this field, the average annual pay jumps to around INR 19.5 lakh per annum. India’s scenario with regard to data science is only expected to improve dramatically over the years, and so the future is bright, perhaps too bright to see!

Relevant Courses:

  1. Python
  2. R
  3. Machine Learning
  4. Data Visualization

Data Engineer:

A scientist can discover a new star, but he cannot make one. He would have to ask an engineer to do it for him.

Data Engineering is a domain that focuses on the practical aspects of data gathering and analysis. All the work carried out by data scientists in interpreting oodles of information to answer relevant questions requires mechanisms to be put in place that collect and validate this information. For this work to hold any real value, there also have to be mechanisms for applying it to real-world operations in some way. This is where data engineers are required.

A Data Engineer focuses on the applications and harvesting of big data. Their end of the bargain entails a more practical approach which involves creating interfaces and mechanisms for the flow and access of information.

Skills required in becoming a Data Engineer:

There is significant ambiguity between the roles and skills of data scientists and data engineers, so it is easy to be confused about which skills are essential to be a successful data engineer. Of course, certain skills are a common requirement for both roles, but beyond a point, the data engineer needs to acquire a different set of skills, which are:

Data Structures and Algorithms:

Using the right data structure can drastically improve the performance of an algorithm. This matters a great deal for a data engineer because, as with most engineers, your career is characterised by the practical side of things, where optimum performance is expected.
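
A quick illustration of that point, as a sketch: both functions below find the IDs shared by two collections, but the second converts one of them to a set first, so each lookup is a hash check instead of a scan. The ID ranges and function names are hypothetical.

```python
def common_ids_list(a, b):
    """Find shared IDs by scanning list `b` for every item: O(len(a) * len(b))."""
    return [x for x in a if x in b]

def common_ids_set(a, b):
    """Same result, but set lookups are O(1) on average: O(len(a) + len(b))."""
    b_lookup = set(b)
    return [x for x in a if x in b_lookup]

# Hypothetical customer IDs held by two different systems.
crm_ids = list(range(0, 2000, 2))      # even IDs
billing_ids = list(range(0, 2000, 3))  # multiples of three

shared = common_ids_set(crm_ids, billing_ids)
print(len(shared), shared[:5])
```

On a couple of thousand IDs the difference is barely noticeable; on the millions of rows a data engineer typically handles, it can be the difference between seconds and hours.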

SQL:

A data engineer’s career pretty much revolves around data. And in order to extract this data from a database, you need to interact with it in a language it understands.

SQL knowledge will play an integral role in your career in data engineering, since you will definitely have to write queries to extract data. Most big data software and tools support SQL: Oracle, SQL Server and Amazon Redshift, to name a few.
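
To give a feel for what such a query looks like, here is a minimal sketch that uses Python’s built-in `sqlite3` module with an in-memory database. The `orders` table and its rows are made up; the same `SELECT ... GROUP BY` pattern carries over to other SQL databases, though dialects differ in the details.

```python
import sqlite3

# In-memory database with a hypothetical `orders` table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (city, amount) VALUES (?, ?)",
    [("Mumbai", 1200.0), ("Delhi", 800.0), ("Mumbai", 300.0), ("Pune", 450.0)],
)

# Extract the total order value per city, largest first.
query = """
    SELECT city, SUM(amount) AS total
    FROM orders
    GROUP BY city
    ORDER BY total DESC
"""
totals = conn.execute(query).fetchall()
conn.close()

for city, total in totals:
    print(city, total)
```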

Programming in Python with Java/Scala:

The need to learn Python has already been mentioned earlier. As for Java and Scala, most of the tools for storing and processing huge amounts of data are written in these languages. For example:

  • Apache Kafka in Scala and Java
  • Hadoop, HDFS in Java
  • Apache Spark in Scala

Big Data tools:

Big Data is an upcoming field that is expanding its application into virtually every industry. For this reason, there is an increased demand for engineers who can work with Big Data in almost every big company. Some big data tools that are quite popular in the industry are:

  • Hadoop
  • Oozie
  • Storm
  • Spark

Machine Learning:

Machine learning makes an important contribution to data engineering: machine learning algorithms are used quite frequently and have proven their necessity in this field. In India, there is still a scarcity of professionals who can effectively use machine learning for prescriptive and predictive analysis.

Job Opportunities in Data Engineering:

According to an Accenture study, 79% of enterprise executives agree that companies that do not embrace Big Data will lose their competitive position and could face extinction. Even more, 83% have pursued Big Data projects to seize a competitive edge.

Data Engineering has mammoth scope all over India, although the field hasn’t yet developed to a very appreciable stage. There is still a very good number of opportunities and job openings for data engineers in India, and that number is expected to grow tremendously. Platforms such as Indeed, LinkedIn, Times Jobs, Monster and Naukri.com show over 65,000 openings across the country, of which over 92% are full-time roles.

Salary Statistics for Data Engineers:

Data Engineers, like Data Scientists, are caught in this pleasant vortex of high-salaried employment opportunities. Although Data Engineering as a field isn’t yet as developed in India as it is in the West, the salary prospects are surprisingly attractive.

According to the figures collected on Glassdoor and Payscale, the average salary of an entry-level Data Engineer in India stands at around INR 8 lakh per annum. For a professional at the senior level, this figure shoots up to over INR 16 lakh per annum. Interestingly, these figures are quite similar to those of Data Scientists, although they do fall short by a noticeable margin. The bottom line is that, even in a field that hasn’t yet come close to its full potential in India, a Data Engineer can still do very well. The industry is fertile ground for developing oneself as a Data Engineer in India.

Relevant Courses:

  1. Python
  2. Data Structures
  3. Machine Learning
  4. Scala
  5. Hadoop

Data Analyst:

You can have data without information, but you cannot have information without data

Data Analysis is the process of transforming and modelling data in order to make said data look more presentable and make it more easily interpretable. Data analysis is an integral constituent of decision making in business.

Data Analysts’ work revolves around the comprehensive representation of data and its presentation in a manner that allows for effective and accurate deductions and conclusions in order to help make the right decisions be it business, finance, marketing etc. Data analysts are required in many such fields.

Skills required towards becoming a data analyst:

Advanced Microsoft Excel:

Excel is a very powerful piece of software and advanced knowledge of it is essential for Data Analysts. Microsoft Excel offers a wide variety of tools and functions that help enhance quality of work and so it is essential for you as a data analyst to get acquainted with it.

Experience with SQL:

SQL is a really important programming language in the Data and Analytics sphere as well, not unlike the case with data engineering.

If you can become skilled in this area it will greatly enhance your chances of beating your competition to the role. Whilst this won’t be needed for some Data Analyst roles, it will tip the scales in your favour when it comes to job opportunities.

Experience with programming languages:

As mentioned earlier, data analysis will also require you to get your hands dirty with coding. It is once again suggested that you learn Python or R, as they’re the most frequently used and most preferred. They’re a mandatory skill in some positions, but they’re fairly easy to learn. You are advised to get some experience with Matplotlib and Seaborn in particular.

Data Warehousing:

Experience with relational and non-relational database systems is a must. Examples of relational databases include MySQL, Oracle, DB2 and Teradata. Examples of non-relational (NoSQL) stores include HBase, MongoDB, CouchDB and Cassandra, along with distributed storage such as HDFS.

Computational Frameworks:

A good understanding and familiarity with frameworks such as Apache Spark, Apache Storm, Apache Samza, Apache Flink, MapReduce and Hadoop is essential. These technologies help in Big Data processing as well.
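
To show the idea behind MapReduce, which several of these frameworks build on, here is a toy word count in plain Python. It only imitates the map, shuffle and reduce steps on a single machine; it is not the Hadoop or Spark API, and the function names and input lines are invented for this sketch.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

lines = ["big data tools", "big data jobs", "data engineer"]

mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle_phase(mapped).items())
print(word_counts)
```

In a real cluster, the map and reduce steps run in parallel on many machines, and the framework takes care of moving the grouped data between them.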

Grasp on mathematical and Statistical concepts:

As mentioned earlier in the case of data science, a career in Data Analysis demands proficient, in-depth knowledge of statistical methods and certain mathematical concepts.

Job Opportunities for Data Analysis:

Data Analysis seems to pale in comparison to Data Science and Data Engineering with regard to the number of job opportunities available in India. However on an absolute scale, the field still does exceedingly well, and is a rewarding line of profession nevertheless.

Platforms such as Indeed, LinkedIn, Times Jobs, Monster and Naukri.com show over 50,000 openings across the country, of which over 86% are full-time jobs. The share of part-time and contract roles is slightly higher for Data Analysis than for the other two fields.

Salary Statistics in Data Analysis:

According to the figures collected on Glassdoor and Payscale, the average salary of an entry-level Data Analyst in India stands at around INR 4.5 lakh per annum, while for a Data Analyst at the senior level, this figure rises to about INR 11 lakh per annum. These figures are significantly lower than those for Data Scientists and Data Engineers.

Relatively, the career of a Data Analyst may not seem as lucrative or attractive as those mentioned earlier, but still presents great opportunities in India on an absolute scale. India as a whole still needs data analysts in large numbers, and their presence and assistance will prove instrumental.

Relevant Courses:

  1. Excel
  2. SQL
  3. Python

Conclusion

Data scientists, data engineers and data analysts have something in common, but the skills required to become one differ vastly. A data analyst requires the smallest skill set of the three, with SQL and Python being the most important. For a data scientist, machine learning is the most important topic to understand, and a data engineer must be familiar with big data tools such as Hadoop and Spark.

Are you interested in becoming a data analyst, data scientist or a data engineer? Comment below and let us know!

PS: We are giving away free credits for registration on our website which can be used to avail discounts on online courses. So Hurry Up!

Follow us on Medium, Facebook, LinkedIn and Instagram.
