A Data Analyst vs. A Data Scientist vs. A Data Engineer: What you need to know

Glory Adebowale
Nur: The She Code Africa Blog
6 min read · Feb 15, 2020
Photo by Miguel Ángel Sanz on Unsplash

This article gives you an overview of three main careers in big data: data analysis, data science, and data engineering.

Note that although this article discusses the attributes of these careers as they are generally understood, the details may differ from one employer to another, depending on the company’s business model and size.

DATA ANALYST

Photo by David Edkins on Unsplash

To extract information, a data analyst inspects, cleans, transforms, and models data. Together, these activities are termed data analysis. After analyzing the data, a data analyst should be able to communicate the results effectively to the team.

One of the significant differences between data analysts and data scientists is that data analysts usually don’t have access to all of a company’s data, which restricts their scope of work. This is because a data analyst is given specific problems to solve and mainly needs short-term thinking to develop quick responses to them. A data scientist, however, is expected to discover the problems, taking the company’s long-term relevance into account, and is therefore given access to all of the company’s data.

Data analysts often hold job titles that correspond to their department in the company, such as business analyst, business intelligence analyst, operations analyst, or database analyst. A data analyst should be skilled in data visualization, descriptive and inferential statistics, presentation, and communication. Tools commonly used by data analysts include SQL, Microsoft Excel, and Python.

RESPONSIBILITIES OF A DATA ANALYST

Basic data analyst responsibilities include:

  1. Analyzing data using descriptive statistics.
  2. Using database querying languages to retrieve and manipulate information.
  3. Performing data filtering, cleaning and early-stage transformation.
  4. Communicating results to the team using data visualization.
  5. Working with the management team to understand business requirements.
  6. Performing exploratory data analysis.
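
To make the cleaning and descriptive-statistics responsibilities above more concrete, here is a minimal Python sketch. The records, the field names (`region`, `amount`) and the helper functions `clean` and `summarize` are all hypothetical and only serve as an illustration; a real analyst would more likely pull the rows from a database with SQL.

```python
import statistics

# Hypothetical raw records, as they might come back from a database query.
raw_sales = [
    {"region": "North", "amount": "120.50"},
    {"region": "South", "amount": ""},       # missing value
    {"region": "North", "amount": "99.50"},
    {"region": "East", "amount": "n/a"},     # unparseable value
    {"region": "South", "amount": "80.00"},
]


def clean(records):
    """Keep only the records whose amount parses as a number."""
    cleaned = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # drop rows with missing or malformed amounts
        cleaned.append({"region": record["region"], "amount": amount})
    return cleaned


def summarize(records):
    """Return basic descriptive statistics for the amount field."""
    amounts = [record["amount"] for record in records]
    return {
        "count": len(amounts),
        "mean": statistics.mean(amounts),
        "median": statistics.median(amounts),
        "stdev": statistics.stdev(amounts),
    }


cleaned_sales = clean(raw_sales)
summary = summarize(cleaned_sales)
print(summary)
```

The summary dictionary is the kind of result an analyst would then turn into a chart or a short report for the team.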

SKILLS AND TOOLS OF A DATA ANALYST

  1. Should possess strong mathematical aptitude.
  2. Should possess problem-solving aptitude.
  3. Should possess strong communication skills.
  4. Should possess analytical skills.
  5. Should be well versed in Excel, Oracle, and SQL.

DATA SCIENTIST

Photo by Denise Jans on Unsplash

Data scientists analyze data to gain insights about the future that could propel a company forward. What sets data scientists apart from data analysts is their use of machine learning algorithms, which they use to predict future events. Since machine learning algorithms need enough data to produce accurate results, data scientists are given access to that data. They are expected to formulate problems independently, work out possible solutions, and determine which ones are most viable for the company. They are also expected to be curious enough to ask the right questions. Data scientists should have in-depth knowledge of statistics, mathematics, programming, and data operations. This is why many data scientists hold a master’s degree or PhD, or have a research background, often with more than five years of experience. While it may seem that every company should have a data scientist, most companies in fact only need a data analyst.

RESPONSIBILITIES OF A DATA SCIENTIST

Basic responsibilities of a data scientist include:

  1. Performs data preprocessing, including data transformation and data cleansing.
  2. Understands the company’s requirements and business model.
  3. Independently formulates the right questions about problems to be addressed, especially future ones.
  4. Uses machine learning tools to recognize and classify patterns in the data.
  5. Develops operational models.
  6. Uses visualization techniques and storytelling skills to communicate results to the company.
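
As a rough illustration of what predicting future events from past data can look like, the sketch below fits a straight line to a few data points using ordinary least squares and uses it to forecast the next value. This is only one of the simplest possible models, chosen here for illustration; the article does not prescribe a specific algorithm, and the monthly sign-up figures and the `fit_line` helper are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: sign-ups observed in months 1 to 5.
months = [1, 2, 3, 4, 5]
signups = [112, 118, 131, 139, 150]

slope, intercept = fit_line(months, signups)

# Use the fitted trend to forecast month 6.
forecast = slope * 6 + intercept
print(f"Forecast for month 6: {forecast:.1f}")
```

In practice a data scientist would usually reach for a library and a richer model, and would validate the predictions against held-out data before trusting them.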

SKILLS AND TOOLS OF A DATA SCIENTIST

  1. Should possess in-depth knowledge of mathematics and statistics.
  2. Should be skilled at handling structured & unstructured information.
  3. Should be proficient in machine learning algorithms.
  4. Should be experienced in handling data mining techniques.
  5. Should be proficient in programming tools like Python, SAS, and R.

DATA ENGINEER

Photo by Hanson Lu on Unsplash

Data engineering has reportedly risen to overtake data science as the most in-demand job. According to this blog, for every data scientist there should be two data engineers, because data scientists need data engineers to build the pipelines that deploy the models they develop. In fact, it is claimed that only 13% of data scientists’ models make it into production, and one of the underlying reasons is the inability to build the production pipeline that would run the model. Data engineers build platforms and architecture for data processing and large databases. A data engineer is familiar with core programming concepts and algorithms. Some of the tools used by data engineers include Hadoop, Apache Spark, Kubernetes, Java, and YARN.

The data engineer builds and optimizes a platform that ensures accurate data for data scientists and analysts to work with.

RESPONSIBILITIES OF A DATA ENGINEER

Basic responsibilities of a data engineer include:

  1. Development, construction, maintenance and optimization of data architectures.
  2. Conducting testing on large scale data platforms.
  3. Handling raw and unstructured data.
  4. Providing recommendations for improving data quality and data efficiency.
  5. Building the infrastructure necessary for optimal extraction, transformation, and loading of big data.
  6. Assisting data scientists in optimizing products.
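
The extraction, transformation, and loading (ETL) responsibility above can be sketched in a few lines of Python. This is a toy example under stated assumptions: the CSV content, the `payments` table, and the `extract`, `transform` and `load` helpers are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """user_id,country,amount
1,ng,100.0
2,KE,
3,ng,50.5
"""


def extract(text):
    """Read the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows without an amount and normalize types and country codes."""
    records = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete rows
        records.append(
            (int(row["user_id"]), row["country"].upper(), float(row["amount"]))
        )
    return records


def load(conn, records):
    """Write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(user_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# Analysts and data scientists can now query clean, typed data.
totals = conn.execute(
    "SELECT country, SUM(amount) FROM payments GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

Keeping the three stages as separate functions makes each one easy to test and replace, which matters once the pipeline has to handle much larger volumes of data.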

SKILLS AND TOOLS OF A DATA ENGINEER

  1. Should possess in-depth knowledge of operating systems.
  2. Should be experienced in working with advanced SQL and NoSQL tools like PostgreSQL and MongoDB.
  3. Should be experienced in working with cloud-based data solutions.
  4. Should be skilled in bash scripting and JavaScript.
  5. Should be proficient in programming tools like Python and Java.

SIMILARITIES BETWEEN A DATA ANALYST, DATA SCIENTIST, AND DATA ENGINEER

Although the three roles differ, they share some traits, such as experience in handling structured data and the communication skills needed to work effectively in a team.

CONTRAST BETWEEN A DATA ANALYST, DATA SCIENTIST, AND DATA ENGINEER

  1. A data analyst analyzes data to support short-term decisions for the company, a data scientist provides insights about the future based on raw data, and a data engineer develops and maintains data pipelines.
  2. A data analyst relies heavily on visualization to summarize and describe data, a data scientist relies more on machine learning to predict the future, and a data engineer uses programming concepts and algorithms to develop and maintain data pipelines.
  3. A data analyst only has to work with structured data, while a data scientist and a data engineer have to work with both structured and unstructured data.
  4. A data analyst and a data scientist should be proficient in data visualization techniques, while a data engineer doesn’t have to be.
  5. A data engineer should be well versed in developing applications and APIs, while a data analyst and a data scientist don’t have to be.
  6. A data analyst focuses on understanding data from past and present perspectives, while a data scientist focuses on producing reliable predictions for the future.

CONCLUSION

Photo by Daniel Eledut on Unsplash

I guess the best way to end this article is to state what these jobs pay. According to PayScale, the average salary is $59,946 for a data analyst, $96,106 for a data scientist, and $91,605 for a data engineer.
