Data Scientist vs Data Analyst — A primer

Jonathan Kurniawan
Data Analytics @ Hult
3 min read · May 31, 2018

So you’re new to the data industry and want to dive in. You’re talking to practitioners, and ‘Data Scientist’ and ‘Data Analyst’ keep coming up. At first, you think they are interchangeable. Then you think, “Hmm, looks like a Data Analyst is a more junior Data Scientist.” Then more roles like ‘Business Analyst’ and ‘Data Engineer’ come up, and you no longer know what each role does.

Not to worry: this article covers the general differences between these roles. Note that some responsibilities may overlap depending on the company or industry, but in general the roles are distinct.

Data Analyst

A data analyst creates models and dashboards that describe the data, an approach known as descriptive statistics. These include averages, percentages, rankings, and so on. Analysts focus on communicating data and provide fast, “back-of-the-envelope” answers. They use Business Intelligence (BI) tools like Tableau, Power BI, and Chartio, and they work with SQL. An analyst works closely with the product team, empowering product managers to make data-driven decisions on what to build. They also help decide how things should be measured.
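As a rough illustration of the descriptive statistics mentioned above, here is a minimal Python sketch that computes an average, percentage shares, and a ranking. The `sales` data and the `describe` helper are made up for this example; in practice an analyst would more likely get these numbers from SQL or a BI tool.

```python
from statistics import mean

# Hypothetical sample data: (product, units_sold) pairs, invented for illustration.
sales = [("widget", 120), ("gadget", 80), ("gizmo", 200)]


def describe(rows):
    """Return the average, each item's percentage share, and a ranking."""
    total = sum(units for _, units in rows)
    average = mean(units for _, units in rows)
    # Percentage of total units contributed by each product, rounded to 1 decimal.
    share_pct = {name: round(100 * units / total, 1) for name, units in rows}
    # Product names ordered from most to least units sold.
    ranking = [name for name, _ in sorted(rows, key=lambda r: r[1], reverse=True)]
    return {"average": average, "share_pct": share_pct, "ranking": ranking}


summary = describe(sales)
print(summary)
```

Everything here describes data that already exists; nothing is being predicted.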

Business Analyst

This term is often used interchangeably with data analyst. In some organizations, business analysts also drive business decisions and have a broader scope. With their big-picture focus, they often act as a bridge between business and tech roles.

Data Scientist

A data scientist creates models that predict the data, also known as predictive statistics. These include machine learning models such as regression and classification models (e.g. neural networks). They tend to provide precise answers by tweaking parameters and fine-tuning the accuracy of a model. They mostly work with languages such as Python and R to create these models. A data scientist can work on a separate team or be integrated with engineering to build products.
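To make the contrast with descriptive statistics concrete, here is a minimal sketch of a predictive model: a one-variable linear regression fitted by ordinary least squares in plain Python. The `ad_spend` and `revenue` numbers and the helper names are assumptions made up for this example; a practicing data scientist would normally reach for a dedicated library instead.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict y for a new x using a fitted (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data, invented for illustration.
ad_spend = [1.0, 2.0, 3.0, 4.0]
revenue = [3.0, 5.0, 7.0, 9.0]

model = fit_line(ad_spend, revenue)
forecast = predict(model, 5.0)
print(model, forecast)
```

The key difference from the analyst's work is the last step: the fitted model is used to estimate a value for an input that was not in the original data.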

Machine Learning Engineer

Like a data scientist, a machine learning engineer creates machine learning models. They focus more on building the core machine learning components, with the support of the data science team.

Data Engineer

A data engineer creates the pipelines, storage, and ‘plumbing’ that connect the different data sources so that data scientists and analysts can do their jobs. They prepare and transform the data so that it is accessible for analysis. This includes working with data cleaning pipelines, data warehouses (which store aggregated, structured data for a company), ETL processes (which extract, transform, and load data into an application), and BI tools. A data engineer will often spend time on A/B testing and benchmarking the infrastructure.
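The extract, transform, and load steps can be sketched in a few lines of Python using only the standard library. This is a toy example under stated assumptions: the `RAW_CSV` contents, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system, invented for illustration.
RAW_CSV = """order_id,amount,country
1,19.99,us
2,5.00,DE
3,,us
"""


def extract(text):
    """Extract: read raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and normalize types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), float(row["amount"]), row["country"].upper())
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a table that analysts can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(transform(extract(RAW_CSV)), conn)

# Downstream, an analyst can now run plain SQL against clean data.
totals = conn.execute(
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

The final query is the hand-off point: once the data is loaded, analysts and data scientists work with the clean table rather than the raw export.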

Data Architect

Closely related to the data engineer, a data architect also works on the management of data. More specifically, they draw on extensive database knowledge and an understanding of how data is acquired in business operations. They are in charge of the data architecture and infrastructure.

Data Wrangler

Sometimes known as a data munger, a data wrangler transforms raw data into a format that is useful downstream to a data analyst. Their job is to turn unstructured data into more structured data. They commonly use Python, R, and SQL. They work in the ETL layer: they extract the data in raw form from the data source, sift through it with algorithms, parse it into predefined structures, and then either hand the processed data to someone who can analyze it or store it in a data warehouse.
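Here is a small sketch of what parsing raw data into a predefined structure might look like in Python. The log format, the `RAW_LINES` sample, and the `wrangle` helper are all assumptions made up for this example; real raw data is usually messier.

```python
import re

# Hypothetical raw log lines, invented for illustration.
RAW_LINES = [
    "2018-05-31 10:02:11 | user=42 | action=login",
    "2018-05-31 10:05:40 | user=7 | action=purchase",
    "corrupted line without fields",
]

# The predefined structure: a timestamp, a numeric user id, and an action name.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r" \| user=(?P<user_id>\d+)"
    r" \| action=(?P<action>\w+)$"
)


def wrangle(lines):
    """Split raw lines into structured records and lines that could not be parsed."""
    records, rejected = [], []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            rejected.append(line)  # keep unparseable lines for later inspection
            continue
        record = match.groupdict()
        record["user_id"] = int(record["user_id"])
        records.append(record)
    return records, rejected


records, rejected = wrangle(RAW_LINES)
print(records)
print(rejected)
```

Keeping the rejected lines instead of silently dropping them is a design choice: it lets someone check later whether the pattern is too strict or the source data is actually broken.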

Now that you know the differences, you’re ready to talk to practitioners in the data world!
