Is this person a Data Scientist or a Data Analyst?

Valeriy Babushkin
Nov 18, 2022


Recently I had a discussion at my company, Blockchain.com, about the difference between Data Scientists and Data Analysts. I once gave a talk named “Why you would never find a Data Scientist”, and I believe that Data Scientist is a terrible title. Still, due to demand from my team, I had to keep this job title at Blockchain.com.

If we try to describe a variety of job titles in three dimensions (Domain-Dev-Math), it looks something like this (not all possible roles are shown, though):

Dirichlet distribution of data jobs

In reality, we have even more skills and dimensions; thus, we use the following to be more precise.

  • Tech Lead: guides the approach and execution of a particular team. They partner closely with a single manager, but sometimes they partner with two or three managers within a focused area. They’re comfortable scoping complex tasks, coordinating their team towards solving them, and unblocking them along the way. Tech Leads often carry the team’s context and maintain many of the essential cross-team and cross-functional relationships necessary for the team’s success. They’re a close partner to the team’s product manager and the first person called when the roadmap needs to be shuffled. They write code and do code reviews. (This description is basically taken from the Staff Engineer book.)
  • Data Steward: responsible for ensuring the quality and fitness of the organization’s data assets, including the metadata for those assets. In our case, this includes projects like Attribution, Data Quality, Data Dictionary, Hierarchy of metrics, Metrics reconciliation, and Segment and Amplitude ownership (working with end users and creating products for them). It also involves collaboration with the CS team on things like Automate Prime Users Tagging or specific integrations (Zendesk).
  • Data Engineer: people who specialize in writing high-quality, production-ready code. They are responsible for the enhancement and maintenance of the Data Lake, ETL pipelines, and Data Infrastructure (Segment/Amplitude), the migration of databases and tables, maintaining production applications, and providing infrastructure for Machine Learning. A lot of coding + SQL + domain knowledge about data is needed.
  • Data Analyst: a very diverse role; I’ll provide some archetypes.

A person working closely with product managers and stakeholders, answering questions about what happened, why it happened, what will happen, and what we can do to prevent, improve, or achieve something; this usually requires visualization, storytelling, and presentation skills.

A person wrangling the data for finance is also a data analyst.

SQL + minor to moderate coding + knowledge of math/stats + heavy domain knowledge

  • Data Scientist: I view this role as the final step in the evolution of Data Analysts, the all-around people. Data Scientists can do everything that Data Analysts can, but better: better coding, more profound domain knowledge (in a specific field), and they might have an additional focus on modelling (and better stats/math as a result). Modelling is just a tool that helps to answer the question of what will happen or why something happened, and helps to find dependencies using a more complex set of self-derived rules.
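One way to make the Domain-Dev-Math picture concrete is to treat each role as a point on a simplex: three non-negative weights that sum to 1. The sketch below is only an illustration of that idea. `RoleProfile` is a hypothetical helper, and the 0 to 5 scores are made-up values loosely following the skill mixes described above (for example, “a lot of coding” for Data Engineers and “heavy domain knowledge” for Data Analysts); they are not measurements of any real team.

```python
from dataclasses import dataclass

DIMENSIONS = ("domain", "dev", "math")


@dataclass(frozen=True)
class RoleProfile:
    """Hypothetical container for raw skill scores on three dimensions."""

    name: str
    domain: float
    dev: float
    math: float

    def normalized(self) -> tuple:
        """Scale the raw scores so they sum to 1 (a point on the simplex)."""
        total = self.domain + self.dev + self.math
        if total <= 0:
            raise ValueError("at least one score must be positive")
        return (self.domain / total, self.dev / total, self.math / total)

    def dominant(self) -> str:
        """Return the dimension with the largest normalized weight."""
        weights = self.normalized()
        return DIMENSIONS[weights.index(max(weights))]


# Illustrative scores only (0-5 scale); these numbers are assumptions.
ROLES = [
    RoleProfile("Data Engineer", domain=2, dev=5, math=1),
    RoleProfile("Data Analyst", domain=5, dev=2, math=3),
    RoleProfile("Data Scientist", domain=5, dev=4, math=4),
]

for role in ROLES:
    d, v, m = role.normalized()
    print(f"{role.name:15s} domain={d:.2f} dev={v:.2f} math={m:.2f} "
          f"-> strongest: {role.dominant()}")
```

Note that normalizing hides absolute skill level: in this toy data the Data Scientist scores at least as high as the Data Analyst on every raw dimension, yet ends up with a flatter profile on the simplex. That is one reason three dimensions alone are not precise enough.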
