The 3 Data Scientists

The Analyst, the Automator, the Augmentator


Why 3 Data Scientists

Data scientists are among the most sought-after professionals; someone even called them sexy. There has certainly been a lot of talk about them, and quite a bit of confusion too.

Drew Conway did a very good job of defining them back in 2013: a data scientist is someone with hacking skills, math and statistics knowledge, and substantive expertise.

Drew Conway's definition still holds but, as data scientists become more widespread, the jobs done under that title vary significantly.

I think a further segmentation of Data Scientists, based on what they actually do, could benefit both new data scientists and the recruiters looking for them.

In my experience, there are 3 types of Data Scientists:

  • The Analyst
  • The Automator
  • The Augmentator

The Analyst

Analysts find insights in data.

Photo credit: Caleb George

The Analyst may well be a former Business Analyst, proficient in Excel, who heard that Data Scientists earn more and changed their job title accordingly. On the other hand, the Analyst can be someone with machine learning and statistical skills, coupled with some programming and database proficiency, who can find their way around complex datasets and extract meaning from them.

The output of an Analyst is often a set of insights that management uses to make decisions. Analysts can also produce outputs that feed operational activities, for example a list of customers who are likely to respond to a promotion.

Analysts are the most widespread type of Data Scientist and will remain so, since almost every company in every industry may need one.

In the long-standing debate over whether industry-specific experience or technical skills matter more for a Data Scientist, Analysts are, IMHO, the type for whom industry-specific experience pays off the most.

Example

An Analyst may be someone who works at a retailer on loyalty card data, finding relevant customer segments, giving hints on product mix, forecasting stock needs, etc.

Key capabilities

  • Able to present results
  • Understands the driving forces of the business
  • Analytical skills
  • May know the fanciest algorithms, although a good knowledge of simple techniques is enough in many cases

The Automator

Automators automate human tasks.

Photo credit: Fabricio Marques

The Automator develops machine learning algorithms that work in production environments. Automators can be software engineers with machine learning knowledge or machine learning practitioners with programming skills. They may be required to have a broad set of skills (e.g. web servers, databases, basic sysadmin) or may be specialists in a larger team.

Their knowledge of machine learning is more focused on algorithms that scale well and/or have low response times.

They are more likely to be found in companies with Big Data, such as web companies (e.g. ad targeting, e-commerce) and financial services firms (e.g. real-time fraud detection, trading).

Example

An Automator may be someone who develops a recommender system for an e-commerce website, like Amazon’s “You may also like”.

Key capabilities

  • Programming skills
  • Knowledge of algorithms that scale well (e.g. linear classifiers) or that do most of their computation at training time (e.g. deep learning)
  • May need to know Big Data frameworks, such as Hadoop or Spark
  • Able to set up experiments to improve algorithms without talking directly with the user (e.g. A/B testing)
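
To make the last point concrete, here is a minimal sketch of how the result of an A/B test could be checked with a two-proportion z-test. The function name and the conversion counts are hypothetical and only serve as illustration; a real experiment would also need a sample size decided in advance and other safeguards.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    Returns (z, p_value), where p_value is two-sided.
    All names and inputs here are illustrative assumptions.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the hypothesis that A and B are equal.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; cannot compute z")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 1000 users per variant, 100 vs 150 conversions.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests that the new variant of the algorithm really behaves differently from the old one, without anyone having to ask the users directly.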

The Augmentator

Augmentators augment people's capabilities, improving their performance.

Photo credit: Vladimir Kudinov

The Augmentator develops intelligent systems that are used by people in their daily tasks.

Augmentators are similar to Automators in terms of skills, but they also need better people skills to understand their users. Since their output affects people's daily routines, they must find solutions that facilitate behavioural change and create user trust (e.g. algorithms that explain the reason behind the output). It is also harder to measure the efficacy of these intelligent systems, since it depends on how users act on the outputs, so direct interaction with users is essential to assess and improve them.
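
As a rough sketch of what an algorithm that explains the reason behind its output could look like, the snippet below scores a recommendation with a simple linear model and returns the features that contributed the most. The feature names and weights are made up for illustration, and a linear score is just one of many possible approaches.

```python
def explain_score(weights, features, top_k=2):
    """Score an item and report the features that drove the score.

    weights and features are dicts keyed by feature name (hypothetical).
    Returns (score, top), where top lists the top_k (name, contribution)
    pairs ordered by absolute contribution.
    """
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    score = sum(contributions.values())
    top = sorted(
        contributions.items(), key=lambda kv: abs(kv[1]), reverse=True
    )[:top_k]
    return score, top


# Hypothetical weights and features for one customer/product pair.
weights = {
    "bought_similar_item": 2.0,
    "in_stock_nearby": 0.5,
    "price_above_usual": -1.0,
}
features = {
    "bought_similar_item": 1,
    "in_stock_nearby": 1,
    "price_above_usual": 1,
}

score, reasons = explain_score(weights, features)
print(f"score = {score}")
for name, contribution in reasons:
    print(f"  {name}: {contribution:+.1f}")
```

Showing the main reasons next to each suggestion gives the user something to agree or disagree with, which is one way to build the trust mentioned above.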

Augmentators are the smallest group of Data Scientists. I expect demand for them to rise in the future, pushed by traditional industries looking to increase the productivity of their workforce.

Example

An Augmentator may be someone who develops a recommender system that is used by shop assistants in physical retail stores.

Key capabilities

  • Most of what applies to the Automator applies to the Augmentator as well
  • May have less data, and thus use different software frameworks
  • People skills
  • Closer to the business, with a more cross-functional approach

So, which kind of data scientist are you? What is your view on this topic?