“A Compendium of Ancient and Modern Geography” — The British Library

Where Successful Data Scientists Sit

Clare Corthell
Oct 1, 2015

I often hear data science leaders pose this question:

Where should my data scientists sit?

People generally agree on two ways to organize a team:

  • Centralized: individuals are on a data science team, working with product teams on a per-project basis
  • Distributed: individuals are on different product/engineering teams

The stakes are high for getting this right: unhappy teams result in poor output, employee departures, and greater difficulty recruiting. The conundrum reappears in coffee meetings with new data science leads, on panels at conferences, and in my meetings with clients.

Neither model has proven inherently superior. LinkedIn has reorganized approximately every 9 months from one model to the other, and smaller companies move between the models as they grow. In practice there seems to be vast indecision — is the centralized or distributed model better?

Tradeoffs

In the centralized model, data scientists all work on one team, more functionally distant from the product. Because they aren’t embedded with the product team, they forgo longer-term involvement with product-specific data. This can hamper their insight and intuition about product opportunities. Implicit in this distance is an increased barrier to interaction between the product team and data scientists.

In the distributed model, data scientists work on different teams, embedded with product and engineering. Because they spend their time with people in other functional roles, they collaborate less with other data scientists. Consequently, they duplicate methods, tools, workflows, and specific solutions of data scientists on other teams.

There are benefits and costs to both. If we’re aware of these tradeoffs, why are we racked with so much indecision?

Indecision is a Pattern

As with most data sciencey things, the answer is in the pattern — companies have moved back and forth between the centralized and distributed models over time. Given these two imperfect options, companies have demonstrated that it is functionally superior to expand and contract the data science team over time.

Why expand and contract?

Data solves problems and problems are everywhere. Data scientists need to move toward problems and distribute solutions across the company. Like a lung moving oxygen, data scientists are physical vehicles for distributing insights across the company, leveraging them in both strategy and application.

Put simply, the role of the data scientist is to create access to data insights across the company. Be it predictive features of a risk model or test results for product variations, these insights create value for both the engineering team and the decisionmakers seeking to understand what is possible in the future.

The Pulmonary Model

Embrace the need for insights to move like oxygen (with its red blood cells, your data scientists) across the company. But you don’t have to reorg every two months. Some teams even use a hybrid structure:

  • Monday, Friday with the data science team (Manager)
  • Tuesday, Wednesday, Thursday with a product/engineering team

Data insights are not only valuable within the product or business team in question, but laterally to other data scientists and upward to the business and its decisionmakers. Insights are oxygen — allow the company to breathe!

Where do your data scientists sit?
