Models for integrating data science teams within organizations

A comparative analysis

Pardis Noorzad
Jul 31, 2019 · 16 min read
At our inaugural DS Crit meeting.

The center-of-excellence model

We start with the most centralized of all the models. In the center-of-excellence (CoE) model, also known as the research model, the expectation is that the data science team works independently to identify big bets and build prototypes. Under this model, the data science team is considered to be the company’s innovation arm.

Some misconceptions

There are some misconceptions that lead organizations to choose the CoE model for their data science team.

Drawbacks

There are important drawbacks to having the data science team operate within the CoE model:

Benefits and success scenarios

The CoE model does work for many types of teams. Centralization helps with focus and agency. You should centralize whatever you can clearly encapsulate from the rest of the organization. Centralization works when coupling is low and joint meetings are few and far between.

The accounting model

In the accounting model, also known as the BI model, the data science team produces reports and presentations on a recurring basis (usually monthly and quarterly). The team informs the organization of notable movements in top-level metrics. Once the team identifies an interesting or worrying trend, it works with product teams to investigate the root cause. Thus, quite frequently, playing detective becomes a main activity of the data science team under the accounting model.

Drawbacks

There are three main drawbacks to this model:

Benefits

Reporting on quarterly trends in company metrics is a valuable practice. The centralized position of the BI team allows for a holistic view of the strategic business unit (SBU), which leads to globally optimal decisions that can balance and correct local ones. This work is something the data science team should take on as part of its charter, regardless of the model in practice.

The consultant model

In the consultant model, the data science team is assigned tickets or emailed with questions. Data science managers then prioritize the tickets and questions and assign them to data scientists.

Benefits

In this model, the data science manager overrides any existing data science roadmaps to prioritize the questions and needs of stakeholders. Due to the symmetrical treatment of all members of the team, this model makes managing a data science team easy and cheap.

Drawbacks

There are many drawbacks to this model:

The embedded model

In this model, product teams hire their own data scientists. Each engineering manager is in charge of planning for data scientist headcount, hiring, and allocation. The data scientist within each product team has the engineering team members as their peers.

Benefits

This model brings welcome independence to the teams and relieves the SBU of the management requirements of a centralized data science team. It solves problems with team sizing and communications. It also solves the ownership and motivation issues that exist in fully centralized models.

Drawbacks

While there are reductions in data science management cost, this model has important drawbacks:

The democratic model

In this model, the belief is that easy and straightforward access to data for product managers, designers, engineering managers, and engineers would lessen or remove the need for a data science role. Many attribute the need for data scientists to a lack of proper infrastructure for fast and easy dashboard creation.

Benefits

It is valuable to invest in data infrastructure and tooling that make data access, processing, and visualization available to everyone. This investment is particularly valuable to data scientists, as it frees up time for proactive opportunity sizing, experiment design, metric design, model design, and general improvements in methodology.

Drawbacks

While ensuring everyone has direct and easy access to data is a noble goal, there are some drawbacks to this model:

The product data science model

Between the extremes of the fully centralized model (the CoE model) and the fully decentralized model (the embedded model), there exists a spectrum of hybrid models that take characteristics from each of the aforementioned models. Hybrid models succeed by taking advantage of the strengths of both extremes while actively making up for their deficiencies.

Benefits

a. Clear ownership and actionable insights. One important benefit of the PDS model is clear ownership of projects by the data scientists, due to their membership in the various product teams. Membership in each product team gives data scientists a thorough understanding of that product, its limits, and its potential. This in turn allows a straightforward mapping of analysis to proposals for action. It is difficult to move fast if newly available insight does not map into reasonable and informed actions.

Drawbacks

No model is perfect, and each has its drawbacks. To quote Sinofsky,

Conclusion

Where a single product is under development, I recommend the PDS model as the most efficient and effective way to leverage data for the business.

References

[1] Functional versus Unit Organizations by Steven Sinofsky
[2] Building Data Science Teams by DJ Patil
[3] Where should you put your data scientists by Daniel Tunkelang
[4] How to play well with others by Josh Wills

Acknowledgements

Thanks to Raki Wane, Peter Skomoroch, Sayan Sanyal, Parham Noorzad, Josh Montague, Chris Albon, Josh Silverman, and Harish Krishnan for reviewing and providing valuable feedback.
