On machine learning team composition

Ruurtjan Pul
Sep 30, 2019 · 6 min read

Getting machine learning off the ground requires many skills and capabilities. Some of these skills are related, some are not. For example, knowledge of math and knowing when to use which machine learning (ML) algorithm have a lot in common, but they are largely unrelated to building infrastructure. That’s why people with ML skills tend to also be competent in math, but not in building infrastructure. Skills cluster, and it’s useful to give these clusters names.

Some of the skills and capabilities required for machine learning

Cluster one: Data scientist

Cluster two: Data engineer

Cluster three: Site reliability engineer

Cluster four: Analytics translator

There are multiple ways of composing teams and dividing responsibilities among them. The most straightforward composition is to group people with the same cluster of skills in the same team.

Expertise oriented teams do not work

Expertise oriented teams

When creating teams of people with similar skill sets, the tasks that each team does naturally follow. The data science team, usually called the data lab, validates ideas and builds prototypes. The product team consists of software and data engineers. They build a software product that contains an ML model. The data engineering team builds data pipelines, and the operations team deploys and maintains everything.

This is problematic: it results in split ownership and many hand-overs, each with a severe loss of knowledge. Lead time also increases drastically, because each hand-over requires another team to refine and plan its part. On top of this, it creates tight coupling between teams, which results in many single points of failure.

Feature teams do not scale

In traditional software engineering, we’ve seen this problem once before. We used to have a front-end team, a back-end team and a database administrators team. Any change to the system required all teams to coordinate. With the lean and agile movements we shifted towards feature teams. Each team is now end-to-end responsible for a feature, and has people from all disciplines.

There are some issues with applying this approach to machine learning teams, though. In addition to the front-end engineer, back-end engineer, user experience expert, designer and site reliability engineer of a typical team, you now also need a data scientist, a data engineer and an analytics translator. You might even need multiple people with the same role in a team. This leads to communication issues within the team, as each person added to a team adds more communication overhead than the previous one. The usual measure to reduce team size is to recruit full-stack engineers, but the inherent complexity of ML prevents us from doing so.
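The growth in communication overhead can be made concrete with a common simplification, which is an assumption here and not something this article measures: treat every pair of team members as one potential communication channel, so a team of n people has n(n-1)/2 channels. A minimal sketch under that assumption, with hypothetical function names:

```python
def communication_channels(team_size: int) -> int:
    """Number of distinct pairs in a team (assumed model: one channel per pair)."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


def added_channels(team_size: int) -> int:
    """Channels added by the person who brings the team to `team_size` members."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    return communication_channels(team_size) - communication_channels(team_size - 1)


if __name__ == "__main__":
    # 5 = the typical feature team roles listed above; 8 = the same team
    # plus a data scientist, a data engineer and an analytics translator.
    for size in (5, 8):
        print(size, communication_channels(size), added_channels(size))
```

Under this model, going from five roles to eight raises the number of channels from 10 to 28, and the eighth person alone adds seven of them.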

A large feature team might be slowed down by this overhead, but not nearly as much as with the expertise oriented approach. If your organization does not yet do ML, start with a feature team.

When you’re further along, and are using ML in multiple teams, you’ll start to see patterns. Some tasks get repeated by each team. For example, every team requires monitoring for its ML models, and a tool to schedule batch jobs. With many teams doing the same mundane tasks over and over, it becomes economical to handle these cross-cutting concerns centrally.

Centralized teams should handle cross-cutting concerns

When moving some part of the work from feature teams to a centralized team, we should be careful not to reintroduce the downsides we’ve seen with expertise oriented teams. We need some principles to protect ourselves.

Principle one: expose products, not tasks

When a centralized team takes over a task from the feature teams, it becomes a bottleneck. People don’t scale, but products do. So instead, expose a product or platform that enables teams to do their tasks themselves, in a more efficient and effective way.

Principle two: opt-in

When feature teams are required to use the product of a centralized team, the incentive to create a good product is reduced. A feature team might also need something slightly different from the standardized product, which leaves it stuck waiting for the centralized team to improve the product. So give teams the freedom to choose whether or not to use the centralized team’s product.

Principle three: self-service

To prevent hand-overs, make the centralized team’s product fully self-service. Provide rich documentation, and tools or APIs to automate everything that previously required communication.

Principle four: require no communication

That brings us to the last principle: require no communication. Don’t get me wrong, by all means be approachable, helpful, and gather feedback to make a product that fits the feature teams’ needs. But if you find them communicating with you in order to understand or use your product, you’re probably not fully applying the other principles yet. Do everything you can to prevent the necessity of communication between the feature teams and the centralized team.

The services that cloud providers offer adhere to these principles very strictly. They expose a product that you can use. There are no hand-overs, and there’s no need to communicate with the team that builds it.

In short: start with feature teams, extract cross-cutting concerns to centralized teams only when they become evident, and think of centralized teams as internal cloud providers.


bigdatarepublic
