The perfect MLOps team

Raphaël Hoogvliets
Published in Marvelous MLOps
Aug 16, 2023

“MLOps? Yeah, yeah we’re doing that now. Meghan is our MLOps girl, she’s running that stuff.” As data scientists and MLOps engineers we would all like to have a fully fledged MLOps team that is perfectly integrated with the teams surrounding it. So what does a good MLOps team look like then? As a kid I would spend hours and hours with my friends doodling the perfect 11-man team for football. Romario here, Bergkamp there. And I have to admit to you, I still do it today for NBA basketball. My friends and I have heated discussions in our WhatsApp group about who deserves what position in “the best team of all time” or what positional system they should play. In this article I will try to do the same for MLOps. There won’t be a selection of the greatest engineers and product managers ever. However, we will discuss the positions and system needed. Let me know in the comments what you think! Heated discussions are welcome :)

But first: why you should want a good MLOps team

Unfortunately, in many organisations MLOps still depends on only a handful of people, sometimes even just 1 or 2! In larger organisations where ML is done at scale, the teams grow in size. There might even be multiple teams dealing with machine learning operations, because having the machine learning solutions up and running is crucial to the business’ core operations. They are, as we say, mission critical. In the world of data, where we have come from building ETL pipelines with data engineering to the current world of ML and AI, this requires a bit of a mindset shift. When I started out in the field, a failed data pipeline would mean “the dashboard didn’t refresh”. We could usually debug some code in a fairly stress-free setting. And if we didn’t manage to fix it by the end of the day, there was always tomorrow.

In the current age of machine learning and AI, where pipelines are often directly tied into business operations, the stakes have been raised. A failed ML pipeline might mean the website goes down, e-mails can’t be sent, products can’t be delivered, or large workforces at distribution centers sit idle. It might lead to waste, loss of profit, stress. And in the worst cases: safety, security and health risks.

Data flows are hardly ever perfect, and ML solutions are often complex and prone to bugs. And overall, in our industry we have a lot to learn about software engineering, testing code and building robust IT solutions. So shit is going to hit the fan, sooner or later. It is pretty much guaranteed… the question is how bad it will be, how often it will happen, and whether you will be able to fix it within an acceptable lead time. The answer is: having a good MLOps team in place. Do not let it all rest on the shoulders of our girl Meghan (she might need a holiday too at some point; can your business even keep running without her?).

So what does the perfect MLOps team look like?

Strategic/executive level

Every successful ML project starts with buy-in from the strategic and/or executive level. If you don’t have it, don’t bother trying (fight me in the comments ;), because you will not be successful in the long run. So having a sponsor and maybe even a direct reporting line at the highest level is key to the success of your project(s). This is because data science and MLOps are often cross-departmental, require significant investments and tend to fundamentally change (parts of) “How we do things around here”. These transformational objectives require support from the big bosses. So treat them as an (extended) part of your team with regular updates and short feedback loops. Getting the green light for an ML/MLOps project and then delivering output months later won’t cut it.

Evangelist

So how did you get to the strategic/executive level? That’s right, your team had a great evangelist. The evangelists are often forgotten, which is why I am putting them high on this list. Data scientists and engineers have wonderful ideas, but it is the evangelists who get them traction and make sure resources are allocated to the teams. So they are important at the start, but also in the middle and after a project. When things get hard during the execution phase, the evangelist makes sure life support is not cut and prevents the project from flatlining. After the project is delivered, they are the enthusiasts who fan the flames of adoption! And together with technical experts from the data side they can be a great one-two punch, constantly educating the end users on something that might seem like a black box at first. This is not a specific role you would hire for. Your typical evangelist could hide under any type of job title, but most likely they have some of that Steve Ballmer type of energy.

Outside-to-inside person

This role is the linking pin between the technical team and the business stakeholders. The person executing it should have a primary focus on the business! Keeping business stakeholders and sponsors informed and happy will ensure sustainable appreciation and impact of ML and MLOps initiatives. Successful people in this role will make great slide decks, keep a stakeholder matrix by their bedside, be skilled meeting strategists, and have many coffees and meetings around the organisation to keep all the noses pointed in the same direction. A key priority for this role should be having clear project milestones (strategic roadmap) supported by the right people in the organisation (stakeholder management). The outside-to-inside person ideally comes from a business background and has high likeability. The risk with this role is that a lot of tech people get promoted into it and eventually get frustrated with the slowness and stickiness of organisations: “they just don’t get it”. Beware of disgruntled developer syndrome; patience is a key skill here! This role could be executed by either one person or a bigger team. In different organisations the role will have different names, any of these or even others:

  • product manager
  • business consultant
  • program manager
  • analytics translator
  • business process owner / business process specialist

Inside-to-outside person

Where the outside-to-inside person is the linking pin from the business side, the inside-to-outside person does the same coming from the side of the technical team. Together they can be a dynamic duo that ensures the often mentioned “gap between business and tech” is closed. Many organisations make the mistake of combining these two roles into one, which is too complex and has a conflict of interest built in by design. The inside-to-outside person should be the champion and shepherd of the development team, pushing back on business wishes and pressures, while the outside-to-inside person is the business’ best friend. Key priorities should be prioritising and refining work for the development team on the micro-level (backlog) and managing and updating developer epics on the macro-level (technical roadmap). This role often pops up under the name of product owner or project manager, but is sometimes also a part of a wider ML manager role. Lately there seems to be a shift towards a more hands-on product ownership, where the inside-to-outside person is still also coding and reviewing. This is a great and necessary change! But it is only possible if there is good support from an outside-to-inside person. The inside-to-outside person often gets one of the following names:

  • product owner
  • project manager
  • data science manager
  • requirements engineer

Tech lead

In an ideal world the tech lead is the guru with the respect of all the devs, the experienced senior or principal engineer who has seen it all and knows it all. These gurus are a rare breed though, and there is a shortage of good seniors in the market. The lead should be experienced, yes, but I think it is just as important that the tech lead is a strong communicator and culture setter. The tech lead should know the capabilities, hopes and dreams of their team well. Together with the people working on the horizontal levels (for example chapter leads and learning & development managers) the tech lead would be wise to create a culture of continuous learning. Dev culture can be meritocratic, and this can be fine, as long as everyone builds each other up. So “Pffft, this guy didn’t even know about X in the CLI” is a red flag and alternatively “Owww, but you could also just fix this from the CLI, let me show you if you want!” is the green flag you want. I believe this starts at the top! If your tech leads adopt this attitude, while also always being open to learning new things from others, your team will flourish. Key responsibilities: solution design, quality assurance, culture setting, standard operating procedures, team learning & development, problem solving. Ideally the tech lead also has business acumen, but the position can be executed without it if the tech lead has strong support on the wings from the other roles.

The engineers

In an MLOps team you might work with different types of engineers, such as those listed below. We will discuss the different flavours of engineers, their skillsets and responsibilities in a future article. So stay tuned!

  • MLOps engineers
  • ML engineers
  • Infrastructure / Cloud / DevOps engineers
  • BI / ETL / Analytics engineers
  • Architects

Middle Management (is dead)

I don’t believe in middle management for data science and MLOps. The outside-to-inside person, the inside-to-outside person and the tech lead are the leaders here. Data science solutions can have such a profound impact on business operations and strategic approach that I believe teams should work close to or under the strategic and executive level. Of course this does not always scale well in larger organisations and an extra layer of direct reporting might be needed to keep the governance taxonomy workable. However, I believe when doing tech in a non-tech company you will want to keep the expertise close to the strategic and executive levels to ensure more streamlined decision making. Getting clout with the big bosses will not only lead to more efficient processes, but it will also empower your employees to be more autonomous and responsible, leading to higher job satisfaction and greater motivation to excel. And from a boardroom perspective: CEOs wanting to “transform their company with AI” better start talking to their experts. There are a lot of meaningless slide decks out there. Be ready to challenge your C-suite to invest time, blood, sweat and tears into the “AI revolution”, because it’s coming.

Beyond the labels

These roles will have different names in different organisations. It is important to be able to see through the fancy job titles and have a grasp of what different roles there are in our industry. The exact titles do not matter that much, though I would always advocate for transparency and understandability!

A common pitfall is that many organisations try to combine multiple roles into one. I have seen people in the role of Data science manager / Lead developer / Product owner, and yes, I too have been guilty of this. In my experience it does not work well in the long run. It’s better to split up some of the responsibilities if scale allows it. Of course in small organisations not all roles will be present and 1 product owner, 2 data scientists and a valuable business case will be all you need / can afford!

Another very important part of team building is the role-talent fit within your team. And getting the team together is just the first step. You need a delivery model, you need to set up processes, ensure team performance and happiness, and then some. We will discuss these in future articles. Happy team building!


