Bridging the Gap Between Research and Development

An Onfido Biometrics Team Story

Machine Learning startups often suffer from a chasm between engineering and research. Onfido is no exception. In this story, I’ll take you through the journey of the Biometrics team towards being truly cross-functional.

Symptoms of non-integrated teams

When I first started at Onfido, almost two years ago, the research function was completely separate from the engineering function. It sat outside cross-functional teams, and had its own leadership and its own objectives.

This led to various pain points felt across the business:

  • Machine learning researchers felt they spent a lot of time productionising their code, something they neither specialised in nor enjoyed. They had many dependencies on DevOps and other engineers outside their function, which slowed down their progress.
  • The engineering team complained about the production readiness of the algorithms, which often came without tests and could not scale. It didn’t help that backend engineers didn’t really do Python.
  • The business had no visibility of what was coming down the pipe, no understanding of how long a project would take and was generally frustrated with the lack of visible progress.

These are themes I have seen in other early-stage, machine-learning-based product companies that operate without integrated teams. These tensions can fester and degrade trust between functions, erode any remaining empathy and destroy the reputation of entire functions within the business.

How we integrated the teams

Joining as product manager, I was tasked with overseeing a brand new Biometrics Line of Business. I deployed the processes I had previously used to break down barriers of communication and build empathy: cross-functional teams and shared objectives.

The team started out as one product manager, one team lead, three Ruby/Elixir developers and one machine learning researcher. While Product and Research were in London, the engineers were in Lisbon.

The evolution of the team. Disappearing faces = promotions, moves to other teams, and end of internships. Fortunate to not have had anyone quit yet 🙏

Relationship building across functions

The first step was to build a relationship with the machine learning researcher, who at this point still considered himself to be a part of the research team, and just happened to work on biometrics algorithms.

I worked with him to understand his vision, the problem space and what excited him. He helped me understand what was possible now, vs what would take a long time to experiment with. We evaluated the market for similar offerings and potential providers, and weighed build vs buy decisions.

We compiled a list of algorithms to explore and prioritised them together, consciously diversifying our portfolio of bets, so that a good number of lower-risk, short-term algorithms were balanced with bigger, riskier initiatives.

We formed a partnership, just as a PM would with their engineering lead. It helped that this machine learning researcher is quite commercially-minded and customer-centric, but these are traits that can be taught. The important thing is to build the partnership.

Aligning our goals

The whole team wrote quarterly Objectives and Key Results (OKRs) together, making them outcome-focused rather than output-focused as much as possible. That is to say: “move metric x%”, rather than “ship this feature”.

Outcome-focused OKRs allow engineers and machine learning researchers to work together towards a goal with measurable business impact, without being prescriptive about how to achieve it. This allowed the researchers to experiment with various algorithms during the quarter: if one didn’t work, they could abandon it and try another way to solve the problem.

Each quarter, my objectives are around discovering the needs of a particular market, and defining whether there are any valuable problems to be solved in that space. Sharing these learnings directly with machine learning researchers helped me discover what was feasible and where we could achieve breakthroughs and innovation ahead of the market.

Resolving tension

While writing OKRs together aligned our goals for the quarter, it didn’t completely resolve tensions between engineering and research. By this point, the only machine learning researcher in biometrics had hired multiple people who reported to him, and wanted to create an identity as the Biometrics Research team, further separating them from the Biometrics (Engineering) team.

A few things helped bring the teams closer together and eventually led to the creation of a fully cross-functional team:

  • Renaming our “Team Lead” to “Engineering Lead”: We needed to recognise that if we were to merge the teams there could be no single team lead, but rather a trio of leads: product manager, engineering lead and research lead. The lead roles denote line management responsibility as well as architectural and strategic decision-making power within their functions.
  • Socialising together: The two functions of engineering and research were in two different countries, so flying the researchers to Lisbon for a whole week really helped break down barriers in communication and build friendship and empathy between the two functions. It brought us together and people started feeling part of a single team. It also brought us many Pastéis de Nata (Portuguese custard tarts) and tasty Portuguese Cozido.
  • Adapting Scrum ceremonies and iterating on process: The nature of work for engineers and researchers is wildly different, and Scrum simply doesn’t cut it.

Team lunch at Gunpowder, London.

Adapted Scrum ceremonies for teams with machine learning researchers

Engineering work is typically well defined and certain. So much so that whole methodologies have been built to help measure and predict the output or velocity of software teams. The most popular in the startup world is Agile and its different flavours such as Scrum and Kanban. While we started on a strict-ish Scrum diet, we quickly ran into problems.

In contrast, research work deals with many unknowns. It often starts with a feasibility study to figure out whether something is realistic and possible at all. This takes multiple experiments, and it can take a long time to deliver presentable outcomes.

The researchers’ updates were often “my experiment is still running”, or “yup, still reading papers”. If they described in more detail what they were doing, the engineers would stare blankly due to a lack of machine learning expertise. Their tickets also had very high estimates and kept carrying over across multiple sprints. Both of these things frustrated the researchers. They felt they weren’t able to give meaty updates and be proud of their progress.

The researchers would often not understand what the engineers were talking about either. They were less involved and interested in the wider platform architecture into which their models would eventually be integrated.

It got so bad that researchers started skipping stand up because they didn’t find it valuable, further creating team dysfunction.

The changes that helped:

  • Friday recap: Instead of joining the stand up (formally Daily Scrum in Scrum) every day, researchers would join every other day, and eventually only on Friday, when they would give a longer update on what they had achieved that week. This allowed them more time to experiment and gave them more airtime to describe the approach and the context of their work. Engineers also gave an update on that week’s progress and contextualised their projects and architecture decisions.
  • Stand up Summary on Slack: At the end of every stand up, I write a summary of what has happened and what people are focusing on today. I @mention anyone from research where relevant, such as when there is progress on integrating their algorithms or when their input is needed to unblock the team. This has helped the researchers stay in the loop.
  • Algorithm chat: In a dedicated session, each researcher explained what they were working on, how their algorithm worked (or didn’t yet), their approach and where they were at with it. It included some basic upskilling for non-machine-learning people and helped level the playing field and establish some common language.
  • Shared Backlog Refinement and Sprint Planning: This is not a change per se. It’s important to highlight that the whole team joined backlog refinement sessions and sprint planning, as it helped us align goals for the sprint, link them to our OKRs and co-create a path from completing algorithm research to productionising it, rolling it out in stages and going live for everyone.
  • Unestimated Research Tickets: We found that estimates for research tasks didn’t actually help us predict when the work would be done. We decided to drop points entirely for researchers, but keep the tickets in the sprint as a way to spark conversations during Friday Recap.
  • Hire the bridge: A key hire for the team was our Python engineer, who bridged the gap between the researchers’ Python code and our Ruby and Elixir back-end engineers. The role was instrumental in defining how we go from academic-style code to production-grade, scalable code.

Celebrating successes, even when we are remote. Link to tweet.

Closing comments

Today the Biometrics Team is as cohesive as ever. We have since welcomed two new functions to our team: Service Management/Data Analysis in London, and Test in Lisbon, where our Test Engineer has started supporting us full time rather than being shared with other teams.

We celebrate successes together over Slack and video conferencing, congratulating each other for great work and learning from our less successful projects as a team. Product visits Lisbon once a quarter. Research and Service go to Lisbon every six months. Engineering and Test come to London at least twice a year. We keep hanging out, learning from each other and iterating on our processes.

What a fun journey it’s been so far!

Stand up over Zoom. Someone must have said something funny. 🤓

You can read a software developer’s view of this story by Daniel Serrano (3 min read). It was written about a year ago, so not all of the above changes had been implemented by then.

PS: Finding it hilarious how I’ve gone through four different haircuts in all these photos. 💅