Maintaining confidence while scaling an Engineering team

I’ve been the CTO of Splice for more than 5 years now, from early prototypes to today’s 100+ employees. When Steve and I started Splice, I never thought we would get this big. What I didn’t know was that one of the challenges of being a CTO post product-market fit would be to build internal and external confidence around the engineering group. I’m not talking about making sure we use the right technology, or that we have the best engineers. I’m talking about our ability to function as a critical part of a complex system.

There are a decent number of books, events, talks and coaches for CEOs, but we hear very little about the struggles of being a CTO. I get it: talking about mistakes we made learning on the job isn’t glamorous. It’s actually quite embarrassing and painful to think about. But part of this exercise is therapeutic, and the other part will hopefully help current or future CTOs. So here is one very valuable lesson I’ve learned the hard way:

Set expectations and put together an evaluation framework

This probably sounds painful, unnecessary and like something that might slow you down. Yet I so wish I had done it a few years back; it would have saved me from so many mistakes. Much the same way a test suite guards against regressions, an evaluation framework can make sure we aren’t backsliding on our responsibilities. Writing tests helps us ensure the proper behavior of a system. But more importantly, it builds confidence in what we ship and the process around it.

As CTO, one of my responsibilities is to build confidence in the engineering org. We wouldn’t ship code without tests, yet we might ship an entire org without an objective way to define and evaluate whether it matches our expectations. One very important detail about this point is that engineering isn’t the sole stakeholder of this initiative. Confidence must be provided to the rest of the organization to keep it healthy. It doesn’t matter if you feel great about your part of the org when the rest of the company doesn’t feel the same. We also owe that to the CEO, who has to justify the speed and quality of the org and, for that, needs to be able to trust that the team can deliver or handle hard situations.

To build confidence, measure and monitor these 3 areas:

  1. Velocity
  2. Quality
  3. Organization Maturity

Velocity

Most startups that grow decently fast start feeling they are getting slower. New employees, new processes, management and paying back technical debt are all well-known reasons for slower team throughput. But how do we determine if we are slow or fast? Having a baseline and a way to measure is critical to building confidence. I can’t stress enough the need for a way to measure the throughput of the team. The metrics you pick won’t be perfect, and they probably won’t even be great, but you need this baseline to help build confidence, detect issues and celebrate progress. The metrics you pick will depend on your development style, from story points of delivered stories to development flow metrics such as the number of deployments per day.

As much as I hated on agile story points for years, and as deeply flawed as they are, this approach is still one of the best compromises I know of. Alternatively, measuring the speed at which code is shipped isn’t a bad idea. Consider the following metrics:

  • the number of pull requests opened per week
  • the number of pull requests merged per week
  • the average time-to-merge (or % of pull requests (PRs) merged under a certain threshold)
  • the number of production deploys

Those could give you a sense of the constant throughput of the engineering team. If that number stagnates as you hire more, there might be a problem related to a new process, a lack of investment in infrastructure or technical debt that needs to be addressed. However, if it increases too quickly you might have a quality issue. Don’t forget that measuring the speed of a team without evaluating the quality of the work is extremely dangerous.
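As a rough illustration, here is a minimal Python sketch of how the pull request metrics above could be computed for one week. Everything in it is an assumption made for the example: the `PullRequest` record, the `weekly_velocity` function and the 2-day threshold are hypothetical, and in practice the data would come from whatever your code host lets you export. Production deploys are left out since they usually live in a different system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical PR record, e.g. built from an export of your code host."""
    opened_at: datetime
    merged_at: Optional[datetime] = None  # None: still open, or closed unmerged


def weekly_velocity(prs, week_start, threshold=timedelta(days=2)):
    """Summarize PR throughput for the 7 days starting at week_start.

    The 2-day threshold is an arbitrary placeholder, not a recommendation.
    """
    week_end = week_start + timedelta(days=7)
    opened = [p for p in prs if week_start <= p.opened_at < week_end]
    merged = [
        p for p in prs
        if p.merged_at is not None and week_start <= p.merged_at < week_end
    ]
    # Time-to-merge for every PR merged during the week.
    durations = [p.merged_at - p.opened_at for p in merged]
    if durations:
        avg_hours = mean(d.total_seconds() for d in durations) / 3600
        share_under = sum(d <= threshold for d in durations) / len(durations)
    else:
        avg_hours = share_under = None
    return {
        "opened": len(opened),
        "merged": len(merged),
        "avg_hours_to_merge": avg_hours,
        "share_merged_under_threshold": share_under,
    }
```

Running this every week and keeping the results gives you the baseline. The trend matters far more than any single number.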

Quality (confidence in the code base)

As a team, we should aim for a good balance between speed and quality. Aiming at 100% test coverage and perfect APIs, data models and documentation isn’t quite rational when we also need to ship at a decent velocity. However, regressions should be exceptional, churn in our code base should be reduced to a minimum, new team members should quickly find their way in the code base and new experiments should be launched without fear of taking down the entire system. Quality isn’t a goal in and of itself. What matters is the confidence that we can grow and change behaviors in a safe manner. Some of the metrics to consider:

  • Test coverage ratio (no need for 100% coverage but knowing where you stand helps a lot)
  • % of pull requests that break the build or fail the test suite
  • % of merged vs rejected PRs
  • Number of comments per PR (you don’t want the number to be too low, but you don’t want it too high either)
  • Number of bugs found (or bug fixes vs features shipped)
  • “Turbulence”: the churn/complexity ratio in the code base. If the same code area keeps changing over time, it’s usually a symptom of a technical challenge such as a bad abstraction, poor implementation or a tech/product/business misalignment
  • How outdated the dependencies used in your code base are

Technical debt is normal and quality is quite subjective, but as a team you can define a set of objective criteria that give you a limited perspective on the quality of the team’s work. It will also help define technical values, which are crucial as the team grows. Like measuring velocity, measuring quality is super flawed, but remember that it’s also super important for engineering and for the rest of the company.
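To make the churn idea more concrete, here is a minimal Python sketch that counts how often each file changed, assuming you feed it the text printed by `git log --name-only --pretty=format:`. The function names, the 90-day window and the `min_changes` cutoff are all hypothetical choices for the example. It only covers the churn half of the ratio; weighing it against complexity would need a separate complexity measure.

```python
from collections import Counter


def churn_by_file(git_log_output):
    """Count how many commits touched each file.

    Expects the text printed by:
        git log --since="90 days ago" --name-only --pretty=format:
    which lists the files of each commit, one path per line, with blank
    lines between commits. The 90-day window is an arbitrary choice.
    """
    paths = (line.strip() for line in git_log_output.splitlines())
    return Counter(path for path in paths if path)


def hotspots(git_log_output, min_changes=3):
    """Return (path, change_count) pairs at or above min_changes, busiest first."""
    counts = churn_by_file(git_log_output)
    return [(p, n) for p, n in counts.most_common() if n >= min_changes]
```

A file that keeps showing up at the top of this list isn’t necessarily bad code, but it’s a good place to start asking questions.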

Organization maturity

This is the part I was trying to avoid early on as a founder. You remember that dream of running a tiny, flat team of great engineers not needing much maintenance? It might work at the beginning, but it isn’t sustainable, and we owe it to the team to mature the org as we grow the company. I don’t yet have a good set of metrics since that’s the area I’m still the least comfortable with, but here are some ideas:

Hiring / retention

Do we have hiring goals? Salary brackets? A good hiring process, a good on-boarding process? How about voluntary and involuntary departures? Ideally, your people org (HR) should support you with that, but in some cases you are growing the team before you have a people org in place.

Ratio of Individual Contributors (ICs) per manager

This is a simple metric to start measuring: define a target for the number of ICs you want reporting to each manager and keep measuring. Managers focus on IC growth, quality, retention, and watch for trouble spots in the process. If you have too many ICs per manager, those managers probably won’t be as effective, and that decision might cost the org a lot in retention and happiness.
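Tracking this can be as simple as the following Python sketch, assuming you can list who reports to whom. The function names and the shape of `reporting_lines` (a dictionary mapping each manager to a list of their ICs) are hypothetical, and the target is whatever number you set for your own org.

```python
def reports_per_manager(reporting_lines):
    """Map each manager to their number of direct IC reports."""
    return {manager: len(ics) for manager, ics in reporting_lines.items()}


def over_target(reporting_lines, target):
    """Return (manager, count) pairs above the target, most overloaded first."""
    counts = reports_per_manager(reporting_lines)
    over = [(m, n) for m, n in counts.items() if n > target]
    # Sort by descending count, then by name so the order is stable.
    return sorted(over, key=lambda pair: (-pair[1], pair[0]))
```

Checking `over_target` each time the org chart changes turns the target into something you actually revisit rather than a number you picked once.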

Responsibilities, expectations and performance evaluations

This is one of the things that most growing startups struggle with. It often requires experienced managers focusing on the transition from a flat organization to clearly defined expectations and performance evaluations. It’s also a costly thing to do and it comes with serious consequences. However, if the individuals on the team don’t know what they are expected to achieve and when they fall short, they can’t perform at their best. Those expectations also need to be clear to everyone else in the company. By the way, this is something that I wish I had done for myself much earlier. By not putting that in place for my own role, I failed to lead by example, and I would honestly have done my job better with a properly communicated set of responsibilities and performance evaluation.

Collaboration and flexibility with the rest of the company

Engineering is part of the greater picture. If it only operates well within its own org, then it is failing the company as a whole. Empathy, adaptability and creativity in working with the rest of the growing organization are key to a successful company. It is true that tech startups very often have a big focus on engineering, but that doesn’t make us the center of the company. If we aren’t able to communicate and serve others, we aren’t playing our role in the success of the company. This is a big reason why all the engineering metrics we discussed above are so meaningful for the rest of the company. Building trust, collaborating and being flexible in the way we all work towards the same objective is what makes things happen.

One on ones

One on ones are dedicated meetings between two people in the company. This is often an opportunity for a manager and an IC to dig deeper, in a safe place, into how to become more productive, happier or develop one’s career. It’s also a great way to get to know people, detect patterns and build empathy. One on ones should happen regularly and across disciplines. They don’t have to be formal 1:1s; sometimes they are scheduled lunches once a month or a monthly small group hangout to discuss how things are going.

Written communication, information organization

As the team grows, information can’t be shared solely by word of mouth. We need things written down so they last and so we stay efficient. We also need quick access to those things. How does one find the latest metrics? How about that weird piece of software that was written by an early stage employee and that very few people understand? How does one find out who’s on vacation and who’s on call? Where and how do I report a bug, and how can I contribute a test or a bug fix? Those things are simple when you are 20 or fewer; they get way more complicated as you grow.

Conclusion

This is a vast topic and I am only scratching the surface, but this is also the kind of information I wish I could send to myself a few years back. I was lacking a framework of measurement and was evaluating the growth of our org from a gut feeling perspective. As an engineer, I should have known that there was a much better way. Note also that not all those metrics need to be driven or improved at all stages. Once you have an evaluation framework in place, it’s much easier to talk about where and how to focus your energy and to see the result of your investment. Having this framework in place and evaluating the results together builds something extremely powerful: confidence. Self confidence and organization confidence are what drive healthy teams to success, because they can face more complex problems.