Going from Zero to Sixty: Building Lyft’s Self-Driving Software Team
By: Anantha Kancherla, VP Engineering, Level 5
When I joined Level 5 eighteen months ago to lead the core software efforts, the team had five software engineers, five hardware engineers, and one challenge: build a self-driving car. Now when I go to work, I enter an office with 300 engineers and a small but growing fleet of test vehicles. While I’m proud of the velocity of our team’s accomplishment, it took a lot more than just speed. We had to be extremely thoughtful about how we built and nurtured our team.
Developing self-driving technology is one of the most challenging engineering problems of our time. While working on 3D graphics at Microsoft and mobile at Facebook, I learned that for any major engineering challenge you need to build a team that matches its scale and magnitude in order to succeed. Then, even if you manage to hire the ideal team, ensuring productivity scales with the growing team is a challenge of its own.
While we’re still accelerating toward realizing the full potential of autonomous vehicles, we have learned a lot along the journey so far. Here’s how we built our team while simultaneously executing on milestones.
Starting at zero
To catch up with the rapidly moving industry, we had to accomplish two core tasks:
- Build a team capable of developing autonomous vehicles (AVs)
- Start delivering results quickly
Having previously worked at the dawn of other technical waves, I found these tasks familiar (yet as challenging as ever). I knew that for this project, the first hurdle would be acquiring the right talent. This is especially challenging since the full problem has yet to be solved, and few people have extensive experience in this field.
So, we began by acquiring a deep understanding of the product’s components in order to define which skills we’d need to bring to the table. Then, we envisioned the mindsets needed to cultivate a team culture that would sustain the project into the future.
We started off our talent search by mapping the relationships between the different skills we’d need to deliver the final product. A Sense, Plan, Act (SPA) framework was a good place to start understanding the structure of skills needed in the self-driving space. This surfaced the obvious need for experts in artificial intelligence, machine learning, and robotics.
However, when we mapped it out, we found there’s a lot more to it than that. The skill sets actually covered the entire gamut of computer science, as you can see in the diagram below. In addition to artificial intelligence, machine learning, and robotics, we would also need engineers for things like operating systems, safety, auto, graphics, and more.
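As a rough illustration of the Sense, Plan, Act framework mentioned above, here is a minimal sketch of one pass through the loop. Everything in it (the `WorldState` and `Command` types, the two-second gap rule, the function names) is a hypothetical stand-in chosen for illustration, not a description of Level 5’s software.

```python
from dataclasses import dataclass


@dataclass
class WorldState:
    """Hypothetical output of the 'sense' stage."""
    distance_to_obstacle_m: float
    speed_mps: float


@dataclass
class Command:
    """Hypothetical output of the 'plan' stage."""
    throttle: float  # 0.0 to 1.0
    brake: float     # 0.0 to 1.0


def sense(raw_range_m: float, raw_speed_mps: float) -> WorldState:
    # Sense: turn raw readings into a cleaned-up model of the world.
    return WorldState(
        distance_to_obstacle_m=max(raw_range_m, 0.0),
        speed_mps=max(raw_speed_mps, 0.0),
    )


def plan(state: WorldState, safe_gap_s: float = 2.0) -> Command:
    # Plan: brake if the obstacle is closer than the distance the car
    # would cover in `safe_gap_s` seconds at its current speed.
    if state.distance_to_obstacle_m < state.speed_mps * safe_gap_s:
        return Command(throttle=0.0, brake=1.0)
    return Command(throttle=0.3, brake=0.0)


def act(command: Command) -> str:
    # Act: a real system would drive actuators; this sketch only reports.
    return "brake" if command.brake > 0.0 else "accelerate"


def spa_step(raw_range_m: float, raw_speed_mps: float) -> str:
    # One pass through the loop: sense, then plan, then act.
    return act(plan(sense(raw_range_m, raw_speed_mps)))
```

Even in this toy form, each stage hints at a different specialty: sensing leans on perception and machine learning, planning on robotics and AI, and acting on controls and the vehicle platform.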
These domain experts come with completely different mindsets that would ultimately need to deliver one thing: the car. This required agility and alignment.
Agility meant more than writing code quickly. These individuals had to be able to keep up with and adapt to a quickly evolving industry. Alignment meant that we needed people who could willingly support each other, since the software is so tightly coupled. Slight disconnects at the software level can manifest as much larger system-wide issues. For example, if someone changes a compiler flag, it can potentially change how the car drives. To maintain high velocity over the long term, each team member would also need to keep a strong focus on the platform (e.g., APIs, high code quality) so that others could build soundly on top of their work in the future.
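To make the compiler-flag example concrete, here is one plausible mechanism, offered as an assumption since the specific flag isn’t named here. Options that let a compiler reorder floating-point arithmetic (GCC and Clang’s `-ffast-math` is one such option) can change numeric results, because floating-point addition is not associative. The sketch below mimics that reordering by summing the same values in two different orders; the `should_brake` function and its threshold are hypothetical.

```python
def sum_left_to_right(values):
    # Accumulate in the order the values were written.
    total = 0.0
    for v in values:
        total += v
    return total


def sum_right_to_left(values):
    # Same values, opposite order, standing in for a reordering that an
    # optimizing compiler might be allowed to perform.
    total = 0.0
    for v in reversed(values):
        total += v
    return total


def should_brake(risk_score, threshold=0.6):
    # Hypothetical decision rule: brake when the score exceeds the threshold.
    return risk_score > threshold


readings = [0.1, 0.2, 0.3]
left = sum_left_to_right(readings)   # 0.6000000000000001
right = sum_right_to_left(readings)  # 0.6

print(left == right)        # False
print(should_brake(left))   # True
print(should_brake(right))  # False
```

The difference is a single rounding step in the last digit, yet it is enough to flip a threshold decision, which is why changes that look purely local need alignment across teams.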
Gaining speed with a scaling team
With an understanding of our ideal team profile, we set out to grow the Level 5 team. To do this, we:
- Sought out key talent
With most industries, you can start your talent search by looking for individuals who have relevant experience at companies within the same industry. However, since the self-driving industry is relatively new, finding these individuals is challenging (and when you do find them, they’re often in high demand). We are lucky that Level 5 has such a compelling vision and strategy, which helped us attract great talent. Where we couldn’t find a domain expert to fill a specific role, we sourced talent in neighboring disciplines with overlapping skills and cross-trained them.
- Planted several core engineers
Instead of building one sub-team at a time, we decided to start by planting engineers in each of our core areas who were technically deep, able to understand and frame the problems in the area, and able to come up with ideas of how they should build to solve the problems.
- Grew a balanced team
We bet on diverse profiles and matched them with complementary skill sets. With core engineers in place, we surrounded them with people who were high in experience but low in expertise (i.e., experienced coders who know how to ship great products), and people who were high in expertise but low in experience (i.e., PhDs who might not have shipped anything yet, but know the latest and greatest engineering techniques).
- Split the growing teams
As these teams and technologies grew, we split them rather than reconfiguring them. Our software team became our car and cloud teams, the car team became the platform and autonomy teams, the autonomy team became our planning and perception teams, and so on. We kept splitting the teams until they matched the structure we had first envisioned to support our mission. This allowed us to maintain consistency for team members who were working on common projects, and keep communications strong where dependencies were highest.
- Launched remote offices
The world holds a lot of widely dispersed talent, so we purposefully broadened our talent search beyond Palo Alto. We quickly launched our Munich office, where we gathered an impressive pool of SLAM (Simultaneous Localization And Mapping) talent. With the acquisition of Blue Vision Labs in London, we also gained a skilled team with unique expertise in Visual SLAM.
Iterating team processes to keep velocity high
Executing an engineering project of this scale requires more than the ideal team structure. You can’t whiteboard with a team of 100 like you can with a team of 10, and an email thread quickly becomes chaos with too many voices jumping in. While this extra noise typically causes teams to slow down as they grow, we had to equip our team to keep delivery velocity high in the midst of intense growth and change.
How did we do this? It’s simple — we embraced it.
Thanks to the foresight of many of us who have launched complex products in the past, we were able to predict key areas where pains typically surface and get ahead of them. Based on inflection points that typically arise with team size, we proactively flexed our timelines, structures, meeting frequencies and styles, and communication channels to meet the needs of the team. The numbers below are not an exact science, but I’ve found they serve as a good rule of thumb.
For example, as teams grew, they still did their daily scrums, but we slowly introduced weekly syncs, bi-weekly syncs, and so on.
Defining best practices for a smooth ride
With talented people and a plan to embrace rapid change in place, we needed to lay some ground rules for moving fast while avoiding errors. I drew inspiration for this from my experience training to run a marathon. After some sweat and sore muscles, here are the lessons I learned that are now core to our workflow:
You don’t run a marathon on day one. You first train and build your muscles.
Iteration and realistic milestones are key to making progress. Our very first car was built by just a handful of people, only worked in the parking lot, and only turned right (we called it the Zoolander car). This approach forced us to grow a strong infrastructure first. As the team grew larger, so did our ambition.
A tiny pain at mile 5 can stop you in your tracks at mile 15.
Every time you go through an iteration, learn what didn’t go right and fix it. We had to do this over and over, investing in quick feedback loops and instrumentation in our code and processes early on.
Choose the best route to keep your run impactful.
Carefully consider what you build and what you buy. With various companies investing in self-driving and providing commercial solutions, we had a number of options to choose from. We decided to build things we considered core IP, and were careful to invest only in tools that would help us debug and move our software forward.
You’ll get farther when you focus on safety as much as speed.
If a test vehicle doesn’t work well, it can lead to serious consequences. We grew the “safety muscle” as a part of our culture through each iteration. To maintain an exceptional level of safety, our teams first learned what it took to operate test vehicles safely, and then got them on the roads frequently to do end-to-end testing.
Invest in quality running shoes early on.
Solidify code from the bottom up. Teams tend to get frustrated when the software they depend on changes. While we’d all love to regularly write the perfect long-term software, the nature of this project makes this difficult, as breakthroughs happen quickly. To alleviate this, we solidified the lower-level software (closer to the hardware) first and stayed very aware of what we wrote for the long term versus where we took the expedient approach to bypass roadblocks.
Fixing bumps in the road
Even with an optimal team, a plan for growth, and a strategy to execute quickly, avoiding tension is nearly impossible with rapid rates of change. Here are some pain points we experienced and how we remedied them:
- Engineers can be religious about code style guides. Different engineering backgrounds tend to bring different mindsets as far as how things should be done.
What worked for us: Disagree but commit. We agreed to have the necessary debates, break down the disagreements, make a decision, and keep moving forward together as a team.
- Communication tools get messy. We found that as the team grew, so did the number of communication channels and project management tools. This led to a scattered workflow, which strained the team.
What worked for us: We created a working group that proposed a solution that eliminated unnecessary tools and defined a standard for how teams should communicate. Dedicating time to solve the problem head-on streamlined our workflow.
- When something goes wrong, it can be hard to pinpoint the cause. With so much going on, it can be easy to circulate stories about the reason something failed, but these can lead to wild goose chases if they’re not factual.
What worked for us: We set a standard of being data-driven in our conversations, and made sure dashboards were in place to help track where hiccups occurred.
- Remote office coordination is challenging. Communication within Palo Alto was hard. You can imagine how introducing new time zones would impact workflow.
What worked for us: First, we got strong site leads in place at our remote offices. Then, we worked to give them independence and ownership of the pieces of the puzzle that they were working on.
Then, last year after 15 months of work, the amazing team built this:
I’m incredibly proud to say that we now have a live employee pilot in Palo Alto where our test vehicles are tackling challenging driving conditions. (They can do a lot more than turn right!)
While we still have ground to cover before our self-driving technology can improve transportation for all of us, these milestones are major steps in getting there. Now that we’re cruising at a fast pace, I can’t wait to share the next one with you.