Data at DoorDash: Transparent, Ubiquitous, and Still Just Getting Started
DoorDash tracks hundreds of variables to make sure a customer’s food arrives on time and fresh, but the impact of data reaches well beyond the product. The Support team uses it to answer questions more efficiently; Recruiting uses it to improve the candidate experience; the People team uses it to track employee satisfaction and engagement. Datasets are available to every team member every hour of the day, and the company works hard to teach everyone, including non-technical team members, how to use data effectively.
“Data is at our core,” says Jessica Lachs, head of business operations and analytics. “It helps inform our thinking, it helps us prioritize, and it is the foundation of our decision-making.” It’s also a key differentiator. On-demand logistics is a competitive business with tight margins, so even tiny efficiency gains can be the difference between success and failure.
“If you join us, you’ll have the opportunity to be an explorer.”
And yet, as deeply as data is embedded in DoorDash’s culture, its data team is just beginning to take flight (yes, that means they’re hiring). “If you join us, you’ll have the opportunity to be an explorer,” Lachs says. “There are still many open questions to answer. You could have a huge impact immediately.”
Currently, data roles at DoorDash sit in one of three disciplines: business operations and analytics, which includes business and product analytics; engineering, which implements data-driven statistical models to solve problems for the company’s three audiences; and data engineering, which oversees things like the data pipeline, data warehousing, and infrastructure. We chatted with members of each team (below) to find out what they’re working on, where they came from, and who they’re looking for.
Rohan Chopra, Engineering Manager and Resident AI Expert
“We are helping everything go as smoothly as possible, from the time an order is placed to its arrival at your doorstep.”
Hi Rohan. You’re working with data, but from the engineering side, right? Tell us about your background.
Rohan: I’ve been at DoorDash for about two years now. I was working on a Master’s in artificial intelligence at Stanford, which touches on data science, and I had every intention of completing that. But then one of my friends — actually, one of the co-founders here — convinced me otherwise. He said, “Hey, why don’t you drop out and come work at DoorDash?” And here I am. It was definitely the right decision.
How has your role changed over the last two years?
I started out doing a lot of everything, to be totally honest. At first I did infrastructure work, some internal tools, that kind of thing. But I was always very excited about the dispatch problems. I worked a lot on our routing algorithm, which is a big optimization problem where the goal is to maximize Dasher efficiency and minimize late deliveries (“Dashers” is what we call our independent delivery partners). That was incredibly fun. I also worked on some prediction stuff of my own and then, as we hired more engineers, I moved into a manager position.
Are you the main person on the engineering team who’s working with data, or are there several people?
I work with several people who have a data science background. We have healthy discussions about how to solve problems, which I think is super important for data science. Nobody is working in a silo here.
Let’s say I’m a data scientist and I’m interested in DoorDash, what would you want me to know about your team?
Ownership is big. We’re a small team, we’re growing very quickly, and you really get to own problems. The problems usually come from us. We look at the business context, identify a problem, develop a solution, and go implement it. You get to dig into the data yourself. You get to build the models yourself, and also put them into production.
When you own everything end to end, you can have a strong business impact. Plus, the problems we’re solving here affect the real world in a very tangible way. As soon as you make a better model for food prep time, for example, Dashers are more efficient and they make more money. In a small way, you’ve changed a bunch of folks’ lives. I think that’s awesome.
Give us an overview of what your team is working on. What are you responsible for, and what are your goals?
I’m working on the dispatch team right now, which boils down to the execution of deliveries. We are helping everything go as smoothly as possible, from the time an order is placed to its arrival at your doorstep.
We look at data from three sets of users — consumers, Dashers, and merchants — to predict when each step in the delivery process will occur. How do we know what the merchant is going to do? We can’t really speed up their process; all we can do is predict it — using data — and tell the consumer how long it’s going to take.
Can you walk us through the process of fulfilling an order? What data are you collecting at each step?
Sure. When an order is placed by a customer, our first step is to figure out when to place it with the store. That request to the store is the first thing we communicate out, but a lot of factors go into the timing. How long is it going to take the store to prepare the food? How long is it going to take for us to get a Dasher to the store? We don’t want the food to go cold if a Dasher does not arrive immediately, so we don’t want to place the order too early. We obviously don’t want to place it too late, because then it won’t get to the customer in time.
In our model the first step we take is establishing an estimated prep time for the order. To do this we look back. We say, “Okay. This is a burger and shake. How long has this store historically taken to prepare a burger? How about the shake?” Ordering on a Friday at 6PM is very different from a Tuesday at 2PM. We collect and analyze everything.
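The "look back" idea can be illustrated with a minimal sketch. Everything here is hypothetical: the history records, the store and item names, the bucketing by weekday and hour, and the assumption that items in one order are prepared in parallel. It is not DoorDash's actual model, which Rohan says considers around 40 factors.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical history: (store, item, weekday, hour, observed prep minutes).
HISTORY = [
    ("burger_shack", "burger", "Fri", 18, 16.0),
    ("burger_shack", "burger", "Fri", 18, 20.0),
    ("burger_shack", "burger", "Tue", 14, 9.0),
    ("burger_shack", "shake", "Fri", 18, 6.0),
    ("burger_shack", "shake", "Tue", 14, 3.0),
]


def build_prep_table(history):
    """Average observed prep minutes per (store, item, weekday, hour)."""
    buckets = defaultdict(list)
    for store, item, weekday, hour, minutes in history:
        buckets[(store, item, weekday, hour)].append(minutes)
    return {key: mean(values) for key, values in buckets.items()}


def estimate_prep_minutes(table, store, items, weekday, hour, default=15.0):
    """Estimate prep time for an order as its slowest item.

    Assumes items are prepared in parallel; unseen combinations fall back
    to an arbitrary default (both are assumptions of this sketch).
    """
    per_item = [table.get((store, item, weekday, hour), default) for item in items]
    return max(per_item)


table = build_prep_table(HISTORY)
order_items = ["burger", "shake"]
friday_evening = estimate_prep_minutes(table, "burger_shack", order_items, "Fri", 18)
tuesday_afternoon = estimate_prep_minutes(table, "burger_shack", order_items, "Tue", 14)
print(friday_evening, tuesday_afternoon)
```

With this made-up data, the same burger and shake is estimated to take twice as long on Friday at 6PM as on Tuesday at 2PM, which is the kind of difference the historical lookup is meant to capture.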
Our next action is offering the delivery to a Dasher. If we offer the delivery to a Dasher too early, they just wait around. That’s bad for the Dasher, bad for us, and bad for the customer. So it’s very important to get the prep and travel times right. How’s traffic? How’s parking? Maybe they arrive, but there’s a line to pick up the food. All these things add up.
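To make the timing trade-off concrete, here is a small sketch of how a prep estimate and a Dasher travel estimate could be combined. The function name, the inputs, and the rule itself (aim for the Dasher to arrive exactly when the food is ready) are assumptions for illustration only, not a description of DoorDash's dispatch logic.

```python
from datetime import datetime, timedelta


def plan_order(order_time, prep_minutes, travel_minutes):
    """Schedule the store request and the Dasher offer around one pickup time.

    The pickup time is driven by whichever takes longer: preparing the food
    or getting a Dasher to the store. The shorter activity is started late
    so that neither the food nor the Dasher waits (a simplifying assumption).
    """
    prep = timedelta(minutes=prep_minutes)
    travel = timedelta(minutes=travel_minutes)
    pickup_at = order_time + max(prep, travel)
    return {
        "place_with_store_at": pickup_at - prep,
        "offer_to_dasher_at": pickup_at - travel,
        "pickup_at": pickup_at,
    }


order_time = datetime(2017, 3, 3, 18, 0)  # hypothetical order timestamp

# Slow kitchen, nearby Dasher: place the order now, hold the Dasher offer.
slow_kitchen = plan_order(order_time, prep_minutes=18, travel_minutes=10)

# Fast kitchen, distant Dasher: offer the delivery now, hold the store request.
fast_kitchen = plan_order(order_time, prep_minutes=5, travel_minutes=12)

for name, plan in (("slow kitchen", slow_kitchen), ("fast kitchen", fast_kitchen)):
    print(name, {key: value.strftime("%H:%M") for key, value in plan.items()})
```

In practice both inputs are uncertain predictions, so a real system would also have to account for traffic, parking, and lines at the store, as described above.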
After the Dasher picks up the food, she leaves the store. How far away did she park? We collect data on that. Then the Dasher heads to the customer. More travel, more parking, more waiting. Maybe if the Dasher is delivering in Hollywood Hills, the wait time is higher because it’s a gated community. Maybe the customer is in the backyard and can’t get to the door for five minutes.
We have all kinds of signals in the app as well as location data. It’s all about understanding the stages, identifying signals, and then mapping everything out. For future deliveries, we use that information to predict each component and set the Dasher up for success, and also set expectations for the consumer.
At each of those steps, just for fun, can you give a ballpark of how many factors you’re considering?
It varies for each, but I’d say 20 or more. For just prep time it’s closer to 40. The closer we look the more we add.
Can you talk about a particular problem you’ve solved, or something you’ve improved in this process?
Yeah, I can dig deeper into prep time. One of the biggest draws of DoorDash is our unique selection. You can order from almost any restaurant you want. Prep time is a very difficult problem and it’s important we get it right, otherwise we might lose our selection.
We’ve been collecting data for three years now. Preston, in particular, was able to cut that data in a bunch of different ways, which has resulted in improved prep time estimation. The coolest part about it is that it translates directly to our business metrics. It’s so core to our business because when you understand prep time, you start reducing wait time for Dashers, and then they’re so much happier. The whole system becomes more efficient.
You said earlier that you joined as a Jack of all trades, but now you’re specializing. What about people who join the team today? What’s their path?
We try to allow people to explore. You start with the basics, working on different projects, trying different things out within the team. A lot of people actually move between teams as well. Once you take on a specific project you get more business context — the more you learn, the more you can dig into the other problems we’re facing in Engineering.
The fun really starts when you get to the point where you’re just coming up with your own ideas. At this stage, it doesn’t make sense for someone to just be like, “These are the problems and this is how we solve them. Let’s go.” We really need everyone to weigh in with their ideas. I want to hear what my team members think is a problem. What are we not working on that we should be working on? What are they concerned about?
Interested in joining the team? Say hi: email@example.com
Jessica Lachs, Head of BizOps/Analytics and Former Investment Banker
“Members of this team need to be able to communicate clearly and present a strong business case to people without math or engineering backgrounds.”
First off, just give us a lay of the land. Who’s on your team, and what’s your focus?
Jessica: I lead our Business Operations and Analytics team (BizOps for short). We are responsible for all things data at DoorDash. Specifically, we have a few areas of focus including business and product analytics, experimentation, data infrastructure, performance management, and machine learning.
Product analytics includes evaluating product changes as well as determining areas for improvement and innovation. When building a product roadmap, it’s critical that we calculate the impact of different product changes so we can prioritize where we dedicate resources. Once we roll out a new feature, we measure the impact and iterate.
On the business side, we’re asking questions like: “What should we be paying Dashers?” “What should we be charging consumers?” “What is the ideal number of Dashers on the road given the demand that we’re forecasting?” “What does customer retention look like?”
Data also drives a culture of experimentation at DoorDash. My team helps design, execute and evaluate these experiments. In partnership with Hendra’s team we recently rolled out a new experimentation framework to ensure that teams can test and validate their ideas easily and rigorously. This is one example of the data infrastructure work we do. Others include contributing to the data pipeline and building specs for tracking.
There’s also performance management where we track how we’re performing as a company and how we’re measuring up to the goals we set for the quarter or year, as well as monitoring metrics for anomalies.
Last, but definitely not least, we use machine learning to solve problems across the focus areas I just mentioned. Examples include predicting customers who are at risk to churn, improving our quoted delivery time estimates, and determining fraudulent transactions. Those are just a few examples, but there are many more problems that can be solved, or at least better understood, with machine learning. There’s a lot of opportunity to have real impact at DoorDash in this area.
That’s a ton of things! If somebody joined the team right now, what would they work on? How do you set priorities?
I’d say that our number one goal is always improving the customer experience. That could mean making sure deliveries arrive on time, making sure that you’re consistently getting exactly what you ordered, making sure we have the best selection of restaurants on the platform, and much more. There are lots of ways to solve these problems, but we believe a lot of these can be tackled through machine learning.
You obviously need people with strong technical skills, but what are the softer skills you find most valuable for doing this work?
We work with people from all different backgrounds. Analytics, and machine learning in particular, can feel like a black box, so members of BizOps need to be able to communicate clearly and present a strong business case to people who may not have math or engineering backgrounds.
The BizOps team at DoorDash is unusual in the sense that we’re driving a lot of initiatives ourselves. Many projects start with a simple question or observation. In that case, we need to do the analysis, make a recommendation, and get buy-in from other people. That’s why strong communication skills are critical, along with a deep understanding of how our work impacts business fundamentals and how changes can be operationalized. This isn’t the type of work where you sit in the corner by yourself all day.
I’m curious why you said “yes” to working here. Why is this an interesting challenge to you?
I’m a former investment banker. I transitioned to tech after founding a social gifting app while getting an MBA from the University of Pennsylvania Wharton School. When I first joined DoorDash two-and-a-half years ago, I wasn’t particularly passionate about food delivery but I was fascinated by logistics. I thought this big optimization problem would be a fun challenge. I took a circuitous route to where I am now and it’s a testament to how DoorDash offers opportunities for people to take on new responsibilities.
I started out as a General Manager, helping to launch two of our early markets. But I gravitated towards problem-solving, asking questions like “What’s going wrong and how do we fix it?” “What’s going well and how do we double down?” We didn’t have a BizOps team or an analytics team at the time, and Tony, our CEO, suggested I move to California and be a full-time problem-solver. In order to solve problems you have to go to the data, because that’s where the answers are. I don’t have a degree in computer science; I’m self-taught in SQL and Python, with the generous support and tutoring from a few of our engineers.
Wow, that’s kind of crazy. Has your roundabout path influenced the way you’ve built your team?
Yes, absolutely. I am trying to build a team with people from different backgrounds and areas of expertise. That way we can teach each other and learn from one another. We are all intellectually curious, and that creates an environment that is collaborative and that fosters learning. Someone who comes in with financial modeling skills can learn SQL and Python. Someone who comes in with a CS degree can learn about the business side and how to build an operating model. Someone who comes in with a stats background can learn about machine learning models. I want to provide my team with a solid foundation. That’s what investment banking and business school gave me.
Interested in joining the team? Say hi: firstname.lastname@example.org
Hendra Tjahayadi, DevOps/Data Infrastructure Manager and Former Lyft Data Architect
“If we collect everything but people can’t search, it’s like finding a needle in a haystack. So my team builds models so people can query the data easily and understand exactly what it means.”
Hey Hendra. Can you start by describing your main responsibilities?
Hendra: Yes, I’m responsible for DevOps and the data platform. On the data side, it’s about providing data that can be consumed easily by whoever needs to make decisions. I work closely with BizOps, which is led by Jessica, to make sure our data is in good condition and they can trust the results of their analysis.
Our role is to build a view of the whole DoorDash world so people can be productive and make the best decisions without worrying about infrastructure.
On the DevOps side of things, this includes site reliability, developer productivity, and our production quality. We don’t want to slow people down, but we set the balance between moving fast and stability.
Do you have a specific example of a problem you solved?
Basically, when I first joined we didn’t really have a data infrastructure. Everything was queried off our production database. But that database is meant to keep track of customers’ orders. In order to ask questions and learn about customer behavior, we needed a totally different system.
So I’m leading data warehousing, where we collect data from all over the place. We’re kind of the hub. We’re gathering data from our own database, our support infrastructure, and our apps. We have the complete picture, but once we have that data, the job is only half done. If we collect everything but people can’t search, it’s like finding a needle in a haystack. So my team builds models so people can query the data easily and understand exactly what it means.
If you don’t have this process, my definition of something might be different than yours, so we’re talking a different language. Whereas if we allocate everything correctly, there’s only one definition. I never want anyone to say, “At first I thought it meant this, but my query was actually doing this.” This enables people to share and collaborate.
Do you have an example of a particular impact on the business of what you’ve done?
It’s not really particular. It’s more like all the time. There are people who use the infrastructure every day as their full-time jobs. They’re using the data warehouse to do analysis on everything. BizOps is querying the warehouse daily to find insights and make decisions.
From the engineering side, people are using it to find the results of experiments. People are constantly trying new things against our control groups, to see what customers like and don’t like.
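As a rough illustration of what comparing a new idea against a control group can look like, here is a standard two-proportion z-test written with only the Python standard library. The counts are invented, and the choice of this particular test is an assumption; the article does not say which statistics DoorDash's experimentation framework uses.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(control_successes, control_total,
                          treatment_successes, treatment_total):
    """Compare conversion rates of a treatment group and a control group.

    Returns (lift, z, p_value), where lift is the absolute difference in
    rates and p_value is two-sided under a pooled normal approximation.
    """
    p_control = control_successes / control_total
    p_treatment = treatment_successes / treatment_total
    pooled = (control_successes + treatment_successes) / (control_total + treatment_total)
    std_err = sqrt(pooled * (1 - pooled) * (1 / control_total + 1 / treatment_total))
    z = (p_treatment - p_control) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_treatment - p_control, z, p_value


# Hypothetical counts: customers who reordered within a week, per group.
lift, z, p_value = two_proportion_z_test(400, 1000, 450, 1000)
print(f"lift={lift:.3f} z={z:.2f} p={p_value:.3f}")
```

The normal approximation is only reasonable for large groups, so small experiments would call for a different test.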
Tell me a little bit about your background. What were you doing before DoorDash, and what roles have you had here since joining?
I’ve been all around. I started my career at Google. I began as a junior engineer and was there for six years. At first I didn’t know much about software development, but I learned about writing applications and I learned a little bit about everything, including working with big datasets. After that I moved to a small company called Dropcam, which was later purchased by Google, so I was back.
I started focusing on data architecture about three years ago. The way I look at data, it’s kind of like vision. If you don’t have it you can still walk, but you’re walking in the dark. In the on-demand economy this is much more important than with consumer products, like Dropcam.
Why were you interested in coming to DoorDash?
I’m interested in the space. I believe customer demand is only going to grow and the problem statement is very challenging. In some ways it’s because we have a very short customer experience, so to speak — delivery takes less than one hour. We also have multiple customer types. For instance, when I was at Lyft we were balancing a driver and passenger. Here we have three sides (customer, Dasher, merchant) so one change has so many effects. We really can’t see without data — the challenge is so huge.
Another thing I like about DoorDash is that the company has so much energy; the people here are super talented. I also have the opportunity to grow and I’m making a big impact in the company.
Why is it a good time to join the team? If someone is considering working here, what do you want them to know?
It’s a good time to join because we’re growing very quickly. You get to own an entire problem space, and you have the autonomy to drive projects to completion. It’s a lot of responsibility.
This is in the company culture. To me, culture is very important; it determines whether you’ll be happy. With infrastructure we have a culture where your performance is highly visible. When people do well or do poorly, everyone sees. It’s a two-sided coin, and people should be ready for both sides. For some people, it is too much pressure — I’m not scared to talk about that. We need people who don’t get overwhelmed very easily. That said, it’s also rewarding and exciting. I always look forward to going to work and solving problems.
Interested in joining the team? Say hi: email@example.com
Preston Parry, Data Scientist, M.L. Engineer, and Diversity Advocate
“You’re typically coming in and solving a problem from scratch, but you’re not burdened by legacy systems.”
Hi Preston. Let’s start broad. What are your responsibilities, and who are you working with?
Preston: I was the first data scientist hired here at DoorDash and it’s a ton of fun. I am officially on the BizOps team, but right now I’m partnering with engineering, so I spend most of my time with them. Currently I’m working on making sure deliveries arrive on time. There are lots of pieces that go into it and it’s a key optimization point because, if an order is late, it causes problems all the way down the line. The food’s cold, the merchant’s unhappy, and the customer is hangry. We love our customers, so that’s never acceptable.
Did you end up working a lot with the engineering team because that’s what you were interested in, or because that’s where you were needed most?
Both. One of the really cool parts about data science at DoorDash is that there is just so much opportunity and so much need for it. The areas of particular interest for me were also business priorities.
It’s nice when things fit together like that. Tell us a little bit about your background.
I started in analytics, then combined it with engineering to end up in data science. A while back I was a consultant with Nielsen in their joint venture with McKinsey. After that, I led the analytics team at an analytics startup, and then I ran product and engineering at a software engineering boot camp for people of color and women.
That sounds really cool. Which boot camp is that?
Telegraph Academy. It’s awesome. They’re doing great things. Obviously, diversity is really important to me. That’s the first thing I probed when I got the offer here and I was really, really impressed. Diversity provides us with different perspectives and new ways of looking at things — we’re definitely looking for diverse candidates.
Generally, what kind of people are you looking to hire?
We recognize that all data scientists are unique — there’s not yet a standard skill set. It’s still such a poorly defined field that there’s no “ideal data scientist.” When new people come in, they bite off their own small chunk. Don’t be intimidated if you feel like, “Oh man, I don’t understand this giant chunk of the field.” It’s totally fine. We’re looking for people with discrete skill sets within the larger field. That said, the one thing that’s essential is the ability to find meaning in complex datasets. Maybe that’s a given, but I don’t want to assume.
You’ve definitely touched on this some, but why did you say yes to working here in the first place?
For me, it’s ideal. I can’t get over how perfect the setup is. If you join a data science or machine learning team at an organization that is much younger than DoorDash, you’re likely either not going to have enough data to play with or the data is just not going to be clean enough, and you’ll spend your first year building out the data engineering pipeline. That’s really cool, but you’re typically not doing much machine learning at that stage. Here at DoorDash, we are growing like mad. We have a huge amount of data, and we already have the data engineering pipeline all built out by the awesome team that Hendra leads. We have this really fast analytics database where the data is, of course, still messy because real life data is always messy, but you can run machine learning on it right out of the box.
You don’t need to do any of the ETL stuff, but at the same time we’ve only scratched the surface on what machine learning can do, so there are opportunities everywhere. You’re typically coming in and solving a problem from scratch, but you’re not burdened by legacy systems. You can just come in and solve the problem in the best possible way using the most advanced techniques. There’s so much pent-up desire from the different teams, and you get to be that expert. You’re not hampered by anything.
Do you think that would be overwhelming for certain kinds of people? Or are there things that make it not overwhelming?
It definitely can be overwhelming for people who just want to come in and dive really deeply into only one corner of an enormous model. Here, we’re definitely looking for more of a generalist, which is why we are more open to people with very different backgrounds.
What will people learn if they join the team? What are they setting themselves up for?
What I love about analytics is it can set you up to do anything. If you love machine learning, you are going to get amazing, pure machine learning experience and that’s super valuable in and of itself. You’re also going to get experience being more of a consultant and a strategic advisor to people internally.
And then, if you’re interested in another field, you could do this for sales, or marketing, or anything. You can do whatever you want because there is such a need for everything.
When we talked to Jessica, she mentioned that anyone at DoorDash can pull any data. Why is that allowed, do you think?
Yeah, it’s all available. We want people with different backgrounds and different perspectives to have access to these tools. I want machine learning to be available to everybody, so I built out a library that automates the whole machine learning process. Now anyone with a tiny bit of Python experience can run machine learning. That’s also available as an open source library called auto_ml, and we are using it internally.
It’s really cool because now anyone can make sense of their really complex data sets and, selfishly, I think it’s awesome because now we can iterate on a ton of different projects very rapidly and get even more machine learning code into production.
Finally, what are some hard problems you see on the horizon?
Automating solutions for some of our support requests is one. People get happy when they get answers right away. We’ve got some interesting ideas about how we can get them the right answer as quickly as possible.
There also continues to be important work to be done with our core dispatch algorithm: which Dasher gets offered which order and how we can make the whole process more efficient and cost effective. This requires massive amounts of research and problem solving with real-world data. That’s an awesome problem to be able to work on. Any incremental progress we make translates directly into dollars — for us, for Dashers, and for merchants.
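A toy version of the "which Dasher gets offered which order" question can be written as an assignment problem. The sketch below is purely illustrative: the cost matrix is invented, the objective (total estimated delivery minutes) is an assumption, and brute force over every permutation only works for a handful of Dashers. It is not DoorDash's dispatch algorithm.

```python
from itertools import permutations


def best_assignment(cost):
    """Find the one-to-one Dasher-to-order assignment with the lowest total cost.

    cost[d][o] is the estimated minutes for Dasher d to deliver order o.
    Tries every permutation, so it is only practical for very small inputs.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[dasher][perm[dasher]] for dasher in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_perm, best_total


# Hypothetical estimates: rows are Dashers, columns are orders.
COST = [
    [10, 12, 30],
    [11, 25, 28],
    [20, 14, 15],
]

assignment, total_minutes = best_assignment(COST)
for dasher, order in enumerate(assignment):
    print(f"Dasher {dasher} -> order {order}")
print("total minutes:", total_minutes)
```

Giving each Dasher their individually closest remaining order, in row order, would cost 50 minutes here, while the best joint assignment costs 38. That gap is one reason dispatch is treated as an optimization problem rather than a series of local choices.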
Interested in joining the team? Say hi: firstname.lastname@example.org