What’s in a name?
The semantics of Science at Lyft
At Lyft, we’re rebranding our Data Analyst function as Data Scientist, and our Data Scientist function as Research Scientist. In this short post, we describe the reasoning behind the change, which we believe will set Lyft up to make better decisions and build better products as we scale.
The data science explosion
It’s no secret that there’s been considerable hype building around the field of Data Science ever since the term entered the popular vernacular circa 2012. (According to Wikipedia, it was first used as a synonym for computer science all the way back in 1960, and in a context more closely resembling its current meaning in 1997.) And with the hype have sprung up many different points of view. You’ve no doubt already seen the hyperbolic articles, the high-profile jabs, the clever tweets. Some have mused about the lexical validity of the term, suggesting that calling oneself a data scientist is akin to identifying as a “hammer carpenter”.
But Data Science is, of course, more than just hype. Companies of all stripes are generating massive amounts of data and leaning on experts to harness it for making better decisions and better products. Over the past half-dozen years, the practical result has been twofold: the number of jobs in Data Science has risen by an order of magnitude, and students and professionals from all walks of life are jockeying to get into the field. Along with this rising wave of supply and demand in the data scientist job market, one might have hoped for a gradual crystallization of the semantics of the role itself: convergence to a widely accepted definition of what it means to “do data science” in a technology company.
Instead, just the opposite has happened.
Title ambiguity and proliferation
As it stands, a data scientist may have vastly different responsibilities and qualifications across different companies (and even within the same company). Broadly, such a role may entail one or more of the following:
- data engineering
- data modeling
- metric definition
- building dashboards
- descriptive statistics
- experiment design and analysis
- statistical modeling
- inference
- economics
- optimization
- machine learning
In many cases, these tasks are done at scale with massive, distributed data. Some demand proficiency writing “production level” code.
Clearly, these are far too many distinct technical competencies to group under a single title. While every Data Scientist job posting promises some subset of the above responsibilities, it’s often difficult to know precisely what the role will involve on a day-to-day basis until one is actually working in it. At Indeed, the leading job search site, this ambiguity recently prompted scientists to throw their hands up and declare that there really is no such thing as a data scientist.
Several alternative titles also exist in the industry—either because they pre-date the Data Science Revolution, or as an attempt at differentiation. We surveyed more than a dozen leading tech companies, and here is a partial list:
- data analyst
- business analyst
- business intelligence analyst
- product analyst
- quantitative analyst
- statistician
- economist
- applied scientist
- operations research scientist
- research scientist
- research engineer
- machine learning engineer
- machine learning scientist
- product scientist
Unfortunately, even these alternative titles are inconsistently defined across the different tech companies that advertise them.
From the employer’s point of view, title ambiguity and proliferation make hiring much more challenging. There is frequently a mismatch between the employer’s and the candidate’s expectations about the requirements of a role. Filtering candidate resumes and sourcing new candidates is tricky because past or current titles do not necessarily translate to the role in question. In essence, there is a low signal-to-noise ratio. The result is a paltry conversion rate from candidate application to successful hire. Worse still, offering (what’s perceived to be) an inferior title for the identical job can mean losing a candidate to a competing offer at the last moment.
Rebranding Science at Lyft
Lyft’s Data Science and Data Analytics groups, founded right around the Data Science inflection point, have increasingly felt this pain over the past several years.
Internally, we have maintained a fairly strong semantic distinction between the two roles: analysts extract insights from data, track the health of our business and drive better decision-making; scientists build the mathematical models and algorithms that power the core components of our product. To be sure, the reality on the ground is much fuzzier: in order to build models, one must analyze data; offline analyses sometimes gradually morph into production systems. Both groups spend considerable time thinking about how to design and rigorously analyze live experiments. But overall, the division of labor between Data Science and Data Analytics at Lyft has been relatively clear and easy to navigate.
From a hiring perspective, things have been a little less rosy for all the reasons outlined above. The applicant funnel has low precision for both roles, and the process is rife with inefficiencies. A candidate may be passed between recruiters, chat with multiple hiring managers, or occasionally return for a second onsite loop. More than once, we’ve lost a Data Analytics candidate to a competitor merely because that company offered the Data Scientist title. Even if only a minority of the tech industry has shifted its definition of Data Science towards the business analytics end of the spectrum, the presence of even one or two major players in that group (and there are several) immediately puts us on the back foot.
For this reason, we’re shifting our data analysts over to the Data Scientist title. To maintain a functional distinction between the roles, we’re also rebranding our data scientist job title to Research Scientist. This is merely a label switch, in that the day-to-day responsibilities of both types of scientists at Lyft will not change in any meaningful way. We fully recognize that it will cause some confusion initially — in part because we’ve been eager to talk about the composition of the Data Science team and the exciting work they’re doing. Besides, ours is not really “Research” with a capital “R”. Rather, it can be described as applied research with the goal of product impact, with papers or conference talks being a happy side effect. This post is an attempt to short-circuit that confusion and be transparent about what we hope to accomplish.
We expect this change to result in higher-precision (thus more efficient) hiring funnels for both groups. As we continue to scale the Science organizations at Lyft with world-class talent, we’re excited for the opportunity to make our decisions even more scientific, and our algorithms even more innovative. We’re determined to keep science at the forefront in pursuit of our mission: improving people’s lives with the world’s best transportation.
Are you a budding ridesharing scientist? Interested in geospatial problems, marketplace optimization, machine learning, causal inference, or business analytics? Lyft’s Data Science and Research Science groups are hiring! Drop me an email at chamandy@lyft.com.