Academia to Data Science

By Avneesh Saluja, Alok Gupta, Cuky Perez

Introduction

When you’re in graduate school, it seems like the only career option available is to remain in the ivory tower. It’s easy to see why: your advisor and peers are very likely to encourage you to follow their chosen career path, and the selection bias is strong among those who surround you. And when you are a professor, you may believe that the only job where you can expand the knowledge base, teach, and mentor others is within the academic setting.

However, a quick glance at the Ph.D. labor market shows that the number of doctorate-holders we produce annually exceeds the number of academic positions available. Silicon Valley (and the tech industry in general) has been an appealing destination for many former academics and those with a research bent, but the leap from academia to industry is not an easy one. Michael Li documents the mindset shift required in one of his recent blog posts, and frames this shift within the context of delivering business-impactful results quickly (in industry) compared to delivering perfect results (in academia).

While we agree with this basic trade-off, at Airbnb we feel that the mindset shift is slightly more nuanced. In this post, we first discuss the skills (both hard and soft) we look for in candidates hoping to transition from academia to Airbnb, followed by specific pieces of advice for those looking to move into the fast-paced startup world of Silicon Valley.

4 things we look for in Data Scientists from Academia

Data Science is very much an overloaded term these days for all things data-related at technology and startup companies. It sits at the intersection of Mathematics/Statistics, business domain knowledge, and ‘hacking’. Data Scientists are asked to extract insights from data to drive a company’s metrics. At Airbnb this can mean munging data to inform which experiment to launch next or building a machine learning model to optimize our user experience. When considering candidates coming out of advanced graduate programs, we look beyond technical skills and alignment with our core values and think about 4 attributes:

1. Beginner’s Mindset

With your advanced level of education, we expect that you are likely at the top of your field and have been very successful at everything you have touched so far academically. However, academic success and experience do not necessarily translate to industry success. We hope candidates are level-headed and mindful of what they do not yet know about the business. Airbnb has a strong culture of mentorship and personal growth: getting here is not the end of the journey, and there is still a lot to learn. We look for people who are eager to learn more and who are open to expanding their skill set outside of their area of expertise.

2. Self-Starter

We expect that a PhD candidate has learnt the art of self-management. More than anything, graduate school should teach a student how to direct and prioritize their own learning. A senior researcher typically learns to see a dead end approaching early and to quickly pivot their energy in a more promising direction. Research in a competitive field also provides opportunities to challenge peers and push back on assumptions. In Data Science at Airbnb we expect this to translate into not accepting the status quo but pushing the boundaries of our assumptions.

3. Talking the Walk

Sometimes we see senior academics underperform in their communication. Airbnb is a very collaborative environment: a Data Scientist typically works not only with other Data Scientists but also with Engineers, Designers, Product Managers, and non-technical people. Being good with data is important, but at Airbnb we need the insights to be translated well for all audiences, from a Data Scientist on your team to the CEO. Otherwise the recommendations might not have the impact they merit. In both written and verbal communication, articulation of insights, methods, and assumptions must be crisp, convincing, and sympathetic to the audience.

4. It’s a Sprint, not a Marathon

It can take years to publish an article in academia, but in industry the turnaround is much faster. That’s not to say the quality is poorer; it’s just that at Airbnb we expect to get a first version of a data product out as soon as possible and then continue to iterate where improvement is likely. Throughout our Data Science interview process at Airbnb we look for entrepreneurial spirit and for candidates who can get past needing a perfect solution before shipping a data product or sharing an insight.

4 tips for Academics interested in Data Science

An academic’s personal experience when transitioning away from the ivory tower will vary significantly based on the field they are coming from. Given the tighter coupling in subject matter between industry and academia these days (especially within fields like computer science and applied economics), the lag between a research idea and integration into an end product is often on the order of months now as opposed to years. Thus, those who make the leap often possess the “hard skills” required for the job, such as strong programming and scientific computing knowledge. However, it is often harder to learn the softer skills and adapt one’s mindset towards industry. We break down the mindset shift into 4 broad areas:

1. Sweat the Small Stuff

Academic research does an excellent job of abstracting the core problem from the messy details that often surround that problem in the real world. Sometimes we are provided sets of training data that are nicely curated and cleaned, and evaluation is performed on a well-documented and benchmarked test set that many others have evaluated. Some thought may go into additional data cleaning, but much of the information contained in the data, especially the training labels, is a given. Other times we may collect our own data, but from a tightly controlled field or laboratory experiment where we can minimize data contamination.

Unfortunately, this is not the case in industry. A lot more thought and creativity have to go into how to set up the problem in the first place. Are the labels that we extract or derive from logged data the signals that we need to solve our problem? Are there any bugs in the instrumentation? Are we even logging the information we need? Data Scientists need to understand the problem domain deeply and use that information to transform both the problem and the data to be able to produce something meaningful. Getting things to work in these fast-moving environments often requires as much creativity as an abstract research problem, if not more.

2. The 80% Rule

Following up on that note, in academia we are often focused on a fairly different regime of the problem. Given quality training and test data a priori, we are often tasked with investigating novel, state-of-the-art solutions to improve performance from an already strong level to something even stronger. On the flip side, the task in industry is to deploy a model where none has ever existed before. There is nothing to compare against, and translating intrinsic evaluation metrics (e.g., AUC) to business impact is challenging at best and perilous at worst.

Given these realities, it’s advisable in these situations to deploy a model that isn’t perfect, but is “80% of the way there”. Optimizing for the remaining 20% of performance often reflects a lack of prioritization (“premature optimization”), and it may simply not be possible given the disconnect between intrinsic and extrinsic evaluation metrics.
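To make the disconnect between intrinsic and extrinsic metrics concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the labels and scores are made up, and bookings_in_top_k is a hypothetical stand-in for a business metric, not something Airbnb uses. It shows two models with identical AUC that surface different numbers of positive items in the top 2 results.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bookings_in_top_k(labels, scores, k):
    """Hypothetical business metric: positives among the k highest-scored items."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(y for _, y in ranked[:k])


# Made-up data: three positive and three negative examples.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.7, 0.5, 0.8, 0.6, 0.4]  # model A interleaves positives and negatives
scores_b = [0.9, 0.8, 0.1, 0.7, 0.6, 0.5]  # model B ranks two positives first, one last

for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, round(auc(labels, scores), 3), bookings_in_top_k(labels, scores, k=2))
```

Both models reach the same AUC here, yet model B places two positives in the top 2 while model A places only one. If only the top of the ranking matters to the business, squeezing out more AUC may not move the number that counts.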

3. The Knowledge-Impact Sweet Spot

While Michael mentioned in his post that it’s more important to deliver bottom-line impact than to disseminate knowledge, at Airbnb we feel that the ultimate goal is to achieve a balance between these two extremes. Many of us were motivated to enter academia in the first place because we genuinely enjoy producing and disseminating knowledge, and within an industry setting there is significant value in spreading these nuggets of knowledge effectively so that people do not reinvent the wheel. To this end, Airbnb has built the knowledge repository, which we recently open-sourced. The repository is our in-house peer-reviewed publications forum, and Data Scientists are encouraged to participate as actively as they can. We also have weekly seminars where Data Scientists or other leaders in the field can present their work, and mentorship is encouraged throughout the company.

4. Being Proactive

A successful researcher is one who knows not only the solutions to difficult problems, but also the right questions to ask. Asking the right questions requires a proactive attitude: instead of expecting to be handed a problem to work on, you need to identify the opportunities and craft a direction of inquiry accordingly. Proactive inquiry is a skill that is well developed in academia, and it is one of the reasons Airbnb encourages top academics to apply.

To paraphrase a well-known quote: “It doesn’t make sense to hire smart people and tell them what to do; we hire smart people so they can tell us what to do.” We hire people from all kinds of academic backgrounds and qualifications precisely because these different vantage points provide a unique array of tools to tackle the kinds of interesting and challenging problems we face at Airbnb.

If you are interested in learning more about Data Science opportunities at Airbnb, you can visit our careers page.


Check out all of our open source projects over at airbnb.io and follow us on Twitter: @AirbnbEng + @AirbnbData