Data Isn’t Numbers, It’s People

Jason Canney
Published in Xandr-Tech
7 min read, Sep 28, 2020

A colleague forwarded me an article written a few years back about how Airbnb’s then Head of Data Science approached leveraging Data Science and Analytics more broadly throughout the organization while creating a dynamic team, culture, and data strategy. As the current Head of Data Science, Analytics, and Data Strategy at Xandr, I was intrigued. What did Airbnb do to be successful, and how could we learn from that success?

When I joined Xandr almost two years ago, the company was just beginning to bring multiple diverse companies together to create a single integrated culture and team. Conversely, Airbnb had started from a handful of employees and grew exponentially. The two companies were very different in their inceptions. Still, the challenge of optimizing Data Science and Analytics across the entire organization and creating a measurable data strategy was the same.

Which Model to Use?

As described in the article shared with me, most companies will choose one of three models to organize their Data Science, Analytics, and Data Strategy teams to optimize business performance. The first is an embedded model where scientists and analysts are part of a team of engineers focused on solving a specific business problem. The second model creates a centralized team of scientists and analysts who are then assigned to particular teams on demand to solve a specific problem. As discussed below, both of these first two models have issues and ultimately limit business scale.

In the embedded model, the scientists, analysts, and strategists have no visibility into what other scientists, analysts, and strategists are working on. There cannot be a comprehensive view of the business data nor a data strategy. Because each team is compartmentalized and focused on a specific business outcome, there is no natural sharing of technology and data. These teams are also limited in their ability to share knowledge, constraining their ability to create the best business solution. Conversely, with a centralized model, the scientists and analysts have a community in which to share information but are limited in their knowledge of the specific business problems. Scientists, analysts, and strategists are never close to a problem long enough to become experts on it. This lack of proximity to and context for the business problems to be solved is the main shortcoming of the centralized model and ultimately undermines its efficacy. Both models are flawed, which leaves the third option: the hybrid model.

The Hybrid Model

The hybrid model is what we have implemented at Xandr. In this model, the team leadership is centrally organized, but individual teams work in an embedded structure partnered with engineering and product. The hybrid model enables the already strong centralized team to more effectively share knowledge and solve those problems with the highest benefit to the business. It also provides better visibility across the organization to more precisely define data strategy. The model also naturally creates space for an innovative culture to develop. Specifically, within Xandr, this structure has helped spawn new research projects and partnerships across the organization. The ability to share resources and knowledge across science and analytics teams increases our ability to support more projects with less effort.

How did Xandr leverage the Hybrid Model with embedded teams?

To better understand the benefits of the hybrid model, it helps to move from the abstract to the concrete. What are the key pieces of work that are now centralized, and how do the embedded teams operate?

Centralized Data Science Organization Processes

1. Hiring

Data Science is about people, so there is nothing more important than who is on the team. At Xandr, the Data Science and Analytics function owns the hiring and on-boarding process of data scientists and analysts. As data science has been a rapidly evolving discipline, experienced scientists and analysts must be the ones to hire scientists and analysts. What is the right balance of statistical knowledge, engineering skills, and personal attributes? We are still figuring that out. Though engineers and product managers are part of our hiring loops, a centralized hiring process has enabled us to learn what works and what doesn’t when recruiting, interviewing, and making offers to prospective team members.

2. A Data Science Platform

If you have the right people, your data science and analytics team has a chance to be great, but it still cannot succeed without access to data and tools to process that data. At Xandr, we have worked diligently to ensure all data scientists and analysts have the same advanced tools and access to raw and aggregate data. Key to this effort is a centralized data science platform engineering team responsible for coordinating with data and system ops to streamline software package management, data access, and model deployment. Before this team’s existence, each data science project had to negotiate with data engineering and other groups separately to get the data it needed for fundamental analysis and to build new algorithmic solutions. This was a frequent point of friction that limited the team’s impact. Now, not only do our data scientists have access to petabyte-scale data warehouses, but we have also built a standards-based platform, tools, and a deployment pipeline to empower the team to scale solutions and deploy to production.

3. Team, Culture, Community, and Unity

The cornerstone of a great team is its culture, community, and unity. Within Xandr, we spend a lot of time working on employee technical skills and career development. As part of the employee development program (nicknamed SHIELD), there are various Tech-n-Tell and Masterclass sessions, as well as guest speakers. Creating collaboration opportunities during COVID was easy since we already had a robust team communication model that enables the team to be spread across various states and countries but still feel like a small, close-knit group. We also use meetings judiciously at an organizational level. These gatherings help create a common identity and culture and provide critical opportunities for knowledge sharing. The meeting formats, including a weekly leadership meeting and a monthly all-hands for the roughly 60 team members, have stood the test of time for us. While this seems very tactical, these sessions are essential components of the team culture and support unity even during COVID, with scientists and analysts spread across New York City, Portland (OR), Richardson (TX), San Francisco, Quimper (France), London, and Colorado.

Embedding in the Hybrid Model

The other half of the hybrid model is for the data scientists and analysts to be embedded with product, engineering, and services at the project level. On a day-to-day basis, most team members are trying to serve one of the many projects that require data science expertise. Team leads tend to work with the same product and engineering leads to ensure success. At Xandr, product, engineering, and data science align with company capabilities such as identity, optimization for advertisers, and audiences. Data Science and Analytics leads are the organizational glue, understanding both the business problem and the data capabilities. Though data scientists and analysts do not actually report to engineering or product, they take end-to-end ownership of the projects they work on. Their managers evaluate them based on how well they serve these projects.

Essential Byproducts of the Hybrid Model

With this context on specifically how we implemented the hybrid model, its key benefits can be more easily understood.

While each project presents a unique problem for data scientists to solve, we have developed a playbook for algorithmic development. Specifically, senior/team leads share how they have tackled various steps in the data science life cycle, then all teams can borrow from that collective experience. What we do probably looks like what a lot of other groups do; most successful projects follow a path from fundamental analysis to modeling, simulation, solution proposal and architecture, live testing, algorithmic iteration, and monitoring. One aspect of our playbook that may be unique is that analysts build dashboards and other tools to monitor algorithmic performance that product, services, and engineers can use.

Data Strategy

The hybrid model also helps create a better data strategy. For a comprehensive data strategy, you need visibility into data across the entire organization. Analysts and scientists are embedded within business units, which creates the data visibility and knowledge sharing that play a crucial role in shaping the data strategy. Data Analytics partners with Product and Strategy teams and is responsible for pulling together the data and tracking the overall business key performance indicators (KPIs). These KPIs allow the business to measure the value of its data across the organization. The scientists and analysts then work together to develop marketplace models and dashboards to identify friction points and opportunities that create a measurable data strategy to drive the business.

The optimization of Data Science, Analytics, and Strategy teams to fully monetize company data and increase the collective team’s performance requires broad data visibility, community/knowledge sharing among scientists and analysts, and proximity to business problems. As shown in the image below, the hybrid model includes all of the Pros and none of the Cons.

“A company’s data, how it’s optimized and applied to business problems, is directly related to the company revenue and performance” (Carruthers, 6)

The organization of Data Science, Analytics, and Data Strategy in a hybrid model most effectively serves the business. Within Xandr, these teams are already organized and operating within a hybrid model. The Airbnb example shows that other successful companies have organized data science and analytics in a similar way to drive success as well as a KPI-driven data strategy. The hybrid organization model empowers scientists and analysts to find the people in the data, which improves the overall performance of the business.

Carruthers, C., & Jackson, P. (2018). The chief data officer’s playbook. London: Facet Publishing.

Airbnb Article Referenced

https://venturebeat.com/2015/06/30/how-we-scaled-data-science-to-all-sides-of-airbnb-over-5-years-of-hypergrowth/
