How data scientists lead and drive impact at Meta
By Crystal Distin and Jason Wei
The most common question I get at the end of a data science interview is “What do you love about being a data scientist at Meta?”
I always enjoy answering this question because it reminds me why I continue to love being here after almost 10 years at Meta. The answer is pretty simple: I get to LEAD changes that improve our products for billions of people around the world. Data scientists at Meta are great at analytics and coding, but first and foremost, we push the frontiers on the products we build by challenging assumptions and leading across different disciplines when necessary to make a product successful. This product leadership, regardless of where you are in your career, is what differentiates data science at Meta from data science at most other companies.
Outside of product leadership, data scientists surface the hard facts and truths that might not be readily apparent. To find these truths, it’s very important to be highly critical and highly skeptical of the data that we do see. This mindset helps us go deep enough to surface the true insights rather than the surface-level insights that may lead us in the wrong direction.
How is data science at Meta different from other companies?
One way that data science at Meta might differ from other companies is that we are fully embedded into the product teams in contrast to being a centralized team.
Centralized data science teams, where data scientists sit with other data scientists and take requests from the business for analysis work, allow data scientists to more easily standardize methodology and flex across different product areas. However, this model also decreases accountability for business outcomes as you jump from project to project.
An embedded model means that for every product team (made up of a product manager, designer and engineers), there will be a data scientist who works closely with that team. We do this because we hold data scientists accountable for the success and outcomes of that product team. This is true at every layer in the organization, from ICs up through Directors and VPs.
Data scientists at Meta also stand out for the degree of influence we carry in decision making at the company. While there are certainly decisions at the company where data takes a back seat (e.g. theoretical ventures for which we may not have any information), for the most part, leaders across the company look to data (and our voices) to understand the nuances of our product and then make better decisions. This is just a consequence of the scale at which we operate. When you have products that 3B+ users use across the globe, and those products are highly networked (where one post or interaction can have trickle-down effects throughout the whole network), it’s impossible to operate with just our intuitions.
How we advance measurement to drive decisions
As mentioned above, Meta and the products we own are all highly networked. In fact, within each individual app, it’s actually a mix of often competing networked ecosystems. Consider the Facebook App:
- You have the original friends and family ecosystem. You friend someone, which forms a connection. You and your friends each see updates from each other. Your friends can interact with your posts, and when they do, you’re more likely to come back to the app to share more and to interact with them back.
- You have the groups ecosystem. You join a group, which then allows you to see content from that group and interact with its members! Each group will usually have a set of admins and moderators, who invest time into taking care of that group (moderating its content, welcoming new members, hosting real world events, etc).
- You have our discovery ecosystem. As you interact with the app, Facebook learns about your interests and will recommend you content. This could be content from pages/publishers, creators, even public groups!
- You also have a set of utility products, each with their own mini-ecosystems. There’s Marketplace, which is a 2-sided ecosystem of buyers and sellers. There’s Facebook Dating. There’s Facebook Gaming, which has its own ecosystem of 3rd party game developers and users.
- Last but not least, there’s our ads ecosystem, where advertisers use our products to efficiently reach their target audience.
As you can see, there’s a lot going on within just one app! And oftentimes, changes we make to one ecosystem have a large impact on the rest. Consider our People-You-May-Know (PYMK) product, which competes with similar products like Groups-You-Should-Join (GYSJ). As we grow the prevalence of one, it trades off against the prevalence of the other. If a user starts to see less PYMK, they’ll have fewer friends, which means every time they post they’ll receive fewer interactions, which may mean they become less likely to post in the future. But if a user sees less GYSJ, groups may see lower membership, which might discourage admins from continuing to invest in those groups!
So how do we make decisions in light of all this complexity?
This is where data scientists, through years of running experiments and trying to understand these trade-offs, have developed methods and formulas to model out these secondary and downstream effects. Within the Facebook App, we call this our “Facebook Ecosystem Score”, and there are similar concepts on Instagram, etc. What follows is a high-level description of how this score gets constructed.
First, there are multiple types of effects we are trying to measure and model. Two examples are what we call feedback effects (the downstream impact to you and your network of additional comments, replies, messages, etc) and inventory effects (the downstream impact to your network of posts you make, reshares, etc). Depending on the effect we’re trying to measure, we might even utilize different types of experiments. For example, peer encouragement experiments can tell us the effect of receiving additional marginal peer feedback on posts for our users (link). And we also have specialized methodologies, such as cluster experimentation (link), that help us measure highly networked effects.
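To make the idea of cluster experimentation a little more concrete, here is a minimal sketch in Python. It is not Meta's implementation: the function names, the hash-based assignment, and the `treatment_share` parameter are all assumptions made for illustration. The point it shows is that whole clusters of connected users, rather than individual users, are assigned to an arm, and that outcomes are averaged per cluster before the arms are compared.

```python
import hashlib
from collections import defaultdict


def assign_cluster_arm(cluster_id, experiment, treatment_share=0.5):
    """Deterministically map a whole cluster to 'treatment' or 'control'.

    Hypothetical helper: hashing the experiment name together with the
    cluster id keeps every user in a cluster in the same arm.
    """
    digest = hashlib.sha256(f"{experiment}:{cluster_id}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex characters into a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return "treatment" if bucket < treatment_share else "control"


def cluster_level_means(rows, experiment):
    """Average the outcome per cluster, then average the clusters in each arm.

    rows: iterable of (user_id, cluster_id, outcome) tuples.
    Returns a dict mapping each arm that received clusters to its mean.
    """
    per_cluster = defaultdict(list)
    for _user_id, cluster_id, outcome in rows:
        per_cluster[cluster_id].append(outcome)

    per_arm = defaultdict(list)
    for cluster_id, outcomes in per_cluster.items():
        arm = assign_cluster_arm(cluster_id, experiment)
        per_arm[arm].append(sum(outcomes) / len(outcomes))

    return {arm: sum(means) / len(means) for arm, means in per_arm.items()}
```

Treating the cluster as the unit of analysis is one common design choice here: users in the same cluster influence each other, so their outcomes are not independent observations.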
With such experiments, we can then estimate the downstream effects of the factors mentioned above, such as:
- Having more friends / connections
- Seeing / producing more posts
- Becoming a member of groups
- Interacting with other users
- And so on
Now consider the results of a typical A/B test we run to understand the impact of a potential product change: we can directly observe how often users come back to use the App, build new connections, make posts, give feedback, etc. But these changes don’t happen in isolation. In reality, having more friends means having more friend inventory, having more inventory leads to more interactions, more interactions lead to more friend inventory, and so on. By combining these direct observations with a model of the infinite multi-hop effects into a single score, we can estimate the impact on the entire network if the product change is implemented. This approach allows us to make product decisions that optimize for the overall long-term experience of all users, even when the different sub-ecosystems are competing and trading off against each other.
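One way to picture how direct observations and multi-hop effects could be folded into a single number is the sketch below. It is only an illustration under stated assumptions, not the actual Facebook Ecosystem Score: the metric names, the propagation coefficients, and the weights are all hypothetical. Direct lifts from an A/B test are pushed through a table of assumed downstream effects hop by hop, the hops are summed until they die out, and the totals are combined with weights.

```python
def total_effects(direct, propagation, tol=1e-12, max_hops=1000):
    """Sum the direct effects plus every further hop of downstream effects.

    direct: dict metric -> lift observed directly in the A/B test.
    propagation: dict (source, target) -> assumed lift in `target` per unit
        lift in `source`. The sum only converges if feedback loops are damped.
    """
    total = dict(direct)
    current = dict(direct)
    for _ in range(max_hops):
        # Effects one hop further out than `current`.
        nxt = {metric: 0.0 for metric in direct}
        for (source, target), weight in propagation.items():
            nxt[target] += weight * current[source]
        if max((abs(v) for v in nxt.values()), default=0.0) < tol:
            return total
        for metric in direct:
            total[metric] += nxt[metric]
        current = nxt
    raise ValueError("multi-hop effects did not converge; loops may not be damped")


def ecosystem_score(direct, propagation, weights):
    """Collapse the total per-metric effects into one weighted score."""
    total = total_effects(direct, propagation)
    return sum(weights[metric] * total[metric] for metric in total)


# Hypothetical numbers: more friends lead to more posts, posts lead to
# interactions, and interactions feed back into posts.
direct = {"friends": 1.0, "posts": 0.0, "interactions": 0.0}
propagation = {
    ("friends", "posts"): 0.5,
    ("posts", "interactions"): 0.5,
    ("interactions", "posts"): 0.2,
}
weights = {"friends": 1.0, "posts": 1.0, "interactions": 1.0}
score = ecosystem_score(direct, propagation, weights)
```

In this toy setup the feedback loop between posts and interactions is damped (0.5 × 0.2 per round trip), which is what lets an infinite chain of hops add up to a finite total.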
And while the above might sound complex enough, consider that in every country, the ways people use Facebook are different! And so certain effects and trade-offs will vary from country to country, from demographic to demographic.
How we surface deep product insights to change and shape product strategy
To inform strategic shifts that take place at the company, we often use a combination of long term trends, descriptive analysis and experimentation. This may not be just one analysis but a series of analyses over the course of many months to more deeply understand a product space and surface robust insights.
One such question is the value of sending a message. Messenger has been a separate app for 10+ years now and has largely been considered separate from the Facebook ecosystem. The two apps had separate goals that often conflicted with one another, and changes made in one app would impact the other in both positive and negative ways. The data science teams across both Facebook and Messenger worked together for almost a year to run experiments that proved the value of messaging to the Facebook ecosystem. This has resulted in joint goals that align both apps on common outcomes. We always understood that content sharing was important to the Facebook ecosystem, but subsequent analysis has made it clear that day-to-day messaging that is not related to Facebook content is just as important to the Facebook ecosystem.
In addition to the importance of this alignment, we also surfaced several large and concrete opportunities for improvement in this space. As a result, both teams are actively working on improving messaging across both apps. This change impacts the work of thousands of engineers across 2 large organizations at Meta!
Hopefully through these two examples, it’s a little clearer how data scientists at Meta help make both short term and long term strategic decisions.
How data science at Meta has evolved over time
As someone who has been at Meta for nearly a decade, I have seen the role evolve a great deal. When I first started, there were only 40 data scientists supporting some 200 product managers and 4,000 engineers across the company. While we were still technically “embedded” into teams, by nature of having to support multiple product areas, we were limited in both our depth and understanding of each of those areas. Ten years ago, it was also common for engineering teams to have never worked with a data scientist. And so a lot of the job was working with engineering on basic things like logging and experimentation (usually, debugging imbalanced and poorly set up ones!).
Today, data scientists are 1-to-1 with product managers, which again allows us to go much deeper and ensures a greater degree of accountability. And as mentioned above, over the years, with each experiment, we’ve built up our understanding of how our ecosystem works, and have been able to increasingly understand and model out these intricacies.
Conclusion
In conclusion, data scientists at Meta are expected to be product, analytics and people leaders. We drive product strategy through robust analytics across measurement, goaling and product insights, and we shape the work of many thousands of engineers across Meta. We hope the examples above provide a little inspiration into the possibilities for data science to inform and shape product strategy in your field as well!