Meta Data Science, Analytics Screening: Interview Cheat Sheet

Priscilla Mannuel
8 min read · Nov 1, 2023


Photo by Julio Lopez on Unsplash

Are you preparing for the Meta Data Scientist interview?

Hi, I’m Priscilla, a data scientist, previously at Yelp, now at TikTok. I went through the Meta Data Science, Analytics interview process in the past recruiting cycle.

Once you pass the recruiter phone screening, the first technical screen is designed to test your technical aptitude in 45 minutes. This is the critical round that will determine whether or not you move forward to an onsite.

In this article, I will provide a handy cheat sheet with frameworks and examples for your upcoming Data Science, Analytics screening.

The structure of the interview

The technical screen at Meta is usually conducted on Coderpad or a similar virtual pad, where the interviewer will assess your coding and product-sense ability.

  • Format: Video conference call
  • Duration: 45 minutes
  • Interviewer: Senior/Staff DS
  • Questions: Programming, Research Design, Data Analysis, Determining Goals and Success Metrics

Part 1. Programming (SQL)

Your interviewer will assess your ability to develop solutions to complex data problems using either Python or SQL. I chose SQL.

Example Question: Given a user activity table, find the number of interactions each user has had since they first logged in to Facebook.

Table: Activity
| user_id | date | activity |
|---------|--------------|---------------|
| 1 | 2023-01-01 | 'Login' |
| 1 | 2023-01-02 | 'Comment' |
| 2 | 2023-01-01 | 'Login' |
| 2 | 2023-01-02 | 'Share' |
| 3 | 2023-01-03 | 'Login' |
| 3 | 2023-01-03 | 'Like' |
| 3 | 2023-01-03 | 'Comment' |
| 4 | 2023-01-04 | 'Login' |
| 4 | 2023-01-05 | 'Login' |

Framework for Answering

  1. Demonstrate business understanding: User engagement is crucial to measuring the health of the Meta ecosystem and its sense of community. Interaction data is a proxy measure for user engagement.
  2. State what information you are trying to draw: We want to find the number of interactions per user.
  3. Explore the dataset and ask clarifying questions: What are all the unique types of activity? If a user stays logged in, is there a log entry for each day the user is logged in?
  4. Code your solution: Make sure to talk through your solution and format it for readability. I find that creating CTEs allows me to explain my thought process clearly and naturally.
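-- Step 1: find each user's first login date
WITH first_login AS (
    SELECT user_id,
           MIN(date) AS first_login_date
    FROM Activity
    WHERE activity = 'Login'
    GROUP BY user_id
)
-- Step 2: count activity rows on or after the first login.
-- Assumption to confirm with the interviewer: a login is not an interaction.
-- LEFT JOIN keeps users who logged in but never interacted (count = 0).
SELECT f.user_id,
       f.first_login_date,
       COUNT(a.activity) AS num_interactions
FROM first_login f
LEFT JOIN Activity a
       ON f.user_id = a.user_id
      AND a.date >= f.first_login_date
      AND a.activity <> 'Login'
GROUP BY f.user_id, f.first_login_date
ORDER BY f.user_id;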
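Since the interviewer accepts either Python or SQL, here is a minimal Python sketch of the same idea, using only the standard library and the sample rows from the table above. It assumes that a login is not itself an interaction, which is exactly the kind of point worth confirming with the interviewer. The function name `interactions_since_first_login` is my own choice for illustration.

```python
from datetime import date

# Sample rows copied from the Activity table above: (user_id, date, activity)
ACTIVITY = [
    (1, date(2023, 1, 1), "Login"),
    (1, date(2023, 1, 2), "Comment"),
    (2, date(2023, 1, 1), "Login"),
    (2, date(2023, 1, 2), "Share"),
    (3, date(2023, 1, 3), "Login"),
    (3, date(2023, 1, 3), "Like"),
    (3, date(2023, 1, 3), "Comment"),
    (4, date(2023, 1, 4), "Login"),
    (4, date(2023, 1, 5), "Login"),
]


def interactions_since_first_login(rows):
    """Return {user_id: count of non-login rows on or after the first login}."""
    # Pass 1: earliest login date per user
    first_login = {}
    for user_id, day, activity in rows:
        if activity == "Login":
            if user_id not in first_login or day < first_login[user_id]:
                first_login[user_id] = day

    # Pass 2: count interactions; users with a login but no interactions stay at 0
    counts = {user_id: 0 for user_id in first_login}
    for user_id, day, activity in rows:
        # Assumption: a login is not itself an interaction
        if activity == "Login" or user_id not in first_login:
            continue
        if day >= first_login[user_id]:
            counts[user_id] += 1
    return counts


result = interactions_since_first_login(ACTIVITY)
print(result)
```

On the sample table, users 1 and 2 each have one interaction, user 3 has two, and user 4 has none because both of their rows are logins.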

🍄 Reference: Here’s a video of a SQL mock interview from StrataScratch

Parts 2 and 3 belong to the product-sense round, which is designed to screen your technical skills in AB testing, metric sense, and product analytics.

Part 2. Data Analysis, Goals and Success Metrics

This section evaluates whether you can structure an analysis plan to address a vague question. It also tests your ability to identify metrics that reflect operational success and inform business objectives.

Example Question: The Facebook team is creating a notification that tells a user when their Facebook Marketplace listing is about to expire. How would you measure the quality of the notification?

Framework for Answering

  1. Display product understanding: “Facebook Marketplace allows users to sell items on Facebook. Reminding a user of an expiring listing helps them manage their listings and keeps listings high-quality and relevant.”
  2. State the goal of the feature: “The objective of the feature is to get users to update their listings or close irrelevant ones.”
  3. Scope impact: “Quality is vaguely defined. A quality notification is relevant, timely, and does not spam the user. A quality notification will lead to feature success.”
  4. Select and define key metrics: Quantify quality into measurable metrics. Write out the definitions in Coderpad. (See code block below)
  5. Acknowledge the existence of segments: “The notification experience will vary by type of user. For example, a power user will receive far more notifications than the average user. The definition of notification success will also vary by function. For example, system notifications may have a lower open rate than friend request notifications.”
  6. Generate data for analysis: “We can collect response data on whether users update a listing after receiving the notification. We can also compare Facebook Marketplace activity and overall activity between users who received the notification and those who didn’t.”
  7. Translate statistics and ML concepts into practical applications: “We can generate insights from visualizations, or estimate the impact of the notification on user sessions (a proxy for churn) with a regression analysis.”
  8. Interpretation of results: “If the coefficient on the notification variable is negative and statistically significant, it suggests that receiving the notification has a negative impact on user engagement.”
  9. Touch on trade-offs: Facebook has a wide range of products. Consider cannibalization and how a user’s whole experience on Facebook is affected by the feature change.
## 3. Scope impact
A 'high quality' notification has the following features:

Relevant - personalized to the user's preferences, activity, and interests
Timely - arrives while the user can still act on it, since timing affects conversion
Not spam - maintains user trust and prevents churn from a bad experience

## 4. Select and define key metrics
We can group metrics by quality and performance measure.

Quality:
1. Click-Through Rate (CTR): Share of delivered notifications that users click
2. Conversion Rate: Beyond clicking, whether the user follows through
(edits, renews, or deletes the listing)
3. Churn Rate: How many users churn after receiving the notification
4. Sessions Per User: Evaluates whether an improvement in Facebook Marketplace
contributes to an overall improvement in Facebook engagement

Performance:
1. Notifications Per User: Set a threshold for what is considered spam
2. User Open Rate (%): Share of notified users who open the notification,
a measure of coverage

## 5. Success varies by user segment and notification type
User segment: power user, new user, demographics
Notification type: delivery channel, design

## 6. Generate data for analysis
Is_Notified: Binary variable indicating whether the user received
the notification
Response_Type: Categorical variable indicating whether a user edited, renewed, or
deleted a listing

## 7.1 Regression analysis
To investigate whether the notification causes a user to churn, we can use
the number of sessions as a proxy for churn: the fewer the sessions, the higher
the churn risk

Number_of_Sessions ~ Intercept + C1 * Is_Notified + C2 * Confounder

## 7.2 Visualize
Plot a week-over-week retention heatmap to visualize user retention.
This format helps account for variation in timing, since users
may receive the notification at different times.
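To make the regression in 7.1 concrete, here is a rough sketch that fits `Number_of_Sessions ~ Intercept + C1 * Is_Notified + C2 * Confounder` by ordinary least squares using only the Python standard library. The toy rows are made up for illustration, and I picked a hypothetical `is_power_user` flag as the confounder. The sketch only recovers coefficients; in a real analysis you would use a statistics library to get standard errors and p-values as well.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(features, target):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    x = [[1.0] + list(row) for row in features]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data, made up for illustration: (is_notified, is_power_user) -> sessions.
# Power users are both more active and more likely to be notified.
ROWS = [
    ((0, 0), 5), ((0, 0), 5), ((0, 0), 5), ((1, 0), 4),
    ((0, 1), 9), ((1, 1), 8), ((1, 1), 8), ((1, 1), 8),
]
features = [x for x, _ in ROWS]
sessions = [y for _, y in ROWS]

intercept, c1, c2 = ols(features, sessions)

# Naive comparison that ignores the confounder
notified = [y for (n, _), y in ROWS if n == 1]
not_notified = [y for (n, _), y in ROWS if n == 0]
naive_diff = sum(notified) / len(notified) - sum(not_notified) / len(not_notified)

print(f"C1 (adjusted effect of notification): {c1:.2f}")
print(f"Naive difference in mean sessions:   {naive_diff:.2f}")
```

In this toy data the naive comparison suggests notified users have one more session on average, while the coefficient on `Is_Notified` after adjusting for the confounder is negative. That sign flip is the reason the confounder term belongs in the model.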

Follow-up Questions: Be prepared for follow-up questions that build on the case study question.

  • How will you investigate a drop in the friend request metric due to the notification? “A drop in friend requests per user may be due to an uncontrollable factor such as seasonality … a negative effect of notifications, such as spam that leads to churn … or it may be compensated by a positive improvement elsewhere. For example, a user may use Facebook less as a social network and more as a marketplace.”
  • How will you improve the notification design? “We can investigate historical notification open rates to determine the optimal time to notify users of an expiring listing.”
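As a small illustration of the open-rate idea in the last answer, the sketch below groups a notification log by the hour it was sent and picks the hour with the highest open rate. The log rows and the `(hour_sent, was_opened)` layout are hypothetical, made up purely for this example.

```python
from collections import defaultdict

# Hypothetical log rows: (hour_sent, was_opened). Values are made up.
NOTIFICATION_LOG = [
    (9, True), (9, False), (9, False), (9, False),
    (13, True), (13, True), (13, False), (13, False),
    (20, True), (20, True), (20, True), (20, False),
]


def open_rate_by_hour(log):
    """Return {hour: opened / sent} for every hour that appears in the log."""
    sent = defaultdict(int)
    opened = defaultdict(int)
    for hour, was_opened in log:
        sent[hour] += 1
        opened[hour] += int(was_opened)
    return {hour: opened[hour] / sent[hour] for hour in sent}


rates = open_rate_by_hour(NOTIFICATION_LOG)
best_hour = max(rates, key=rates.get)
print(rates, best_hour)
```

With real data, each hour would need enough notifications for the rates to be comparable, and the historical pattern is only a hypothesis until it is tested in an experiment.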

🍄 Reference: This is one of my favorite mock interviews from Jay Feng.

Part 3. Experiment Design

Your interviewer will assess your ability to answer strategic business questions using an experiment.

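Facebook is a social network platform, so it is important to account for network effects when designing an AB experiment at Facebook: a user in the treatment group can change the behavior of friends in the control group, which biases the comparison.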

🍄 Reference: Understand network effects through this LinkedIn article
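One common way to limit this kind of interference is to randomize whole clusters of connected users instead of individual users, so that friends tend to land in the same arm. The sketch below shows a simple hash-based, cluster-level assignment. The cluster IDs, the `reels_in_feed` experiment name, and the `assign_arm` helper are all hypothetical, and how the clusters are built from the friend graph is out of scope here.

```python
import hashlib


def assign_arm(cluster_id, experiment_name, treatment_share=0.5):
    """Deterministically map a cluster to 'treatment' or 'control'."""
    key = f"{experiment_name}:{cluster_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # First 8 hex chars -> integer in [0, 2**32), scaled to [0, 1)
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


# Hypothetical mapping of users to friend clusters
USER_CLUSTER = {"u1": "c1", "u2": "c1", "u3": "c2", "u4": "c2", "u5": "c3"}

# Every user inherits the arm of their cluster
assignment = {
    user: assign_arm(cluster, "reels_in_feed")
    for user, cluster in USER_CLUSTER.items()
}
print(assignment)
```

Hashing the experiment name together with the cluster ID keeps the assignment stable across sessions and independent across experiments. The trade-off is that the unit of analysis becomes the cluster, so there are fewer independent units than users.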

Example Question: “Facebook is adding Reels on the Newsfeed. How would you measure the effectiveness of this feature in an experiment?”

Framework for AB Testing

  1. Display product understanding: “Reels is a short video format that can now be viewed on the Newsfeed. The goal of adding Reels to the Newsfeed is to increase user engagement and the richness of content.”
  2. State the goal of the AB: “The objective of the AB is to analyze whether the feature has a negative or positive impact, in order to make a ship decision.”
  3. Select and define key metrics: State metrics and classify them into primary, secondary and guardrail metrics for the experiment. Write out the definitions in Coderpad. (See code block below)
  4. Set the Minimum Detectable Effect (MDE): Use data analysis to generate hypotheses and to set the size of metric change we should target to offset the cost of the feature.
  5. Determine sample size: Determine the sample size needed to reach the industry-standard statistical power of 80% at a significance level (alpha) of 0.05.
  6. Consider challenges of network effects: We need to group users into statistically comparable clusters for experimentation to minimize network effects.
  7. State other pitfalls of AB: Seasonality, novelty effect, learning effect, cannibalization, and spill-over effect.
  8. Set cohorts and run the experiment: One example solution is ego-network randomization. Note that a limited number of cluster units may lead to lower power. Understand the trade-off between experiment speed and confidence.
  9. Interpretation of results: “A positive and statistically significant increase in key metrics means that the experiment is a success. However, it is important to note that AB measures the short-term effect, and further analysis is required to understand the long-term effect and what drives the change in metrics.”
## Business Metric: Measures overall business contribution. Improvement in 
## Facebook earns money through ads, and healthy engagement plays a
## critical role in ad revenue.

1. Ad Revenue: Revenue from ads
2. DAU: Platform engagement


## Primary metric: The main metric to consider for feature success.
## We want to drive higher engagement to generate more ad revenue.

1. Newsfeed Duration Per User: The total time a user spends on the Newsfeed in a given period

## Secondary metrics: These metrics help explain shifts in the primary metric
## and provide further understanding of the impact

1. Click-Through Rate (CTR): Whether a user engages with a reel on the Newsfeed
2. Reels Engagement Per User: Clicks, comments, likes, and shares on reels
3. Post Engagement Per User: Captures the trade-off with existing content engagement
4. Reels Viewed Per User: Number of reels viewed per user in a given period

## Guardrail metrics: Safeguard users from a negative experience and from platform
## failures due to engineering (e.g. a bug)

1. Time to Render: The time in milliseconds it takes for a reel to render
2. % of Reels / All Content: We want to keep content on the Newsfeed diverse
3. Sessions Per User: Overall platform engagement should be preserved
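For the sample size step, a back-of-the-envelope calculation is often enough in the interview. The sketch below uses the standard normal-approximation formula for comparing two means, n per arm = 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / MDE^2, with 80% power and alpha of 0.05 as defaults. The standard deviation of 20 minutes and the MDE of 1 minute are placeholder inputs I made up, not real Facebook numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(mde, std_dev, alpha=0.05, power=0.80):
    """Users needed per arm to detect an absolute difference `mde` in a mean.

    Normal-approximation formula for a two-sided test with equal-sized arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    n = 2 * (z_alpha + z_power) ** 2 * std_dev ** 2 / mde ** 2
    return math.ceil(n)


# Placeholder inputs: weekly Newsfeed minutes per user with std dev 20,
# and a minimum detectable effect of 1 minute.
n_per_arm = sample_size_per_arm(mde=1.0, std_dev=20.0)
print(n_per_arm)
```

Halving the MDE roughly quadruples the required sample, which is why the MDE discussion comes before the sample size. If the experiment is randomized by cluster, treat this number as a lower bound, since users within a cluster are correlated and there are fewer independent units.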

Follow-up Question: Be prepared to answer more questions once the AB experiment is completed.

  • “How will you address the different user segments?”
  • “What period will you use to define the metrics?” — Daily, weekly or monthly. In this case, we can use weekly since a user may browse less on certain days.
  • “Given more time to evaluate feature effectiveness, how will you spend the extra time?”
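Once the experiment has run, the interviewer may ask how you would read the result. Here is a minimal sketch of a two-sample z-test on the primary metric, using only the standard library. The weekly minutes per user below are toy values made up for illustration, and the normal approximation really calls for far larger samples than shown here. With cluster randomization, the same comparison should be made on cluster-level averages.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_z_test(control, treatment):
    """Return (difference in means, z statistic, two-sided p-value)."""
    diff = mean(treatment) - mean(control)
    # Standard error of the difference, allowing unequal variances
    se = math.sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value


# Toy weekly Newsfeed minutes per user, made up for illustration
CONTROL = [30, 32, 29, 31, 33, 30, 28, 31]
TREATMENT = [33, 35, 32, 34, 36, 33, 31, 34]

diff, z, p_value = two_sample_z_test(CONTROL, TREATMENT)
print(f"lift = {diff:.1f} minutes, z = {z:.2f}, p = {p_value:.4f}")
```

A significant lift in the primary metric is only half of the ship decision. The guardrail metrics should be checked the same way to confirm that nothing degraded.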

Good luck with your interview!

  • Play around with Meta products to build your product-sense!
  • Follow me on LinkedIn for more data science guides and updates.
