66DaysOfData Challenge - Data Science Interview Questions - Day 27

Matt Chang
4 min read · Oct 1, 2023


Greetings everyone!👋

Sorry for the delay in posting; I’ve been busy preparing for my upcoming Master’s degree in data science, and I’ll be sharing what I learn here. Alright, enough talk. Let me introduce myself, and then we’ll get into today’s interview question.

My name is Matt. I used to be a teacher with no technical background, but I decided to transition into a career as a data engineer and worked for a while as a Satellite Data Engineer at FCU (Feng Chia University). Now I’ve stepped out of my comfort zone again to pursue a Master’s degree in data science in the UK (there will definitely be more of my data science journey in future articles).

This is what has motivated me to accept the 66DaysOfData challenge.

I recently came across a video about building a habit of learning data science. Inspired by its author, Ken Jee, and by Tim Webster, the author of 5 Tips to Make Data Engineering a Marathon, Not a Sprint, I decided to take on this challenge. I aim to post three to four data science interview questions from Stratascratch every week, all drawn from big tech companies such as FAANG, and I’ll be using AI tools to speed up my learning on specific topics.

Right! Before diving into the Day 27 question, make sure you’ve done Day 26.

LET’S DIVE IN!

Photo by Jenny Hill on Unsplash

Company: Meta/Facebook

Question type: System Design

Question level: Medium

Job Title: Data Scientist / ML Engineer

Question:

How would you compare the relative performance of two different backend engines for automated generation of Meta/Facebook “Friend” suggestions?

Suggested answer:

Comparing the relative performance of two different backend engines for generating Meta/Facebook “Friend” suggestions requires a multi-faceted approach. Here’s a structured way to compare their performance:

Objective Definition:

Before diving into the comparison, clearly define what “better performance” means. Is it faster processing times? More relevant friend suggestions? Higher user engagement with the suggestions?

Technical Performance Metrics:

  1. Latency: Measure the time it takes for each backend engine to generate and deliver friend suggestions once triggered. A faster engine can provide a better user experience (see the latency sketch after this list).
  2. Scalability: Test how each system scales with increasing data. For instance, see how each system performs when the number of users doubles or triples.
  3. Resource Utilization: Track CPU, memory, and storage usage. An efficient system uses fewer resources, which can translate to cost savings.
  4. Reliability: Measure the uptime of each system and how often they encounter errors or crashes.
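
To make the latency comparison concrete, here is a minimal sketch, assuming per-request latencies (in milliseconds) have been logged for both engines under comparable load. The engine names and the simulated samples are made up for illustration; in practice the arrays would come from production logs.

```python
# Rough latency comparison: assumes per-request latencies (ms) were logged
# for each engine under the same load. Data below is simulated, not real.
import numpy as np

def latency_summary(latencies_ms):
    """Return median and tail latencies (p50/p95/p99) for one engine."""
    arr = np.asarray(latencies_ms)
    return {p: round(float(np.percentile(arr, p)), 1) for p in (50, 95, 99)}

# Simulated samples standing in for real production logs.
rng = np.random.default_rng(42)
engine_a = rng.lognormal(mean=4.0, sigma=0.4, size=10_000)  # median ~55 ms
engine_b = rng.lognormal(mean=4.2, sigma=0.5, size=10_000)  # median ~67 ms

print("Engine A percentiles (ms):", latency_summary(engine_a))
print("Engine B percentiles (ms):", latency_summary(engine_b))
```

Comparing tail percentiles (p95/p99) rather than averages matters here, because a small fraction of very slow responses can hurt the user experience even when the mean looks fine.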

Quality of Suggestions:

  1. Relevance: Organize user surveys or focus groups to determine which engine provides more relevant and meaningful friend suggestions.
  2. Diversity: If promoting diverse connections is a goal, evaluate which engine offers a broader range of friend suggestions across different user groups.
  3. Novelty: Users might appreciate new and unexpected, yet relevant, suggestions. Measure how often users get suggestions outside of their immediate network (a quick novelty-metric sketch follows this list).
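
As a rough sketch of the novelty idea, the snippet below estimates what share of an engine’s suggestions fall outside a user’s friends-of-friends circle. It assumes we can pull each user’s friend set from a graph snapshot; the `friends` and `suggestions_a` dictionaries are toy data for illustration, not any real API.

```python
# Hypothetical novelty metric: fraction of suggestions that are NOT
# friends-of-friends of the target user. All data here is invented.

def novelty_rate(suggestions, friends):
    """Share of suggested users outside each target user's friends-of-friends set."""
    novel, total = 0, 0
    for user, suggested in suggestions.items():
        fof = set()
        for f in friends.get(user, set()):
            fof |= friends.get(f, set())
        for s in suggested:
            total += 1
            if s not in fof:
                novel += 1
    return novel / total if total else 0.0

# Toy graph and suggestions for one user.
friends = {"u1": {"u2", "u3"}, "u2": {"u1", "u4"}, "u3": {"u1"}}
suggestions_a = {"u1": ["u4", "u9"]}  # u4 is a friend-of-friend, u9 is novel
print(novelty_rate(suggestions_a, friends))  # 0.5
```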

User Engagement Metrics:

  1. Acceptance Rate: Track the proportion of suggestions that users follow up on by actually sending a friend request.
  2. Click-Through Rate (CTR): Measure how often users click on a friend suggestion to view the profile.
  3. Ignore Rate: Track how often users dismiss or ignore a friend suggestion.
  4. Feedback: Some platforms allow users to provide feedback on suggestions. Collect and compare this data.
  5. A/B Testing: If feasible, conduct A/B tests where one group of users gets friend suggestions from Engine A and another group gets suggestions from Engine B. Track engagement metrics and gather feedback (a minimal analysis sketch follows this list).
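
For the A/B test, here is a minimal sketch of how the two engines’ acceptance rates could be compared with a two-proportion z-test. The counts are invented purely for illustration.

```python
# Minimal A/B analysis sketch: compare acceptance rates (friend requests
# sent per suggestion shown) with a two-proportion z-test. Counts are made up.
from math import sqrt, erfc

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two acceptance rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value, normal approximation
    return z, p_value

# Engine A: 4,200 requests sent out of 50,000 suggestions shown.
# Engine B: 3,900 requests sent out of 50,000 suggestions shown.
z, p = two_proportion_ztest(4_200, 50_000, 3_900, 50_000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

The same test can be repeated for CTR or ignore rate, and practical significance (the size of the lift) matters as much as statistical significance.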

Integration and Maintenance:

  1. Compatibility: Assess how well each engine integrates with the existing infrastructure.
  2. Maintainability: Determine the ease of maintaining each system, considering updates, patches, and potential changes.
  3. Flexibility: Evaluate how easy it is to modify or customize each engine based on evolving needs or feedback.

Cost:

  1. Operational Costs: Compare the cost of running each engine, including hardware, cloud costs, licensing, etc.
  2. Development Costs: Factor in the costs of developing, integrating, and maintaining each engine.

Security and Privacy:

  1. Security: Assess the security features of each engine, especially if they’re handling sensitive user data.
  2. Privacy: Ensure that friend suggestions don’t inadvertently expose information that a user would prefer to keep private.
  3. Compliance: Ensure both engines comply with privacy regulations like GDPR, CCPA, etc.

Feedback from Product Teams and Engineers:

  1. Ease of Development: Gather feedback from the development team on the ease of integrating and working with each engine.
  2. Documentation and Support: Quality documentation and robust community or vendor support can be invaluable in resolving issues.

By evaluating the two engines across these diverse criteria, you’ll get a holistic view of their performance. Remember, the “best” engine might not excel in every category but should align most closely with the platform’s objectives and constraints.

Feel free to drop me a question or comment below.

Cheers, happy learning. I will see you tomorrow.

The data journey is not a sprint but a marathon.

Medium: MattYuChang

LinkedIn: matt-chang

Facebook: Taichung English Meetup

(I created this group four years ago for people who want to hone their English skills. Events are held regularly by our awesome hosts every week. Follow the FB group link for more information!)
