Twitter Data Science Interview Questions — Acing the AI Interview

Last Earnings, Twitter Inc. soared the most since its market debut in 2013 after it posted the first revenue growth in four quarters, driven by improvements to its app and added video content that are persuading advertisers to boost spending on the social network — Bloomberg

Twitter has one of the biggest data sets in the world. It differs from Facebook in that Twitter is real time. Twitter data sets are rich troves of information, and analyzing one to draw useful insights makes a good portfolio project to showcase. One can get Twitter data here.

At Acing AI, the aim is to help you get into Data Science and AI. I have profiled some of the best technology companies and written articles about AI interviews at Microsoft, Google, Amazon, LinkedIn, eBay, Twitter, Walmart, Apple, Facebook, Zillow, Salesforce, Uber, Intel, Adobe, Tesla and most recently IBM. This has led to my becoming the top writer in Artificial Intelligence on Medium. The AI interview preparation guides, Part 1 and Part 2, go over the details that help you ace any AI interview. Acing AI Portfolios helps you showcase your AI work. Expert interviews and analyses give you a sneak peek into the lives of AI/Data Science leaders and into AI tech companies. Now onto the Twitter Data Science Questions article…

Interview Process

The interview process usually consists of a phone interview with the hiring manager, followed by on-site interviews with engineers and data scientists. The questions are usually algorithmic in nature and include some machine learning questions, math/application-based questions, and one system design question about working on a distributed system to deliver machine learning at high scale.

Important Reading

  1. Tips for using the Twitter APIs: Twitter Data Developers Blog
  2. All Twitter Dev Libraries (Including Python): Twitter Developer Utilities
  3. Twitter Data Case Studies: Use Cases to inform business decisions

AI/Data Science Related Questions

  • Given a 2-column file with user codes and counts, retrieve the top-k users based on a score that is a function of the number of times they appear in the file and of these counts.
  • Given a list of all follow relationships in the format 123, 345;234, 678;345, 123;… where the first column contains the ID of the follower and the second contains the ID of the user being followed, find all mutual follows (the pair 123, 345 in the example above). Then do the same for the case when this list does not fit into memory.
  • Design a system to find the top 10 Twitter hashtags in the most recent 1 min, 10 min, 1 hr…
  • Given Twitter user data, how would you measure engagement?
  • How can you illustrate a tree-based system with a SQL query?
  • How would you combine two datasets?
  • What features would you use to build a recommendation algorithm for users?
  • What would you change in the Twitter app?
  • How would you test whether the proposed change is effective? (This relates to the previous question.)
  • Find the median of a large dataset.
  • If you got the job at Twitter and had access to all of its data, what kind of data analysis would you like to perform?
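
The mutual-follows question above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a reference answer: the function names (`parse_edges`, `mutual_follows`, `mutual_follows_partitioned`) and the `num_partitions` parameter are illustrative, and the out-of-memory case is only simulated, with in-memory lists standing in for the partition files a real solution would write to disk.

```python
from collections import defaultdict


def parse_edges(raw):
    """Parse '123, 345;234, 678' into (follower, followed) integer pairs."""
    edges = []
    for record in raw.split(";"):
        record = record.strip()
        if not record:
            continue  # tolerate a trailing ';'
        follower, followed = (int(part) for part in record.split(","))
        edges.append((follower, followed))
    return edges


def mutual_follows(edges):
    """Return mutual pairs as (smaller_id, larger_id), using one pass and a set."""
    seen = set()
    mutual = set()
    for follower, followed in edges:
        if follower == followed:
            continue  # a self-follow is not a mutual pair
        if (followed, follower) in seen:
            mutual.add((min(follower, followed), max(follower, followed)))
        seen.add((follower, followed))
    return mutual


def mutual_follows_partitioned(edges, num_partitions=4):
    """Variant for data larger than memory, simulated with in-memory lists.

    Assumption: in practice each partition would be a file on disk (or a
    shard on another machine) and would be loaded one at a time.
    """
    partitions = defaultdict(list)
    for follower, followed in edges:
        # Normalizing the key sends (a, b) and (b, a) to the same partition.
        key = (min(follower, followed), max(follower, followed))
        partitions[hash(key) % num_partitions].append((follower, followed))

    mutual = set()
    for bucket in partitions.values():
        mutual |= mutual_follows(bucket)
    return mutual


if __name__ == "__main__":
    sample = parse_edges("123, 345;234, 678;345, 123")
    print(mutual_follows(sample))
    print(mutual_follows_partitioned(sample))
```

Because both directions of a pair share the same normalized key, they always land in the same partition, so each partition can be checked independently and only one needs to be held in memory at a time.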

Reflecting on the Questions

Twitter asks a range of complex coding questions from a data science perspective. The Twitter Data Blog has a collection of great use cases and GitHub repos that are useful for hands-on work with the platform. Working through them will help you learn more about the platform and answer some of the Twitter-specific questions. I would strongly encourage checking those out.

The sole motivation of this blog article is to learn about Twitter and its AI technologies and to help people get into it. All data is sourced from public online sources. I aim to make this a living document, so updates and suggested changes can always be included. Please provide relevant feedback.