Spotify Data Science Interview Questions

200 Million Active Users consume music on Spotify every month.

Vimarsh Karbhari
Acing AI
3 min read · Jan 29, 2019


Spotify has become a household name in the US and in most parts of the world. It is available in 78 markets globally and has over 40 million tracks on its platform. With all the data that this gigantic platform generates, Spotify is a dream for any Data Scientist. Spotify strives to be entirely data driven. Data Science is not just an additional team within Spotify; it is part of the core team that drives the product. What you should listen to next and what you might like from your friends' playlists are a few of the questions that Spotify's Data Science team answers.

Interview Process

The first part of the interview consists of a technical screen and a hiring manager interview. The technical screen consists of coding, SQL and basic data science interview questions. There may be a seven-day take-home test after this interview. This is followed by the onsite interview. The onsite interview process starts with a Data Science assignment, which has to be presented at the onsite interview. This is a great way to stand out and showcase a Data Scientist's eye for a dataset. The presentation is given in front of different business, product and data team members, followed by a healthy Q&A about the presentation and the assignment.

Important Reading

Source: Spotify’s Event Delivery

Spotify is built on Google Cloud infrastructure. Google Cloud has strong ML capabilities, which makes it a natural fit for Spotify. Spotify has also built several libraries and open sourced them.

  • Easier chart creation for Python: Chartify (open source)
  • Complex pipelines of batch jobs with Hadoop support: Luigi
  • Spotify's tool for testing based on finite state machines: GraphWalker

AI/Data Science Related Questions

  • How to estimate the parameters in a uniform distribution for a given data set?
  • How would you leverage Spotify’s data? (Something apart from what Spotify already does)
  • Given n samples from a uniform distribution [0, d], how to estimate d?
  • Given a sample set of tables, write a SQL query to get a summary metric from those tables.
  • What data would you look at to answer this question? (Giving a question based on a problem/data science assignment)
  • How can we deal with extreme values in data?
  • Write a function that returns the value in the cth column of the rth row of Pascal's triangle.
  • How would you design a system to hold a streaming database of billions of tweets, to allow for rapid querying?
  • How would you create a music recommendation algorithm? What data would you need?
  • How would you detect anomalous behaviour on a user account?
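
Two of the questions above have compact answers that can be sketched in code. The snippet below is a minimal sketch, not an official Spotify answer: the function names `pascal_entry` and `estimate_uniform_upper` are made up for illustration, rows and columns are assumed to be 0-indexed, and the samples are assumed to be independent draws from Uniform[0, d]. For the uniform question, the maximum likelihood estimate of d is the sample maximum, which tends to undershoot d; scaling it by (n + 1) / n gives an unbiased estimate.

```python
import math


def pascal_entry(r: int, c: int) -> int:
    """Return the entry in row r, column c of Pascal's triangle.

    Assumption: both r and c are 0-indexed, so row 0 is [1].
    """
    if r < 0 or c < 0 or c > r:
        raise ValueError("need 0 <= c <= r")
    # The entry is the binomial coefficient "r choose c" (Python 3.8+).
    return math.comb(r, c)


def estimate_uniform_upper(samples):
    """Estimate d for samples assumed to be i.i.d. from Uniform[0, d].

    Returns a pair (mle, unbiased):
      mle      -- the sample maximum (maximum likelihood estimate)
      unbiased -- the maximum scaled by (n + 1) / n, since E[max] = n / (n + 1) * d
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mle = max(samples)
    unbiased = mle * (n + 1) / n
    return mle, unbiased
```

For example, `pascal_entry(4, 2)` looks up the middle of the row 1 4 6 4 1, and `estimate_uniform_upper([1.0, 2.0, 3.0])` returns both estimates for a three-point sample. In an interview it is worth saying out loud which estimator you are using and why, since the follow-up is usually about bias.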

Reflecting on the Questions

The data science team at Spotify are great citizens of the Data Science community, as they open source some of the libraries they use. Data guides and drives the core product itself, which is used by millions of users. A Data Scientist at Spotify is going to have an impact on how music is consumed by millions. It is remarkable to have that kind of impact from your work. A great eye and ear for data can surely land you a job at the world's largest music streaming service.


The sole motivation of this blog article is to learn about Spotify and its technologies and to help people get hired there. All data is sourced from public online sources. I aim to make this a living document, so updates and suggested changes can always be included. Please provide relevant feedback.
