Stats and the City

Maya Shalev Gotlieb
Published in Simply · 6 min read · Dec 16, 2021

How This Article Got Me Working on a Saturday, and the Connection Between Statistics and User Behaviour

A few weeks ago, Spotify launched a great new website presenting top charts for localized markets, with drill-downs by song, artist and album. Soon after that, I came across this TimeOut Tel-Aviv article, claiming that Tel-Aviv is not as special as we all thought, based on the fact that the city’s chart is fairly similar to the national one. I guess what got to me in this article was the way it managed to touch on a combination of three of my favourite things: music, Tel-Aviv, and of course, data.

Let me explain. As a born and raised Tel-Aviv millennial, I was always proud of my city’s international flair, especially when it comes to culture. My friends and I always explored new kinds of music and were the first to go to concerts at Park Hayarkon. That’s part of the reason I found myself working as a data lead at Simply, a musical startup that changes the way we learn and play music. Part of my job is to help decide which content should be added to the app’s courses and library, based on the best data points available, which is why I’m accustomed to drawing conclusions about our learners’ taste in music. So you can imagine my frustration when I read TimeOut’s article: if you’re gonna take a shot at my city, at least get the numbers right. Now, I’m not saying Tel-Aviv is, in fact, more special than any other place, but this article surely isn’t doing much to persuade me to think otherwise.

To me, Spotify Charts is a very useful tool to discover songs and artists from around the globe, but it can’t really be used to characterize and compare musical tastes of Spotify’s listeners.

It’s all about proportion

Spotify’s chart ranks songs by their number of streams, so the song at the top is what statisticians call the mode (the most frequent value in a dataset). In fact, the mode is the only measure of central tendency that you can use with categorical data, such as song names. But as we know, a measure of central tendency alone is not enough to describe a population, let alone compare it to another one. Here’s an example:

Let’s say we want to understand whether two teams of 10 people in an office have similar taste in food, based on their lunch order. In one team, everyone ordered pizza together, while the second team had three people share a pizza and the other seven each ordered something different. So you can say that each team probably has people who like pizza in it, but the variability in preference is larger in the second group. In fact, if we want to compare two sets of categorical data, we need to look at the proportion of values — the number of samples in the mode value, divided by the entire sample size (in our example — 10/10 vs. 3/10). Unfortunately, Spotify only reveals the number of streams in the global charts, so we can’t actually compare the diversity between a country and a big city.
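To make the lunch example concrete, here is a minimal Python sketch that computes the mode and its share of the sample for each team. The function name `mode_share` and the specific dishes ordered by the second team are made up for illustration; only the 10/10 and 3/10 proportions come from the example above.

```python
from collections import Counter


def mode_share(orders):
    """Return the most common value and its share of all samples."""
    counts = Counter(orders)
    value, count = counts.most_common(1)[0]
    return value, count / len(orders)


# Hypothetical lunch orders: 10 people per team, as in the example.
team_a = ["pizza"] * 10
team_b = ["pizza"] * 3 + [
    "salad", "sushi", "burger", "falafel", "pasta", "soup", "sandwich",
]

print(mode_share(team_a))  # ('pizza', 1.0)
print(mode_share(team_b))  # ('pizza', 0.3)
```

Both teams have the same mode, but the share of that mode differs a lot, and this is exactly the information a ranked chart on its own does not show.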

Sometimes, dispersion is key. Consider the over-representation of Black people in US incarceration facilities: the ranking of racial groups by frequency is the same in the US population and in its prisons, but the proportion of Black prisoners is much higher than their share of the general population.

Recency and seasonality

Spotify’s chart ranks the top streamed songs from the past week, which increases the weight of recency and seasonality compared to other effects. There are plenty of variables that influence musical taste, such as age or the decade we were born in, gender, country, etc. But another thing that can influence our taste is simply timing, or what’s on the radio now. Unlike the other variables I listed, the nature of trends is to influence people at a certain point in time. That’s why we can all look back with nostalgia (and a bit of shame) at 2012, the year we all did the Gangnam Style dance, and dismiss it as a passing phase.

At Simply, when working on adding new content to our apps’ courses and song library, we also try to rely on our learners’ musical preferences. However, looking at a narrow range of data might be very misleading for this purpose. If we only look at data from last week, we will be biased by recently released songs (if we’re opportunistic and fast, we can add these songs quickly and enjoy the benefit of their popularity). If we choose to look at the month of December, we might think that holiday songs are extremely popular. Therefore, to capture our audience’s musical taste, we generally try to look at a relatively large timeframe to avoid these biases.
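As a rough illustration of how the time window changes the picture, the sketch below ranks songs from the same play log over a one-week window and over a 90-day window. The play log, the song titles and the `top_songs` helper are all invented for this example; it is not how Simply or Spotify actually compute anything.

```python
from collections import Counter
from datetime import date, timedelta


def top_songs(plays, end, days, n=3):
    """Rank songs by play count within the `days` days ending on `end` (inclusive)."""
    start = end - timedelta(days=days - 1)
    counts = Counter(song for day, song in plays if start <= day <= end)
    return [song for song, _ in counts.most_common(n)]


# Made-up play log of (date, song) pairs.
end = date(2021, 12, 16)
plays = (
    # Steady favourite: one play every other day for about three months.
    [(end - timedelta(days=d), "Evergreen Hit") for d in range(0, 90, 2)]
    # Seasonal song: two plays a day, but only during the last week.
    + [(end - timedelta(days=d), "Holiday Song") for d in range(0, 7)] * 2
    # Fresh release: two plays a day over the last five days.
    + [(end - timedelta(days=d), "New Release") for d in range(0, 5)] * 2
)

print(top_songs(plays, end, days=7))   # ['Holiday Song', 'New Release', 'Evergreen Hit']
print(top_songs(plays, end, days=90))  # ['Evergreen Hit', 'Holiday Song', 'New Release']
```

In this toy data the weekly ranking is led by the seasonal and newly released songs, while the longer window puts the steady favourite back on top.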

Another thing we might be missing in the way we are looking at the chart is the representation of different segments. In general, pop music could be a good representative of the largest segment in the population, but what about those who don’t listen to pop music at all? For example, in the United States there’s a large segment of Christian Rock fans, many of whom don’t listen to mainstream pop music, and their favourite songs will not usually appear at the top of the national charts. So, to get a better representation of musical taste, we might want to dive deeper into segmentation, where we might find colourful scenes of distinct musical preferences. On our apps, Simply Piano and Simply Guitar, apart from the top popular songs, we do our best to include songs that would cater to different people with different tastes, so they could have the best possible learning and playing experience.

And while we’re on the topic of our song library, here’s another dimension that helps us understand which songs are very well known, and which are the ones people really like — a high number of replays per person can indicate that people enjoy listening to the song. On the other hand, it can also make a song seem more widespread than it really is, when in fact there is one segment that listens to it repeatedly (did someone say “Baby Shark”?).
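A small sketch of this idea: given a log of who streamed what, we can separate total streams from the number of distinct listeners and look at streams per listener. The log below, the listener IDs and the `song_stats` helper are hypothetical, and real streaming data would need far more care.

```python
from collections import defaultdict


def song_stats(streams):
    """Summarise total streams, distinct listeners and streams per listener for each song."""
    totals = defaultdict(int)
    listeners = defaultdict(set)
    for listener, song in streams:
        totals[song] += 1
        listeners[song].add(listener)
    return {
        song: {
            "streams": totals[song],
            "listeners": len(listeners[song]),
            "streams_per_listener": totals[song] / len(listeners[song]),
        }
        for song in totals
    }


# Made-up log of (listener_id, song) pairs.
streams = (
    # Two listeners replaying the same song ten times each.
    [(listener, "Baby Shark") for listener in ("u1", "u2") for _ in range(10)]
    # Twelve different listeners playing a song once each.
    + [(f"u{i}", "Pop Hit") for i in range(3, 15)]
)

stats = song_stats(streams)
print(stats["Baby Shark"])  # {'streams': 20, 'listeners': 2, 'streams_per_listener': 10.0}
print(stats["Pop Hit"])     # {'streams': 12, 'listeners': 12, 'streams_per_listener': 1.0}
```

Ranked by streams alone, "Baby Shark" wins here, even though six times as many people listened to "Pop Hit".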

Finally, the number of replays is just one way to research users’ intent. There are other ways to figure out what users really want to listen to: we can look at the songs people are actively searching for, the ones they are clicking, or filter out the ones they are skipping when Spotify plays them automatically (by the way, the Spotify player probably tends to play popular songs after the ones you originally chose, so the bias of counting streams might be increasing itself).

All of these metrics and others can help us characterize our learners’ musical taste and make good decisions when it comes to the content we use on our apps.

So what can we learn from this article?

First, pop music is popular: in Tel-Aviv, in Israel and probably everywhere else. Popular music is, by definition, the music that many people listen to. But a top ten ranking can’t tell us the full story of a population’s musical taste.

Second, if you compare two groups just to say that they’re not special, you should do it with accurate data. And if you have no intention of clearing all the biases out of the way, at least make sure you have a dispersion measurement along with the central tendency.

And lastly, understanding users’ behaviour in order to improve your product is challenging! If you like making sense of data, fighting biases and making it actionable for everyone to rely on in their decision making, you belong with us! Check out our open positions at Simply.
