A Personalised Recommender from the BBC

Matt Crooks
Published in BBC Data Science
9 min read · Sep 3, 2019

The BBC produces fantastic content that appeals to a mass audience, such as Killing Eve and Bodyguard on BBC iPlayer, the Danger Mouse games for CBBC, and Match of the Day on BBC Sport, to name a few. Episode 1 of Killing Eve, for example, attracted a whopping 26% of the TV audience during its first transmission. While it’s great that we can produce content that is enjoyed by so many different people, it does create data science challenges around understanding the personalities and preferences of individuals.

Term frequency-inverse document frequency (tf-idf) is a metric most often associated with text analysis, where it is commonly used to build a rudimentary search engine. It assigns each text document a weight that tells us how relevant that document is to a particular search term. The success of tf-idf is largely down to the inverse document frequency part, which penalises words in the search term that appear in many documents. Common words such as “the” carry far less information than more niche words like “BBC”. This is how tf-idf differs from simply counting the occurrences of the search terms in each document.
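
To make the search-engine idea concrete, here is a minimal sketch of tf-idf scoring in plain Python. It assumes one common variant of the formula (term count divided by document length, multiplied by the natural log of the number of documents over the number of documents containing the term); other weightings exist. The function name `tfidf_scores` and the three toy documents are made up for illustration and are not BBC code or data.

```python
import math
from collections import Counter


def tfidf_scores(documents, query):
    """Score each document against a query with a basic tf-idf weighting.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)

    # Document frequency: how many documents contain each word at least once.
    doc_freq = Counter()
    for tokens in tokenised:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Skip terms that appear nowhere, and guard against empty documents.
            if doc_freq[term] == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            score += tf * idf
        scores.append(score)
    return scores


# Hypothetical toy corpus: "the" is in every document, "bbc" in only one.
documents = [
    "the bbc makes the news",
    "the cat sat on the mat",
    "the dog chased the cat",
]
scores = tfidf_scores(documents, "the bbc")
for doc, score in zip(documents, scores):
    print(f"{score:.3f}  {doc}")
```

Because “the” appears in every document, its inverse document frequency is ln(1) = 0 and it contributes nothing to any score. Only the first document, the one containing “bbc”, ends up with a non-zero weight, whereas a simple count of matching words would also have credited the other two documents for their uses of “the”.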

Here at the BBC, we’re using tf-idf for an entirely different application: recommender systems.

By analysing how our audience interacts with our content, we can infer similarity between different TV shows on BBC iPlayer, or articles on BBC News. This allows us…
