Reaching for the Sky at our latest recommender systems meetup

Insights from the expert talks at our fifth RecSys London get-together

Jul 4, 2018 · 7 min read

At Bibblio we recently organised our fifth RecSys London meetup, hosted by the team at Sky. The meetup took place at their offices at Thomas More Square, offering a great view of St Katharine Docks.

The presenters for the evening were Dr. Jian Li (Principal Data Scientist at Sky), Preriit Souda (Data Science Consultant and Marketeer) and Adam Rees (Senior Research Engineer at Sky).

Read on to get the highlights of their talks, and grab their presentation slides too.

Content discovery using moods

The first talk of the evening was by Dr. Jian Li, Principal Data Scientist at Sky. He’s the manager of the data science team responsible for machine learning research, innovation and product development for Sky’s content discovery services.

Jian kicked off his presentation with a short introduction to how recommendation engines help customers discover content that matches their particular interests. An interesting use case for Sky is figuring out when to promote new (and so far largely unviewed) content to customers:

While Jian’s team has experimented extensively with a large set of algorithms to improve accuracy, there are still lots of areas to improve. One area Jian is interested in is how to establish natural, smooth and effective communication between customers and Sky’s services. His question is how a content discovery service could respond naturally and accurately to a customer’s natural expression, or mood:

Mood has two aspects, Jian explained. Firstly, it’s the natural expression of a customer’s feelings, for example “I want to watch something funny”. Secondly, it’s the kind of feeling that a piece of content can evoke, e.g. this film will make a customer laugh a lot. The question is how to connect the two.

You could hard-code a mood like ‘funny’ to a movie genre, e.g. ‘comedy’, but this creates some restrictions. On the one hand, it’s not true that only comedies make you laugh; on the other, a mood like ‘exciting’ can’t be translated into a single genre because lots of things can be exciting. During the project Jian wondered whether they could stop this kind of manual translation and directly rank content by how funny, exciting etc. it is. Here’s what he and his team came up with:

They started with a number of pieces of content, each associated with a number of keywords describing the nature of the content and also a list of mood labels. They built a model to learn the correlations between moods and keywords. The outcome is a set of semantic representations of moods. Every mood is represented by all keywords, and each keyword is scored to indicate how relevant it is to a particular mood. If a mood is not relevant, the score will be low (but there’ll still be one).
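The talk didn’t go into which model Sky actually uses, so the following is only a minimal sketch of the idea, under loud assumptions: the toy `CATALOGUE`, the function name `learn_mood_representations` and the smoothed co-occurrence score are all made up for illustration. It shows how every keyword can end up with a score for every mood, low when the two rarely appear together but never missing.

```python
from collections import Counter

# Hypothetical toy catalogue: each title has descriptive keywords and mood labels.
CATALOGUE = [
    {"keywords": {"slapstick", "wedding", "friendship"}, "moods": {"funny"}},
    {"keywords": {"heist", "car chase", "friendship"}, "moods": {"exciting", "funny"}},
    {"keywords": {"haunted house", "ghost"}, "moods": {"scary"}},
    {"keywords": {"car chase", "explosion"}, "moods": {"exciting"}},
]


def learn_mood_representations(catalogue, smoothing=0.1):
    """Score every keyword against every mood.

    Assumption: the score is a smoothed estimate of P(mood | keyword) based on
    simple co-occurrence counts. This stands in for whatever model Sky built.
    """
    moods = sorted({m for item in catalogue for m in item["moods"]})
    keywords = sorted({k for item in catalogue for k in item["keywords"]})

    keyword_count = Counter()  # number of titles carrying each keyword
    pair_count = Counter()     # number of titles carrying keyword AND mood
    for item in catalogue:
        for keyword in item["keywords"]:
            keyword_count[keyword] += 1
            for mood in item["moods"]:
                pair_count[(mood, keyword)] += 1

    # Additive smoothing keeps every score strictly above zero, so an
    # unrelated keyword still gets a (low) score for each mood.
    return {
        mood: {
            keyword: (pair_count[(mood, keyword)] + smoothing)
            / (keyword_count[keyword] + 2 * smoothing)
            for keyword in keywords
        }
        for mood in moods
    }
```

Calling `learn_mood_representations(CATALOGUE)` returns one dictionary per mood, each covering all eight keywords; in this toy data ‘slapstick’ scores high for ‘funny’ while ‘ghost’ scores low but not zero.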

The completed content mood profiles gave the team the ability to create a large number of recommendation features. A customer can query one mood, a combination of multiple moods, or even add preferred weights to the combination, for example “I want to watch something that’s really exciting, somewhat funny and scary too.” Jian’s team is currently working on the UX so the user can query this intuitively.
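As a rough illustration of a weighted mood query, here is a small sketch. Everything in it is an assumption: the `PROFILES` data, the titles, the function name `rank_by_moods` and the choice of a normalised weighted sum are hypothetical, not Sky’s implementation.

```python
# Hypothetical content mood profiles: title -> mood -> score between 0 and 1.
PROFILES = {
    "Title A": {"exciting": 0.9, "funny": 0.4, "scary": 0.7},
    "Title B": {"exciting": 0.3, "funny": 0.9, "scary": 0.1},
    "Title C": {"exciting": 0.8, "funny": 0.1, "scary": 0.9},
}


def rank_by_moods(profiles, query):
    """Rank titles by a weighted average of the requested mood scores.

    `query` maps each requested mood to a non-negative weight, e.g.
    {"exciting": 3, "funny": 1, "scary": 2}.
    """
    total_weight = sum(query.values())
    if total_weight <= 0:
        raise ValueError("query needs at least one positive weight")

    def score(profile):
        # Moods missing from a profile count as zero.
        weighted = sum(w * profile.get(mood, 0.0) for mood, w in query.items())
        return weighted / total_weight

    ranked = [(title, score(profile)) for title, profile in profiles.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

A single-mood query is just the special case with one weight, e.g. `rank_by_moods(PROFILES, {"funny": 1})`, while “really exciting, somewhat funny and scary too” could be expressed as `{"exciting": 3, "funny": 1, "scary": 2}`.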

Also read: Get into the mood and grab Jian’s slides

Knitting data to create a brocade of strategic insights

Our second speaker was Preriit Souda. He’s a data science consultant and marketeer with a sophisticated social media analytics toolkit. He and his team have analyzed more than 20 TB(!) of post-processed social media data. His talk focused on how you can use metadata to deliver improved, customised advertising experiences.

Preriit kicked off his presentation by asking the attendees how much you can find out about a person by looking at just one of their tweets. The answer is quite a lot:

As you can see in the above slide, there are different data linkage groups, such as image mining and information connected to text, weather, location and demographic attributes. See below for the overview of Preriit’s framework. On the right-hand side you’ll see other data sources you can use: thematic databases that are not directly linked to your research subject, but are very useful for finding out more about people who are like them.

Preriit is very skeptical about the effectiveness of marketing surveys, and he suggested that using these novel techniques could effectively make them redundant.

So how can you use this data to solve companies’ strategic problems? Preriit spent the last part of his presentation going through a couple of use cases. One of the use cases was a restaurant with falling sales. They used the data sources below to figure out what was going wrong:

By analyzing the data (and without having to actually ask anyone directly), they found that the restaurant was struggling to differentiate itself. There were also many questions over its premium pricing in the chatter and review data they picked up. Uncovering these two main issues allowed this client to take informed action.

The restaurant kicked off a re-branding exercise and added new, affordable, seasonal dishes to its menu. These had actually been wishes expressed by people online. Using data like this offers a very practical way to learn about problems and come up with solutions.

Also read: Other use cases and more info on the framework in Preriit’s presentation

Exploring movie posters with neural networks

Last up was Adam Rees, Senior Research Engineer at Sky. He discussed his (awesome) side project: using neural networks to find patterns in the design of movie posters. His work on this project was inspired by a blog post he read about movie poster clichés. And there are many. I’m sure you’ll all be familiar with ‘back-to-back couple’ and ‘shot through the legs’… Adam created some very cool visualizations clustering the different types:

Even though the creativity is often questionable, Adam stressed the importance of movie posters and the role they play in performance, including on Sky’s streaming service. On their Cinema platform, the poster images are used as the main navigation as users scan for something to watch:

With a Coursera course in machine learning under his belt, and interested in finding out more about the makeup of Sky’s movie poster offering, Adam decided to first have a go at clustering by similarity. He used the final hidden layer of a pre-trained VGG19 neural network to extract features, which were then used to cluster the movie posters:

Using Jupyter (Python), Keras and TensorFlow, together with the t-SNE (t-Distributed Stochastic Neighbour Embedding) algorithm, Adam could plot the images in a 2D environment. Here’s one of the beautiful first results:
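For readers who want to try something similar, the pipeline described above (features from the final hidden layer of a pre-trained VGG19, then t-SNE down to 2D) could be sketched roughly as follows. This is not Adam’s code. The function names `extract_features` and `embed_2d`, the 224×224 input size, the perplexity default and the use of scikit-learn for t-SNE are assumptions made for illustration, and the Keras imports assume a recent TensorFlow 2.x install.

```python
import numpy as np
from sklearn.manifold import TSNE


def extract_features(image_paths):
    """Return one 4096-d vector per poster from VGG19's last hidden layer ('fc2')."""
    # Imported lazily so the t-SNE step can be tried without TensorFlow installed.
    from tensorflow.keras import Model
    from tensorflow.keras.applications.vgg19 import VGG19, preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img

    base = VGG19(weights="imagenet")  # downloads the ImageNet weights on first use
    extractor = Model(inputs=base.input, outputs=base.get_layer("fc2").output)

    # VGG19 with its fully connected layers expects 224x224 RGB input.
    batch = np.stack(
        [img_to_array(load_img(path, target_size=(224, 224))) for path in image_paths]
    )
    return extractor.predict(preprocess_input(batch))


def embed_2d(features, perplexity=30.0, seed=0):
    """Project feature vectors to 2D coordinates with t-SNE."""
    features = np.asarray(features, dtype=np.float32)
    # scikit-learn requires the perplexity to be smaller than the sample count.
    perplexity = min(perplexity, features.shape[0] - 1)
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(features)
```

With a list of poster files, `embed_2d(extract_features(paths))` gives one (x, y) pair per poster, which can then be used to place thumbnails on a scatter plot.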

Using the Neural Style Transfer technique to amplify the clustering, Adam identified groups of posters with focal points such as ‘dogs’, ‘Marilyn Monroe characters’, ‘guys with hats’ and ‘gangs’. He described other major themes as ‘Gritty Crime Dramas’, ‘Romantic Boxes’ and ‘Spooky Houses’:

Adam isn’t sure where he will take the project next. An idea suggested in the Q&A was to actually team up with Jian’s research team, to discover whether there’s a relationship between the imagery and text or mood data allocated to it. Could it help in creating better recommendations or open the pathway towards a personalized movie poster?

Also read: Find out more about Adam’s work with convolutional networks in his presentation

Originally published at

Written by
Robbert van der Pluijm
Head of Bibblio Labs & Events
Make the most of every visitor

Bibblio boosts engagement and revenue by suggesting the best content to your users from across your site. Visit us or follow us on Twitter and LinkedIn.

The Graph

Smart thoughts on the future of digital publishing

