Lookalikes : Birds of a feather flock together

Anuja Korade
Published in Evolve You
4 min read · Oct 22, 2020

Ever wondered how companies like Amazon and Netflix know precisely what you want?

No, they don’t employ psychics. They use something far more magical — Mathematics!

Traditional systems discover overlapping audiences that are quantitatively similar. Say Alice and Bob are the same age; such a system considers them similar. But what if they are demographically dissimilar (i.e., they have different ages) yet share common interests?

These systems fail to map qualitatively similar audiences. Intuitively, Alice and Bob could be different ages yet share an affinity for sports and fitness, like two peas in a pod.

Heuristic algorithms map users based on their behavioural patterns rather than simply on their attributes.

Quaero’s look-alike system is based on Matrix factorisation and pairwise user-2-user similarities.

The core principle of this model is user segment approximation, where user segments can be user characteristics such as user interest categories. For example, if a user likes NBA games, this user is considered to belong to the Sports segment.
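To make the segment idea concrete, here is a minimal sketch of mapping a user's interests to segments. The mapping table and the function name `segments_for` are hypothetical and only illustrate the NBA example above; they are not Quaero's actual taxonomy or code.

```python
# Hypothetical interest-to-segment table (illustrative only, not a real taxonomy).
INTEREST_TO_SEGMENT = {
    "NBA games": "Sports",
    "marathon training": "Sports",
    "action movies": "Entertainment",
}


def segments_for(interests):
    """Return the sorted list of segments implied by a user's interests."""
    return sorted(
        {INTEREST_TO_SEGMENT[i] for i in interests if i in INTEREST_TO_SEGMENT}
    )


# A user who likes NBA games lands in the Sports segment.
alice_segments = segments_for(["NBA games", "action movies"])
print(alice_segments)
```

Interests that are missing from the table are simply ignored in this sketch; a real system would more likely infer segments from behaviour than from a fixed lookup.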

This model supports many business objectives like targeting users who fall in a common category of past purchasers, web service subscribers, installers of particular apps, brand lovers, ad clickers, supporters for a politician, fans of a sports team, etc.

The goal here is to address a statistical audience and measure the look-alikeness of segments. The model assigns a similarity index to each audience. The dataset covers different sorts of user behavior, such as purchase history, watching habits and browsing activity. Marketing data is 99% sparse, and high dimensionality is a common challenge when processing data from large, complex systems.
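Sparsity here means the share of user-item cells that hold no interaction. A small sketch of how one might measure it, using a made-up interaction matrix (the values and the helper name `sparsity` are assumptions for illustration):

```python
def sparsity(matrix):
    """Fraction of cells in a user-by-item matrix that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


# Toy interactions: rows are users, columns are items, 1 means "interacted".
interactions = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]

print(f"sparsity: {sparsity(interactions):.2%}")
```

In this toy matrix 10 of 12 cells are empty; real marketing data has far more users and items, which is where the very high sparsity comes from.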

Here, Singular Value Decomposition (SVD) is used as a collaborative filtering technique. The whole point of a dimension reduction model is to represent the data mathematically in a simpler form. It’s as if we took a very high-resolution photograph, resized it to be smaller, and then deleted the original.
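As a rough sketch of the idea (not the production model), the snippet below applies a truncated SVD to a tiny, made-up user-by-item matrix with NumPy. The matrix values, the choice of k = 2 and the helper name `truncated_svd` are assumptions for illustration.

```python
import numpy as np

# Toy user-by-item matrix (rows: users, columns: items); values are made up.
R = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 4.0],
    [0.0, 0.0, 4.0, 5.0],
])


def truncated_svd(matrix, k):
    """Keep only the k largest singular values and their vectors."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


U_k, s_k, Vt_k = truncated_svd(R, k=2)

# Each user is now described by k latent factors instead of 4 item columns.
user_factors = U_k * s_k

# The low-rank reconstruction is the "resized photograph" of the original.
R_approx = user_factors @ Vt_k
print("reconstruction error:", np.linalg.norm(R - R_approx))
```

The reconstruction is not exact: the detail carried by the discarded singular values is lost, which is the trade-off the photograph analogy describes.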

Benefits of Singular Value Decomposition:

Eliminates collinearity: Consider an employee's monthly salary and annual salary. The annual salary is 12 times the monthly salary, so storing both is redundant. SVD eliminates this redundancy and captures the salient features of the data.

Reduction in computational cost: By getting rid of redundancy, SVD reduces computational cost.

Overcomes the curse of dimensionality: We are all busy people with places to go and folks to see. We want our information quick and to the point. That is the essence of dimensionality reduction. When confronted with a ton of data, we can use dimensionality reduction algorithms to make the data “get to the point”.

Finds “natural patterns” in data: Let’s say Alice and Bob have similar interests in action movies. Based on their ratings for different movies, the system learns that Alice and Bob share similar tastes.
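The salary example from the list above can be checked numerically. In this sketch (salary figures are invented), the two columns are perfectly collinear, so the second singular value is effectively zero and one dimension carries all the information.

```python
import numpy as np

# Made-up monthly salaries for four employees.
monthly = np.array([3000.0, 4500.0, 6000.0, 5200.0])

# Two columns: monthly salary and annual salary (12 x monthly).
X = np.column_stack([monthly, 12 * monthly])

# Singular values only; the second one is numerically zero.
singular_values = np.linalg.svd(X, compute_uv=False)
rank = np.linalg.matrix_rank(X)

print("singular values:", singular_values)
print("rank:", rank)
```

A rank of 1 for a two-column matrix is what "redundant information" looks like to SVD: dropping the near-zero singular value loses nothing.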

Having found the latent patterns in the data, we wanted to associate a value with them. A straightforward approach is to compare all pairs of seed users and available users in the system, then determine look-alikeness based on distance measurements. A simple similarity-based look-alike system can use direct user-2-user similarity to group similar profiles.
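A minimal sketch of the pairwise comparison described above. The article does not name the distance measure, so cosine similarity is an assumption here, as is averaging each candidate's similarity over all seed users; the user names and vectors are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lookalike_scores(seed_users, candidates):
    """Score each candidate by its mean similarity to the seed users (assumed aggregation)."""
    return {
        name: sum(cosine_similarity(vec, seed) for seed in seed_users) / len(seed_users)
        for name, vec in candidates.items()
    }


# Latent-factor vectors, e.g. taken from a dimension-reduction step (made-up values).
seeds = [[1.0, 0.1], [0.9, 0.2]]
candidates = {"carol": [0.8, 0.1], "dave": [0.1, 0.9]}

scores = lookalike_scores(seeds, candidates)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Comparing every seed with every candidate scales with the product of the two group sizes, which is one reason to work with short latent vectors rather than raw behavioural data.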

Does this approach work?

We assessed our model on different segments/audiences, running tests on segments with an affinity for politics and on segments with similar purchase histories. Our system produces Lookalike scores; a score signifies, for example, that people in the sports and fitness segment are X times more likely than the general population to be in the market for sports apparel. The system then helps find a Lookalike audience segment in the universe dataset.

The test results indicate that our method can achieve more than a 50% lift in targeting over other existing audience extension models.
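One common way to read "X times more likely than the general population" is as a ratio of two rates. The sketch below computes that ratio; the counts are invented and the function name `likelihood_ratio` is hypothetical, so treat it as an illustration of the arithmetic rather than the scoring used in our system.

```python
def likelihood_ratio(segment_in_market, segment_size,
                     population_in_market, population_size):
    """How many times more likely a segment member is to be in-market than the population."""
    segment_rate = segment_in_market / segment_size
    baseline_rate = population_in_market / population_size
    return segment_rate / baseline_rate


# Made-up counts: 300 of 1,000 segment members vs 1,000 of 10,000 people overall.
x = likelihood_ratio(300, 1000, 1000, 10000)
print(f"about {x:.1f}x more likely than the general population")
```

With these numbers the segment rate is 30% against a 10% baseline, so X comes out at roughly 3.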

“The size of your audience doesn’t matter.

What’s important is that your audience is listening”

-Randy Pausch

Lookalike modelling is one of the most compelling ways to extend your customer base and find new people who resemble your existing customers. To wrap up, this is how companies like Google, Facebook, LinkedIn, Amazon, Instagram and Snapchat reach out to you, because you matter the most to them! :D

I hope you enjoyed this article. Feel free to leave a comment or share this post. Follow me for future posts.

Anuja Korade
Evolve You

Data Scientist | Product Evangelist | Innovation