What’s in a name? Plenty of possibilities for analytics

A. Methodology

At Loki.ai, we analyzed more than 100 million names from publicly available sources to develop algorithms that predict audience demographics and affluence based on the names of individual users.

Screenshot of one of the electoral rolls that we scraped and analysed

A1. Age Prediction

Intuitively, most Indians recognize that names like “Shubham” and “Rishabh” are younger and more modern names, while names like “Om” and “Shashi” are older names.

Age Distribution for the name “Abhishek” among adults in Delhi
Loki.ai age-index for selected names
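
As a rough illustration of how an age signal can be derived from names, here is a simplified sketch. It is not the actual Loki.ai model. It assumes electoral-roll rows have been reduced to (first name, age) pairs and defines a hypothetical index as a name's median age divided by the overall median age. The records, function names and index definition are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records of (first_name, age); NOT real electoral-roll data.
RECORDS = [
    ("Shubham", 21), ("Shubham", 24), ("Shubham", 27),
    ("Om", 58), ("Om", 63), ("Om", 49),
    ("Abhishek", 26), ("Abhishek", 31), ("Abhishek", 35),
]


def median_age_by_name(records):
    """Group ages by normalised first name and return each name's median age."""
    ages = defaultdict(list)
    for name, age in records:
        ages[name.strip().lower()].append(age)
    return {name: median(values) for name, values in ages.items()}


def age_index(records):
    """Hypothetical index: a name's median age divided by the overall median age.

    Values below 1 suggest a younger-than-typical name, values above 1 an older one.
    """
    overall = median(age for _, age in records)
    return {name: m / overall for name, m in median_age_by_name(records).items()}


index = age_index(RECORDS)
for name, value in sorted(index.items(), key=lambda kv: kv[1]):
    print(f"{name}: {value:.2f}")
```

With real data, names that appear only a handful of times would need a minimum-count threshold or some smoothing before their index could be trusted.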

A2. Affluence Estimation

We also overlaid electoral roll data with data scraped from property sites to discover correlations between where different communities lived, and where property and rental prices were the highest.

The most common surnames in different wards in Delhi
Average CBSE Scores for different first names
Sample outputs from our APIs
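
The overlay described above can be sketched as a weighted average: for each surname, average the property price of every ward where it appears, weighted by how many voters with that surname live there. This is a minimal sketch under stated assumptions, not the production pipeline. The ward names, surnames, prices and the `affluence_by_surname` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical average property price per ward (rupees per square foot).
WARD_PRICE = {"Ward A": 20000, "Ward B": 8000}

# Hypothetical (ward, surname, voter count) rows derived from electoral rolls.
SURNAME_COUNTS = [
    ("Ward A", "Surname X", 300),
    ("Ward B", "Surname X", 100),
    ("Ward A", "Surname Y", 50),
    ("Ward B", "Surname Y", 450),
]


def affluence_by_surname(counts, ward_price):
    """Voter-weighted average ward price for each surname."""
    weighted_sum = defaultdict(float)
    totals = defaultdict(int)
    for ward, surname, voters in counts:
        if ward not in ward_price:
            continue  # skip wards with no price data
        weighted_sum[surname] += voters * ward_price[ward]
        totals[surname] += voters
    return {surname: weighted_sum[surname] / totals[surname] for surname in totals}


scores = affluence_by_surname(SURNAME_COUNTS, WARD_PRICE)
for surname, score in scores.items():
    print(f"{surname}: {score:.0f}")
```

A score like this only captures correlation at the ward level; it says nothing certain about any individual with that surname.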

A3. Ethnicity Prediction

“Rishabh Srivastava” (my name) is highly likely to be that of a North Indian, while a name like “Ronojoy Sen” is highly likely to be that of someone who is ethnically Bengali.

A4. Gender Estimation

Cleaning the electoral rolls also gave us a large amount of labeled gender data, which we used for training. It was fascinating to see that in some cases the first name is less indicative of a user's gender than the last name. For example, “Harmanpreet Kaur” is highly likely to be female, while “Harmanpreet Singh” is highly likely to be male.
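
To make the “Harmanpreet” example concrete, here is a minimal count-based sketch, not the trained model itself. It estimates the probability that a name is female from labeled (first name, last name, gender) rows, using the full name when enough examples exist and backing off to the first name alone otherwise. The toy rows, the `min_count` threshold and the function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical labeled rows (first_name, last_name, gender); NOT real data.
LABELED = (
    [("harmanpreet", "kaur", "F")] * 3
    + [("harmanpreet", "singh", "M")] * 3
    + [("rishabh", "srivastava", "M")] * 2
)


def train(rows):
    """Count gender labels per full name and per first name."""
    full, first = Counter(), Counter()
    for first_name, last_name, gender in rows:
        full[(first_name, last_name, gender)] += 1
        first[(first_name, gender)] += 1
    return full, first


def p_female(name, full, first, min_count=2):
    """Estimate P(female | name), backing off from full name to first name."""
    parts = name.lower().split()
    if not parts:
        return 0.5  # no information at all
    first_name, last_name = parts[0], parts[-1]

    # Use the full name when we have seen it often enough.
    f, m = full[(first_name, last_name, "F")], full[(first_name, last_name, "M")]
    if f + m >= min_count:
        return f / (f + m)

    # Otherwise fall back to the first name alone.
    f, m = first[(first_name, "F")], first[(first_name, "M")]
    if f + m > 0:
        return f / (f + m)
    return 0.5  # unseen name: uninformative prior


full_counts, first_counts = train(LABELED)
for example in ["Harmanpreet Kaur", "Harmanpreet Singh", "Harmanpreet Gill"]:
    print(example, p_female(example, full_counts, first_counts))
```

In this toy data the first name “Harmanpreet” is evenly split on its own, so only the last name separates the two cases. A fuller version might also back off to the last name alone.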

B. Applications

We built these tools mostly as a fun learning project, but have since learnt that they have a number of commercially useful applications.

B1. Measuring audience demographics and affluence

Using names can help brands measure the demographics and affluence of their audience, as well as that of the publishers they advertise with. Similarly, publishers can use names to find out what kind of stories attract what kind of audiences.

Affluence Index of audiences of various Facebook pages from May to July 2017
HT Stories that attracted disproportionately more women from May to July 2017
HT Stories that attracted disproportionately more men from May to July 2017

B2. Measuring efficacy of customer acquisition strategies

Using online social channels for customer acquisition in India is fraught with danger. It is hard to control the kind of audience being acquired, and by the time the expected lifetime value of customers acquired from a campaign becomes clear, a lot of marketing dollars have already been spent.

Screenshot taken in 2017; some of the summary statistics have changed since then

B3. Targeting subsets of users with offers and recommendations

Name analytics can also be used by ecommerce and content websites to target specific demographics with specific campaigns. For example, Marathi users can be targeted with campaigns linked to Ganesh Chaturthi. Similarly, Bengali users can be targeted with campaigns linked to Durga Puja.
