Never Ask a Woman Her Age?

Testing the Veracity of a Classic Adage

Joydeep Chatterjee
The Haven
2 min readJun 7, 2023

--

Source: Pexels (Anastasiya Lobanovskaya)

A social development and social justice oriented telecommunications company known as Viamo was conducting a data analysis of the demographics of users of their Interactive Voice Response (IVR) service in the country of Mali. Unfortunately, almost half of all survey respondents refused to divulge their age and almost the same amount of that refused to divulge their gender. This investigation was conducted to determine whether there is a strong correlation between gender and reluctancy to reveal age, as is commonly believed.

from google.cloud import bigquery

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = 'viamo-datakind-19b12e3872f5.json'
bq_client = bigquery.Client()

pf = ParquetFile('Viamo_user_data_Mali.parquet')
df = pf.to_pandas()

(df.isnull().sum()/len(df)*100).sort_values(ascending=False)

The result of this code block is the following output:

age                        49.83
gender 43.39
second_fav_topic 28.03
second_fav_theme 13.01
subscriber_id 0.00
fav_topic_numeric 0.00
second_fav_theme_numeric 0.00
fav_theme_numeric 0.00
fav_topic 0.00
age_numeric 0.00
fav_theme 0.00
gender_numeric 0.00
median_call_duration 0.00
n_topics 0.00
n_themes 0.00
n_calls 0.00
second_fav_topic_numeric 0.00
dtype: float64

Testing the hypothesis whether women are more reluctant to admit age, despite the survey being anonymous:

# Percentage of men who decline to state their age:
age_shy_xy = [len(df.loc[(df.gender=='male') & (df.age.isna())]),
len(df.loc[df.gender=='male']) - len(df.loc[(df.gender=='male') & (df.age.isna())])]

# Percentage of women who decline to state their age:
age_shy_xx = [len(df.loc[(df.gender=='female') & (df.age.isna())]),
len(df.loc[df.gender=='female']) - len(df.loc[(df.gender=='female') & (df.age.isna())])]

plt.figure(figsize=(12,6))

plt.subplot(1, 2, 1)
plt.pie(age_shy_xy, colors=['#c44e52','#55a868'], labels = ["No", "Yes"], autopct='%.0f%%')
plt.title('Portion of Malinese Men Admitting Age')

plt.subplot(1, 2, 2)
plt.pie(age_shy_xx, colors=['#c44e52','#55a868'], labels = ["No", "Yes"], autopct='%.0f%%')
plt.title('Portion of Malinese Women Admitting Age')

It can be proven true, but only by a hair.

The original source code for producing the visualizations in this article can be found here.

--

--

Joydeep Chatterjee
The Haven

Creative data scientist passionate about applying machine learning to physical processes, logistics, and business activities.