What is being posted on /r/progresspics? An initial analysis.

An initial analysis of the demographics and results reported by users posting on the /r/progresspics sub on www.reddit.com.


Initial data access: Python, reddit API via PRAW; data analysis: Python, pandas, NumPy; visualisation: seaborn.

Data source and sample size

Data was extracted from reddit on 25th Feb 2017. The necessary data (demographics, weights, time taken) is generally posted in a standard format in the titles on the sub; e.g.:

A custom algorithm was written to pull data from the top 1000 ‘hot’ posts. The algorithm is fairly good at making sense of what data is available, but to keep the dataset clean it discards any post title that lacks critical data or units, or that deviates too far from the usual posting format.
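The original parsing algorithm is not reproduced here. As a rough illustration of the idea, the sketch below assumes titles shaped like `M/27/5'10" [250lbs > 190lbs = 60lbs] (8 months)`; that layout, the regular expressions and the name `parse_title` are all assumptions for this example. Titles that do not fit, or that give a weight without a unit, are discarded by returning `None`.

```python
import re
from typing import Optional

LB_TO_KG = 0.45359237  # pounds to kilograms

# Assumed layout: gender / age / height [start > end ...] (duration)
TITLE_RE = re.compile(
    r"""\b(?P<gender>[MF])\s*/\s*(?P<age>\d{1,2})\s*/[^\[]*
        \[\s*(?P<start>\d+(?:\.\d+)?)\s*(?P<unit>lbs?|kgs?)
        \s*(?:>|to)\s*
        (?P<end>\d+(?:\.\d+)?)\s*(?:lbs?|kgs?)?
    """,
    re.IGNORECASE | re.VERBOSE,
)
TIME_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(week|month|year)s?", re.IGNORECASE)
MONTHS_PER = {"week": 12 / 52, "month": 1.0, "year": 12.0}


def parse_title(title: str) -> Optional[dict]:
    """Return the parsed fields, or None if the title lacks critical data."""
    m = TITLE_RE.search(title)
    if m is None:
        return None  # unusual format or missing unit: discard the post
    factor = LB_TO_KG if m["unit"].lower().startswith("lb") else 1.0
    # Look for a duration only after the weights, so "60lbs" is not misread.
    t = TIME_RE.search(title, m.end())
    months = float(t[1]) * MONTHS_PER[t[2].lower()] if t else None
    return {
        "gender": m["gender"].upper(),
        "age": int(m["age"]),
        "start_kg": float(m["start"]) * factor,
        "end_kg": float(m["end"]) * factor,
        "months": months,  # None when no time period was given
    }
```

Keeping the time period optional mirrors the two sample sizes reported below: a post can enter the main dataset without a duration, but cannot be used for time or rate analyses.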

A little cleaning was done in pandas to remove a couple of outliers that slipped through the algorithm. This yielded a dataset of 777 entries; once those without time data (required for the time and rate analyses) were also removed, 533 were left.
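The exact cleaning rules are not given, so the following is only a minimal sketch of this kind of step. The column names (`start_kg`, `end_kg`, `months`) and the plausibility bounds are hypothetical choices for illustration, not the values used in the analysis.

```python
import pandas as pd


def clean(df: pd.DataFrame, min_kg: float = 30.0, max_kg: float = 350.0) -> pd.DataFrame:
    """Drop rows whose weights fall outside an assumed plausible range."""
    plausible = (
        df["start_kg"].between(min_kg, max_kg)
        & df["end_kg"].between(min_kg, max_kg)
    )
    return df[plausible].reset_index(drop=True)


def with_time(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows that report a time period (needed for rate analyses)."""
    return df.dropna(subset=["months"]).reset_index(drop=True)
```

Applying `clean` first and `with_time` second gives the two nested datasets described above: the full cleaned set, and the smaller subset usable for time and rate analyses.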

Basic demographics

381 posts (49%) were from women; 396 posts (51%) were from men. The vast majority of the dataset (709 posts; 91%) reported weight loss, while the rest reported weight gain. (Defined simply: weight-losers posted a weight change ≤0kg; weight-gainers posted a weight change >0kg.)
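The loser/gainer definition and the percentage breakdown can be sketched in pandas as follows. The column names and both helper functions are assumptions for this example.

```python
import numpy as np
import pandas as pd


def add_change_and_status(df: pd.DataFrame) -> pd.DataFrame:
    """Add the weight change and a loser/gainer label to a copy of df."""
    out = df.copy()
    out["change_kg"] = out["end_kg"] - out["start_kg"]
    # A change of exactly 0 kg counts as a loser, matching the definition above.
    out["status"] = np.where(out["change_kg"] <= 0, "loser", "gainer")
    return out


def share_table(series: pd.Series) -> pd.DataFrame:
    """Counts and whole-number percentages for each category in a column."""
    counts = series.value_counts()
    return pd.DataFrame({"n": counts, "pct": (100 * counts / counts.sum()).round(0)})
```

Calling `share_table` on a gender column holding 381 `"F"` and 396 `"M"` values gives 49% and 51%, which matches the split reported above.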

Age distributions, split by gender and loser (top) / gainer (bottom) status. Looks like a fairly standard distribution, albeit with an interesting ‘double peak’. As expected, male weight-losers start off heavier (median 111.6kg) than women (89.5kg).

The time period that people report on in their posts is generally within 3 years, with a scattering of outliers:

If we look only at the first 18 months, we can see the monthly pattern (note that most people use months as their time unit; very few use weeks). People seem particularly motivated to post on anniversaries: 6 and 12 months are the biggest spikes for both genders on this plot.
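One way to tabulate the monthly pattern is to round each reported period to whole months and count posts per month within the 18-month window. This is a sketch under the assumption that periods are stored in a `months` column; `monthly_counts` is a hypothetical helper.

```python
import pandas as pd


def monthly_counts(df: pd.DataFrame, max_months: float = 18) -> pd.Series:
    """Number of posts per whole month, for periods up to max_months."""
    months = df["months"].dropna()
    months = months[months <= max_months].round().astype(int)
    return months.value_counts().sort_index()
```

Running this separately on each gender's rows would give the per-gender counts; spikes at 6 and 12 would then show up as the largest values in each result.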

Starting weight distributions, split by gender and loser (top) / gainer (bottom) status. Unsurprisingly, weight-losers start off heavier.

How quickly do people lose weight?

The distributions for the rate of weight loss appear similar; men have a slightly higher median rate of weight loss (3.03kg/month) than women (2.27kg/month).

However, remember that men start off considerably heavier. If we normalise the rate of weight loss by dividing each individual’s rate by their starting weight, we see very similar distributions. In this normalised distribution (not currently shown due to a very odd error with Medium) the medians are almost identical for women (0.0240 [kg/month]/kg) and men (0.0245 [kg/month]/kg).
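The rate and its normalised form follow directly from the definitions above: rate = (start weight − end weight) / months, and normalised rate = rate / start weight. A minimal pandas sketch, assuming hypothetical column names and rows that all have a non-zero time period:

```python
import pandas as pd


def add_rates(df: pd.DataFrame) -> pd.DataFrame:
    """Add absolute and starting-weight-normalised loss rates to a copy of df."""
    out = df.copy()
    # Positive values mean weight lost per month.
    out["rate_kg_per_month"] = (out["start_kg"] - out["end_kg"]) / out["months"]
    # Fraction of starting body weight lost per month: [kg/month]/kg.
    out["norm_rate"] = out["rate_kg_per_month"] / out["start_kg"]
    return out


def median_rates_by_gender(df: pd.DataFrame) -> pd.DataFrame:
    """Median absolute and normalised rates for each gender."""
    return df.groupby("gender")[["rate_kg_per_month", "norm_rate"]].median()
```

For example, someone going from 100kg to 88kg in 6 months has a rate of 2kg/month and a normalised rate of 0.02 [kg/month]/kg, i.e. about 2% of starting weight per month.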

If we look at weight-losers, there appears to be a correlation between a higher starting weight and a faster rate of weight loss. This correlation is very consistent between genders.
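The post does not say how the correlation was assessed, so the sketch below simply computes a Pearson coefficient per gender as one possible check. The column names and the helper `corr_by_gender` are assumptions for this example.

```python
import pandas as pd


def corr_by_gender(
    df: pd.DataFrame, x: str = "start_kg", y: str = "rate_kg_per_month"
) -> dict:
    """Pearson correlation between two columns, computed within each gender."""
    return {
        gender: float(group[x].corr(group[y]))
        for gender, group in df.groupby("gender")
    }
```

Similar coefficients for men and women would be consistent with the observation above, though a coefficient alone says nothing about whether the relationship is linear or driven by a few extreme posts.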

Effect of age?

There appears to be a possible trend towards older people posting about slightly greater weight loss:

However, the rate at which people lose weight seems to be independent of age:
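A simple way to check for an age effect numerically is to group posts into age bands and compare the median rate in each. The band edges, labels and column names below are hypothetical choices for illustration, not those used for the plots.

```python
import pandas as pd

# Assumed age bands; with integer ages, (20, 30] corresponds to 21-30.
AGE_BINS = [0, 20, 30, 40, 50, 120]
AGE_LABELS = ["<=20", "21-30", "31-40", "41-50", ">50"]


def median_rate_by_age(df: pd.DataFrame, col: str = "rate_kg_per_month") -> pd.Series:
    """Median of `col` within each age band; empty bands are omitted."""
    bands = pd.cut(df["age"], bins=AGE_BINS, labels=AGE_LABELS)
    return df.groupby(bands, observed=True)[col].median()
```

Roughly flat medians across the bands would agree with the rate being independent of age; passing a total weight change column as `col` instead would show whether older posters report larger overall losses.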