I gave my Tinder data to a professional data analyst

Published in The Startup · 13 min read · Feb 20, 2020

What I discovered from analysing 4,046 data points collected from 5 years of Tinder dating.

January 2014. I’m 21, a student in my final year at UAL and living in a basement studio flat in Camden. Bored one night, I decided to download Tinder and see what all the fuss was about. I was a bit late to the party; Tinder launched in 2012 and I had watched it grow in popularity without feeling the need to throw my hat into the ring. I put together a profile and matched with someone fairly quickly — he shot me a perfectly polite “Hello there :)” …and I ignored it completely (sorry, Joe).

January 2019. I’m 26, a Senior UX Designer living in a one-bed flat in Croydon. A lot has changed in 5 years: 3 major relationships, numerous first dates and hundreds of matches on Tinder. I had also racked up an impressive number of dating horror stories, from a guy sending me a photo of him crying when I broke it off after the second date, to a guy throwing a temper tantrum because I wasn’t charmed by him boasting about his “enormous penis”. I had become that girl with the terrible dating record, but at least it made for hilarious conversations at work. Here I realised that at my fingertips I had 5 years’ worth of data that, when collated, could reveal key insights into my Tinder usage. At the very least I was intrigued to see my matches quantified. Yeah, sex is fun and everything, but so is data, right?

So, how did I start this thing?

I found out that under GDPR you can request that Tinder send you a copy of your data, and this is where I thought to start. I wasn’t entirely sure what I would be looking at or how in-depth the data would be.

The data they sent through wasn’t actually all that helpful, but it did allow me to see accurately the number of matches I had on Tinder, which, interestingly, was a lot higher than the number displayed in the actual app. According to the data sent by Tinder I had matched with 405 people, but over the years I worked out that I had lost 157 of them. There were a few reasons for this: one of us had unmatched the other, or they had deleted or deactivated their Tinder account. I had previously removed matches if the interaction had been significantly negative, and a handful of people I had met and dated from the app were also now completely absent. These were totally reasonable decisions [made by either of us] at the time but extremely annoying from an entirely selfish data analysis perspective.

I was left with 248 matches. At this point, I should mention that I’m also attracted to women, but my male Tinder matches significantly outnumber my female ones. This is likely because I’ve always been anxious about exploring that side of my sexuality to the same extent. Because of this, I decided not to include female matches in this dataset, for now. I also decided to only collect data from matches that I had some level of interaction with, regardless of whether the conversation continued after the very first message. All things considered, I was left with 119 matches to comb through.

Considering the variables

I listed out all the variables that I thought would display the best range of information about my matches. I hoped that these would lead to the most interesting insights. These variables were:

Basic user information

When we matched (year)
Their first initial (so I could keep track)
Age (at the time of me collecting the data)
Job title/occupation
Nationality (if mentioned)

Profile specific information

How many photos they had
How long their bio was (# of words)
If they had their socials connected (Instagram, Spotify)
If they had an anthem (and if so, what genre)
If they had their education listed in their profile (yes/no)
If they were “looking for” something (because a huge amount of users really want to rob a bank with someone)

Appearance

Eye colour
Glasses (yes/no)
Hair colour
Hair length / style
Facial hairstyle
Notable modifications (tattoos, piercings, etc)

Our interactions

The first and last message
Who started the interaction (me/them)
How long did it last (# of messages)
Who ended the interaction (me/them)
If we exchanged numbers (yes/no)
If we went on a date (yes/no)

I also decided to record some keyword-based assumptions I had from looking at their Tinder profile.

Assumptions

Based on their style
Based on their personality traits
Based on their likes or interests

Lastly, I concluded the dataset by giving each match three ratings so that I could identify the feelings I had towards that individual.

Ratings

An overall match rating (how well I thought we would get on based on what I could gather about their lifestyle, personality, and interests)
An overall personality rating (how attractive I found their personality only)
A rating for attractiveness (simply put, how attracted I was to them based entirely on their appearance). Yes, this one made me feel like an absolute arsehole.

Considering the discrepancies

Early on I recognised some discrepancies in the information being collected.

A match from 2014 was very unlikely to have the same photos, profile, and information now as when I matched with them 5 years ago, which meant I couldn’t say for sure what prompted me to swipe right on them in the first place. Equally, interests and appearances can change, sometimes drastically, over that period of time. This prompted me to add a column to my dataset to record if I would swipe right on them again (yes/no).

I also realised that some of my matches were simply just friends that I had matched with for a bit of banter. Some matches were also made by my friends who have, on many occasions, taken over my Tinder account for a laugh (yes I’m looking at you, Scott). Whilst I had unmatched most of them, some still remained and weren’t users I would’ve swiped right on. Because of these outliers, I added another column to note if a match should (yes) or shouldn’t (no) be included in the findings.
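For the spreadsheet-inclined, the two extra columns described above could be sketched in Python roughly like this. The field names are hypothetical (the real column headings aren't shown here), and the sample rows are made up purely to exercise the filter.

```python
from dataclasses import dataclass


@dataclass
class MatchRecord:
    # Hypothetical field names; the actual spreadsheet columns are not shown in the article.
    initial: str
    year_matched: int
    would_swipe_again: bool     # the "would I swipe right again (yes/no)" column
    include_in_findings: bool   # the "include in the findings (yes/no)" column


def records_for_analysis(records):
    """Keep only the rows flagged for inclusion in the findings."""
    return [r for r in records if r.include_in_findings]


# Made-up rows for illustration only; not real data from the experiment.
sample = [
    MatchRecord("A", 2014, would_swipe_again=False, include_in_findings=True),
    MatchRecord("B", 2016, would_swipe_again=True, include_in_findings=True),
    MatchRecord("C", 2018, would_swipe_again=False, include_in_findings=False),
]

kept = records_for_analysis(sample)
print([r.initial for r in kept])
```

Keeping the exclusion as a flag, rather than deleting the row, means the outliers stay in the sheet and can be brought back in later.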

Enlisting the help of a professional analyst

I trialled collecting the data of 10 matches before I brought in the big guns: a friend of mine, data god Faizal Kassam. Faz is the Head Of Marketing Operations, Analytics, and Technology at Cirium and loves data so much that he jumped at the chance to help. He started by telling me what kind of things I should be aware of when collecting data. There were a lot of things I had never considered: using a rating of 1–5 is better than using a rating of 1–10, for example, and when marking down keyword-based data, you should separate the keywords with a comma and not note down more than a couple for each individual. Faz mentioned here that the aim was to not overload the dataset.

The monotonous reality of combing through each of the 119 matches to record their data in a spreadsheet dawned on me after I jotted down the first 10 and realised how long it took. I now understand why normal people don’t normally do this kind of thing. It took until June 2019 for me to finish collating 34 variables of data on 119 matches. That’s 4,046 pieces of information for us to play with.
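As a quick sanity check, the headline numbers quoted so far can be reproduced with a few lines of Python. The variable names are just labels chosen for this sketch.

```python
# Figures quoted in the article.
total_matches = 405        # matches listed in Tinder's data export
lost_matches = 157         # unmatched, deleted or deactivated accounts
analysed_matches = 119     # matches kept for the dataset
variables_per_match = 34   # columns recorded for each match

remaining_matches = total_matches - lost_matches
data_points = variables_per_match * analysed_matches

print(remaining_matches)  # 248
print(data_points)        # 4046
```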

Findings: The four models

Faz used the data to build four models that would not only allow us to observe insights that I would never have been able to see otherwise, but could also be used to predict the match success of future profiles.

The first model, and perhaps the most significant, was the Master model. This combines all the data to see what’s the most predictive. Faz discovered that for me, the most important variable when swiping through Tinder was actually style. Anything “alternative”, “stylish” or that highlighted the user’s sense of individuality immediately piqued my interest. He also found that if the user also had a job in the creative or music industry, then the match percentage was a ridiculous 91%.

If the user had a typically “corporate” job or their job was unknown, then their match rating would significantly depend on the number of words in their Tinder bio: 73% if their bio had 10 words or more vs 14% if it was shorter. Faz suggested that this probably meant I was receptive to a particular trait in users who wrote more than those who wrote less. I agreed: I always seem to respond better to matches who write more information about themselves as they seem more open and interesting.

If their style was considered “average” or even “professional”, the match rating was dependent on how attractive I found them: a user with a 4/5 rating or higher was an 80% match vs 14% for users ranked 3/5 or lower.
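Read together, the Master model findings behave like a small decision tree. Below is a minimal sketch of that reading in Python, not Faz's actual model. It assumes the job and bio rules sit under the “alternative” style branch, which isn't stated explicitly. The category labels, the function name and the `None` return for combinations with no reported figure are all placeholders for illustration.

```python
from typing import Optional

# Hypothetical category labels; the real dataset's wording is not shown in the article.
INDIVIDUAL_STYLES = {"alternative", "stylish", "individual"}
PLAIN_STYLES = {"average", "professional"}
CREATIVE_JOBS = {"creative", "music"}
CORPORATE_OR_UNKNOWN = {"corporate", "unknown"}


def master_model_match_pct(style: str, job: str, bio_words: int,
                           attractiveness: int) -> Optional[int]:
    """Return the match percentage reported for this combination, if any."""
    if style in INDIVIDUAL_STYLES:
        if job in CREATIVE_JOBS:
            return 91
        if job in CORPORATE_OR_UNKNOWN:
            # Assumption: bios of 10 words or more fall in the higher bracket.
            return 73 if bio_words >= 10 else 14
        return None  # no figure reported for other job types
    if style in PLAIN_STYLES:
        return 80 if attractiveness >= 4 else 14
    return None  # no figure reported for other styles


print(master_model_match_pct("alternative", "music", bio_words=0, attractiveness=3))
```

Returning `None` where no figure was reported keeps the sketch from inventing percentages the analysis never produced.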

Before sending these results through to me, Faz sent me a message asking what I felt was more important to me in a match: attractiveness or personality? I didn’t even hesitate. “Personality,” I said. “100%.”

The Attractiveness vs Personality Model

“Mate, don’t lie to me, I know all your secrets”

“No, really! Personality is definitely more important to me, I swear! Do you really think — ”

“ — The data shows that attractiveness is 4X more important to you than personality”

Well, shit.

This he found in the Attractiveness vs Personality model, which only considers attractiveness and personality ratings to understand which is more important. If the user had an attractiveness rating of 4/5 or higher, it was an 86% match. If they had an attractiveness rating of 3/5 or lower but a personality rating of 4/5 or higher, it was 67%. A rating of 3/5 or lower on both attractiveness and personality resulted in only a 19% match.
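The three outcomes above can be written as a tiny lookup. This is only a sketch of the reported splits, assuming whole-number ratings on the 1–5 scale; the function name is made up for the example.

```python
def attractiveness_vs_personality_pct(attractiveness: int, personality: int) -> int:
    """Map the two ratings to the match percentage reported for that bracket."""
    for rating in (attractiveness, personality):
        if not 1 <= rating <= 5:
            raise ValueError("ratings are expected on the 1-5 scale")
    if attractiveness >= 4:
        return 86  # attractiveness alone is enough for the top bracket
    if personality >= 4:
        return 67  # lower attractiveness rescued by a high personality rating
    return 19      # 3/5 or lower on both


print(attractiveness_vs_personality_pct(attractiveness=3, personality=5))
```

Note that attractiveness is checked first, so personality only comes into play when the attractiveness rating is 3/5 or lower.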

What was really interesting was that in the Preference model he observed a strong correlation between tattoos and how attractive I found a user. The Preference model also revealed I had a strong preference for blue eyes (73% vs 36%). All other variables remained the same.

The last model was the Profile specific model, which assumed I had no preferences and instead focussed entirely on what I’m attracted to in a user’s profile. Again, we saw the same correlation between longer Tinder bios and higher match percentages. Bios of 30 or more words were a 69% match. If a user had a bio of fewer than 30 words but had between 4 and 6 photos it was 52%, vs 25% if they had any other number of photos.
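The Profile specific splits follow the same pattern, this time using only what's visible on a profile. This is a hedged sketch with a hypothetical function name, not the model itself.

```python
def profile_model_pct(bio_words: int, photo_count: int) -> int:
    """Map bio length and photo count to the reported match percentage."""
    if bio_words >= 30:
        return 69  # long bios match well regardless of photo count
    if 4 <= photo_count <= 6:
        return 52  # short bio, but between 4 and 6 photos
    return 25      # short bio and any other number of photos


print(profile_model_pct(bio_words=12, photo_count=5))
```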

Findings: Drilling into the data

I decided to isolate all users in the dataset with an attractiveness rating of 4/5 or higher to see if there were any other traits that I was significantly attracted to. This whittled my 119-person dataset down to only 37. I then analysed the physical traits of those 37 people. Obviously, a lot of the results echoed what Faz had already told me: that I had a strong preference for blue eyes (35% of users with an attractiveness rating of 4/5 or higher had blue eyes) and tattoos (40%). Only 28% of users had no modifications at all.

Most frequent physical attributes of matches with a 4/5 or higher attractiveness rating: Eyes and Tattoos

Most frequent physical attributes of matches with a 4/5 or higher attractiveness rating: Hair - Colour and Length

Most frequent physical attributes of matches with a 4/5 or higher attractiveness rating: Facial Hair

There was also an overwhelming preference for short beards (73%) over other facial hair types. I defined “short beards” as any facial hair that’s more than stubble but less than your typical lumberjack. Brunette was the most preferred hair colour and long the most preferred hair length (long-haired brunettes were the strongest preference at 38%). I also noted an almost equal preference for “styled hair”: in short hair this was defined as any haircut with a distinctive style, and in long hair it meant users with dreadlocks.

When relaying these results back to my colleagues, one of them piped up with something I really hadn’t noticed before:

“From what you’ve described… you do realise you’re just attracted to yourself with a beard, right?”

Oh, brilliant. Cheers Sean. Around 50 hours of work and all I’m really left with is the earth-shattering, horrible realisation that I’m so narcissistic I’m mostly attracted to men that vaguely resemble myself.

Jokes aside, it prompted me to do a bit of research on this theory and I came across some studies that highlighted how human beings tend to be attracted to similarity, both in friendships and relationships. Without making this entirely Freudian, I found this quote in a Medical Daily article on the topic:

“This preference for likeness is so strong, that we even tend to choose partners who physically resemble ourselves or our parents”.

Yikes.

It wasn’t just in physical preferences where I observed this trend. Looking at the personality assumptions I had made about matches with a personality rating of 4/5 or higher (34 people), the list was almost entirely made up of attributes that I always wanted people to describe me as having. OK, maybe not ‘cynical’, but I do want people to think of me as witty (assumed in 38% of matches), adventurous and sociable (both 26%), fun (23%), creative (20%) and kind (17%).

The most frequently **assumed** traits of matches with a 4/5 or higher personality rating

No comment on the “kinky” trait.

I also isolated all users with a match rating of 4/5 or higher and this left me with 40 users. It wasn’t at all surprising to me that music was the most commonly assumed interest, with over 57% of matches indicating that this could be or was a key interest of theirs. Music is a huge part of my life, despite the fact that I can barely get out a single tune on a ukulele, so it made sense that I would be attracted to people who shared this interest. Travel and alcohol (worrying?) were also frequently assumed interests of users with a high match rating.

So basically, I have an attraction to touring musicians who like a drink?

Most frequent interests assumed in matches with a 4/5 or higher match rating.

Speaking of musicians, it was previously observed in both the Master model and Preference model that I was strongly attracted to users that worked in either the creative or music industry. I wanted to see what the breakdown actually looked like. Out of the 40 users who all had a match rating of 4/5 or higher, 40% worked in the creative industry. The most common occupation in that industry was “designer” (yes, this included one UX designer!), and there were also a healthy number of directors, producers, and photographers. The most common occupation across all matches was “musician”.

Of course it was.

The occupations of matches with a 4/5 or higher match rating. Organised by industry.

Findings: Looking at the trends

Finally, Faz and I looked at the differences between how I used Tinder in 2014 vs how I used it 5 years later. Again, I wasn’t surprised to see that as time went on the percentage of users I would swipe right on again increased (from 18% to 62%). What was also interesting was that the proportion of users with an ‘alternative style’ also increased (from 32% to 62%), which may suggest that I’ve become more attuned to what I’m attracted to in the last 5 years.

Final thoughts

This overly self-indulgent experiment was never really about me trying to find out why I’ve had such awful dating luck, nor was it about trying to find a “perfect match”. If I’m really honest, I just thought it would be fun, maybe interesting, but probably only if you’re me. I did discover some neat little insights though:

The data overwhelmingly indicated that the most important variable to me is “style” (more accurately, an “alternative” or “individual” style). Pairing this variable with a job in the creative or music industry results in the highest possible match rating.
The data may also indicate and support the theory that I’m attracted to self-similarity when choosing a match, not only in physicality but also in interests, career, and values.
The data highlighted obvious physical preferences. However, I would argue that these only further the first two insights. I may have a preference for blue eyes only because I have blue eyes myself. I may have a preference for tattoos only because they usually identify an “alternative” or “individual” style.
Tattoos and attractiveness are highly correlated, which may suggest that I base physical attractiveness largely on style and individuality.
I tend to be more attracted to the profiles of users who write more words in their Tinder bio.
The data indicates that in the last 5 years I’ve become more attuned to what I’m attracted to and this attraction has become more specific over time.
And finally, musicians are my weakness. I am shook.

However, these insights are mostly just indicative of an initial and quite superficial attraction. The kind you have on dating apps or even in a bar or walking down the street. I’ve fallen in love with and had plenty of incredible relationships with people that didn’t possess a majority of these “preferred attributes”. I’ve also had extremely toxic relationships with people that — by these results — should’ve been “a perfect match”.

What I’m trying to say is, these results will never be a true indication of the success of a connection over real, human interaction because a genuine connection is not something that can be measured from a spreadsheet.

Burning questions? Shoot me a message on Instagram, LinkedIn or contact me through www.mollyindia.co.uk

*Data taken from matches obtained between January 2014 and February 2019

*A massive thanks and huge credit to Faizal Kassam for all his help with this experiment. For anyone wondering, yes he did run himself through his own model and conveniently came out as a 73% match.