OkCupid’s Data Just Doesn’t Match

A Response to OkTrends’ Conclusions on User Behavior

Asher Snyder
Jul 29, 2014

As a long-time member of OkCupid and a recent co-founder of Mesh, I was excited to see OkCupid’s popular blog OkTrends return with its newest entry after three years of radio silence, and I was truly hoping for some insightful conclusions. On closer examination, however, each of the experiments is deeply flawed, which leads to faulty conclusions. Worse, reputable publications simply repurposed the content without looking more deeply into Rudder’s experiments or questioning whether he actually proved what he claimed to prove.

Let’s take a look and break down each experiment:

Experiment 1: Love is blind, or should be

In this experiment, Rudder celebrated the launch of their new app CrazyBlindDate, a drastically different approach that didn’t use pictures at all, by removing OkCupid’s pictures for 7 hours. They dubbed this Tuesday “Love is Blind Day” and reported that during those 7 hours:

  • people responded to first messages 44% more often
  • conversations went deeper
  • contact details were exchanged more quickly
  • overall OkCupid worked better

This is very misleading. OkCupid ONLY worked better for those who chose to stay on a dating website with no photos. As OkCupid’s own numbers showed, removing photos reduced site metrics significantly, leaving only 16% of typical usage.

So this essentially voids the experiment for 84% of OkCupid’s users. Only those who didn’t care as much about photos were left to use the site, which we might expect to lead to the above results.
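
To make the self-selection argument concrete, here is a minimal sketch in Python. Apart from the 16% retention figure cited above, every number in it is a made-up assumption: each simulated user gets a hypothetical `photo_weight` between 0 and 1, their chance of replying to a first message is assumed to fall as that weight rises, and nobody’s behavior changes when photos disappear. The only thing that changes is who stays on the site.

```python
import random

def simulate(n_users=100_000, stay_fraction=0.16, seed=0):
    """Toy model: compare reply rates for all users vs. those who stay.

    All parameters are illustrative assumptions, not OkCupid data.
    """
    rng = random.Random(seed)
    # photo_weight: how much a (hypothetical) user cares about photos, 0..1
    users = [rng.random() for _ in range(n_users)]

    def reply_prob(photo_weight):
        # Assumed: users who care less about photos reply more often.
        return 0.45 - 0.30 * photo_weight

    def reply_rate(group):
        replies = sum(rng.random() < reply_prob(w) for w in group)
        return replies / len(group)

    # Normal day: everybody is on the site.
    normal_rate = reply_rate(users)

    # "Blind" day: only the users who care least about photos stay.
    stayers = sorted(users)[: int(n_users * stay_fraction)]
    blind_rate = reply_rate(stayers)
    return normal_rate, blind_rate

normal_rate, blind_rate = simulate()
print(f"normal day reply rate: {normal_rate:.3f}")
print(f"blind day reply rate:  {blind_rate:.3f}")
print(f"relative lift:         {blind_rate / normal_rate - 1:.0%}")
```

Under these assumptions the blind-day reply rate comes out higher even though no individual user behaves any differently, which is why a jump in the response rate on its own can’t tell us that hiding photos made the site work better.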

Experiment 2: What’s a picture worth?

Rudder doesn’t acknowledge the problems in OkCupid’s interface. Many books have been written on interaction design. My favorite is “The Design of Everyday Things”, which likes to beat us over the head with the classic example of the misused door handle: when we see a vertical handle our instinct is to pull, and when we see a horizontal handle our instinct is to push. The same is true of a website’s interface.

As a longtime member of OkCupid, I remember this screen. It was part of the rapid-fire Quickmatch game, where one had to score both personality and looks in order to move forward. Having two star ratings, one for looks and another for personality, right next to the big photo was confusing to the user. Nobody is going to take the time to read and truly reflect on personality when playing Quickmatch (which is still a problem with Quickmatch today), as the point of the game is to go through profiles quickly. So the raters, as Rudder points out, were only able to measure looks. The later example, in which someone with no profile information received the same score for personality as for looks, also makes sense given the interface. A user filling out two star raters next to a big photo isn’t going to penalize the person they just primed themselves to see as a 4/5 with a lesser personality score, even if the profile is blank. They’re more focused on getting on with it and moving forward. The data would be more useful if it included the “can’t tell” responses, which were excluded because the chart only includes data from those who selected stars in both raters.

The same is true in the opposite case. It’s cognitively dissonant to rate someone low on looks and then suggest they have a better personality; contradicting yourself like that would make you feel uneasy. So naturally the two scores align. I don’t think the experiment allows us to conclude that the profile didn’t matter, or that a personality star rating placed right under the looks star rating is an accurate arbiter of personality. Their further attempt to treat “rate their profile” as a profile rating, when it’s clearly 5 big stars next to photos, is also an unreliable measure.

The fact of the matter is that they never had a correct tool to measure personality, as all their attempts violate the basic principles of design. Touting this as a “truth”, or, as some publications did, saying “your profile doesn’t matter”, damages online dating in general. It might make users think their profiles don’t matter, leading to fewer profiles being filled out, which leads to less information and ultimately more crapshoots.

The ability to get a sense of someone is paramount to whether or not you think you’ll have a positive date experience with them and actually have a good time, versus just going out on a date. In IAC’s (owner of OkCupid, Match, Tinder, etc.) world it’s just about getting you out on the date, leading to the issues outlined in the New York Observer’s excellent piece.

Experiment 3: The Power of Suggestion

The most egregious example is Rudder’s last experiment, where they attempt to tie the match percentage to “success” and “compatibility”. First, Rudder’s definition of success is flawed: it is a four-message conversation. Let’s think about that for a minute. On a dating site, what might the four messages most typically be?

1. You: Hey, I really liked ______, ________. It’s always interesting how ___________________.

2. Them: Haha, that’s true. I noticed you ___________________.

3. You: That’s funny. I think we should grab a drink sometime, free at all this week?

4. Them: That would be great, let’s do it. My # is _______________

In the above situation, which is typical of most non-vulgar dating site conversations, there is no “success”. There is only “we got them on a date”, which may possibly lead to success. So no matter what the experiment yields, we wouldn’t be able to make any real assumptions about compatibility or success from the conversation.

Say we put that aside and just go with it. In their experiment they took bad matches (30% match) and told them they were exceptionally good (90% match). Obviously the users sent more messages, in the same way that if you see a restaurant on Yelp with five stars, you’re far more likely to go to it because you trust the Yelp rating system. The same is true of OkCupid’s match %. Here’s where it gets all sorts of wrong. They took the analysis further and asked: does the displayed match % cause people to actually like each other? I would stop right there, as they have no test in place to see whether or not the people actually like each other. They only know whether those people exchanged four messages, and as you can see from the above sample conversation, when you’re primed to believe someone is a good match, there’s very little that would prevent you from having that conversation. Similarly, if you’re told you’re a bad match, you’re less likely to be forgiving and more likely to lose interest in favor of someone else in your inbox who’s a higher match.

In conclusion, the above experiment is not an actual indicator of whether or not you “like” someone. Rather, it’s an indicator of how much trust in the match percentage carries you through a basic conversation that gets to a date. Obviously once on the date, your compatibility will become all too real.
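
The gap between “exchanged four messages” and “actually like each other” can be sketched with a toy simulation in Python. It assumes, purely for illustration, that each pair has a hidden `true_compat` score, that the chance of reaching four messages depends only on the match percentage shown to them (`shown`), and that whether they would like each other on a date depends only on `true_compat`. None of these numbers or names come from OkCupid.

```python
import random

def run(shown, true_compat, n_pairs=50_000, seed=1):
    """Return (four_message_rate, liked_rate) for a toy population.

    shown:       match percentage displayed to the pair (0..1)
    true_compat: hidden compatibility of the pair (0..1)
    Both response models below are illustrative assumptions.
    """
    rng = random.Random(seed)
    four_msgs = 0
    liked = 0
    for _ in range(n_pairs):
        # Assumed: trust in the displayed number drives the short chat.
        if rng.random() < 0.10 + 0.10 * shown:
            four_msgs += 1
        # Assumed: liking each other depends only on real compatibility.
        if rng.random() < true_compat:
            liked += 1
    return four_msgs / n_pairs, liked / n_pairs

# Bad matches (30%) shown honestly vs. shown as 90%.
honest = run(shown=0.30, true_compat=0.30)
inflated = run(shown=0.90, true_compat=0.30)

print("shown 30%%: four-message rate %.3f, liked rate %.3f" % honest)
print("shown 90%%: four-message rate %.3f, liked rate %.3f" % inflated)
```

In this toy world the four-message rate rises with the displayed number while the liked rate stays the same, and an observer who only counts messages has no way to tell the difference.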

Asher Snyder

Co-founder of Mesh, http://t.co/07TECQSkbF, and Forward Thinker. Former Director @ AppNexus and Co-founder of NOLOH.