Algorithms are bad at people

I had some fun with machine learning last year. Essentially I had a lot of engagement data for a group of customers and I wanted to see if I could use it to predict which way they’d fall on a YES/NO decision.

It turns out I was terrible at working out who would be a YES, but much better at working out who would be a NO. Knowing this, I could adjust the aggregate results and started getting more realistic predictions. I got some very right! I am a data scientist magician. I got some a little off. Oops?

In the end I settled on a rough range I could be reasonably confident in, but it was too wide to make any real decisions on. The vague result I was coming up with (while neat to pull from raw data) was not really better than everyone else’s gut.
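For the curious, the mechanics aren’t exotic. Here’s a toy sketch of the general shape, assuming scikit-learn and invented numbers standing in for the engagement data (this is not the actual model): fit a classifier, check how trustworthy its YES and NO calls each are, and use the confident NOs to put a ceiling on the aggregate.

```python
# Toy sketch only: invented data stands in for the real engagement table,
# and the "ceiling" logic assumes the confident NO calls are actually right.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))   # stand-ins for engagement features
y = (X[:, 0] + rng.normal(size=1000) > 0.5).astype(int)  # 1 = YES, 0 = NO

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# The asymmetry: how often is a predicted NO really a NO,
# versus a predicted YES really a YES?
print("precision on NO :", precision_score(y_test, pred, pos_label=0))
print("precision on YES:", precision_score(y_test, pred, pos_label=1))

# If the NO calls are trustworthy, everyone the model is confident is a NO
# can be ruled out, which puts a ceiling on the overall YES rate.
confident_no = (model.predict_proba(X_test)[:, 1] < 0.2).sum()
print("ceiling on YES rate:", 1 - confident_no / len(y_test))
```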

I still think it was cool I built a gut, but from a business point of view it’s fairly pointless. We already had a lot of those.

There’s a school of thought that says what I was doing was absolutely on the right track: I just needed to gather more information. My machine gut would scale better than everyone else’s gut if I just found more firehoses to point at it.

I think this is wishful thinking. We really want machine learning to do the cool things for people-based analysis that we know it can do for more concrete subjects. So the fact that the data is almost universally non-existent, inaccessible or misleading is put to the side.

(Incidentally, I love this story about pigeons being used to process brain scans. If you ever get stuck in the past you can jump-start all kinds of research with a sufficiently large pigeon coop.)

There’s a famous story about Target predicting someone was pregnant from their purchase history:

“My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”
The manager didn’t have any idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologized and then called a few days later to apologize again.
On the phone, though, the father was somewhat abashed. “I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

Now if you think about it for a moment, this story is fairly suspect. Taboo sex, a father defending honour, a father shamed (by the MACHINE). When a story has features that make it especially shareable (sex, disgust, shame) we should be suspicious of it. If the odds say it didn’t reach you on its merits, chances are it has few merits.

And as it turns out, this story is very suspect.

Stories like this help construct the idea of data analysis as something that is closer to magic. We put the information in the box, turn it on and we know you better than you know yourself. In reality, predictive stuff is quite bad, quite a lot of the time. Here is Amazon trying to market to me:

Now I really like books. Amazon has all my card details on file. It has years of my book purchasing history. I have a device that can only display books bought from Amazon. I can click a single button and it will take my money and instantly give me a book on that device.

And Amazon is marketing the same book I don’t want to me three times.

Ever bought an oven (or similar, infrequent purchase)? Were you followed round the internet for weeks by an optimistic algorithm hoping this was just the start of your oven buying spree?

Think about the huge number of hours invested in making that incredibly complicated process of identification, auctioning and ad display happen. Think about how stupid the outcome is.

This article on the Facebook newsfeed team (whose job is to guess which of the things your friends have to say you’ll find interesting) has a number of telling details, but I liked this one:

Over the past several months, the social network has been running a test in which it shows some users the top post in their news feed alongside one other, lower-ranked post, asking them to pick the one they’d prefer to read. The result? The algorithm’s rankings correspond to the user’s preferences “sometimes,” Facebook acknowledges, declining to get more specific. When they don’t match up, the company says, that points to “an area for improvement”.

Personalisation like this is a hard trap to see when you’re falling into it. You start off with excellent data about people’s past engagement, but the instant you start using that information to shortcut the process you damage your own information-collecting system. People can no longer engage with the things you don’t show them. It’s cutting off your own legs on the grounds that if you weigh less you’ll run faster.

To get around this you’d have to do all sorts of clever tricks to avoid self-reinforcing data, justifying the entire team of clever people forever.
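One flavour of those tricks, as a toy sketch (the function and the numbers here are mine, not Facebook’s): every so often, deliberately show something other than the top-ranked item, so the engagement data keeps covering things the model would otherwise never surface.

```python
import random

def pick_item(ranked_items, epsilon=0.1):
    """ranked_items: candidate posts, best first according to the model."""
    if len(ranked_items) > 1 and random.random() < epsilon:
        # Exploration: surface something the ranking would normally bury,
        # purely so the feedback isn't only about the model's favourites.
        return random.choice(ranked_items[1:])
    # Exploitation: show the top-ranked item, as usual.
    return ranked_items[0]

feed = ["post_a", "post_b", "post_c", "post_d"]
print(pick_item(feed))
```

Even the toy version means deliberately serving things the model thinks are worse, forever, just to keep the data honest.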

But what’s the result? “Sometimes”.

This belief in algorithms is good if you’re someone like me who likes being paid to do cool things with data — but are (bad) predictive feeds and (bad) adtech suggestions actually worth building at all? How much work do you have to do (and commit to doing forever) before you get a marginally better “good enough” than you had before?

There are two assumptions that justify the creation of these systems:

  • If you have enough information, you can know things about people without asking.
  • You have enough information.

Never before in history has so much calculating power been available so cheaply. Think what we could do with that if we had the right data! That we often don’t have (and can’t get) the right data is usually waved away as an inconvenience, rather than a fatal flaw. You can’t clever your way out of not knowing enough.