Fairly Random Thoughts on Ashley Madison & the Swiftly Moving Line

This photo was super-generously licensed by Michael Vroegop and made available on Flickr under the Creative Commons Attribution 2.0 generic license. No changes were made to the photo.

AshleyMadison.com is an online personals service that enables people who want to have extramarital affairs to find each other and, presumably, to do sex things with each other. Its core value, as expressed through its branding and advertising, is discretion. One of the services it offers is that, for a fee, it will erase your account and all record of your account.

The Impact Team is an individual or group that targeted AshleyMadison and, according to Krebs on Security, downloaded the information on 37 million accounts. According to that article, the Impact Team is angry because the erase-your-account feature doesn’t really erase your account. It wants the company that controls AshleyMadison, Avid Life Media of Toronto, to shut down, or it will continue to release information on its users. Its core value appears to be really messing up Avid Life Media’s world.

This is a weird situation. A lot of lives are going to be changed if this information comes out and is publicly accessible.

This piece is just a riff, or what we used to call a “blog post.” I’m going to throw out a few things that I’ve read about and not draw any particular conclusions. Then I’m going to write about a technical idea for making databases more secure, one that might have helped AshleyMadison avoid this situation. This is not even a thinkpiece, because I don’t know what I think.

If you’d like to respond, please do.

Huge database hacks are now so common that there is a secondary industry of credit and identity-fraud protection services. It’s almost a form of theater by now. The typical sequence is:

  1. A massive company fails to secure your data, and someone hacks in and “steals” (i.e. copies) that data;
  2. The breach is discovered either by internal audit or when the data is released or sold;
  3. You get a letter saying that the massive company in (1) takes security very seriously and that they are giving you free credit monitoring for a period of time.

But this is a situation that is more like:

  1. A massive company fails to secure your data;
  2. You only see your kids on weekends.

If this data is released, it will be released at an interesting time. A few things have happened lately that are worth noting.

In 2013 there was the Snowden leak, when it was revealed that the National Security Agency had been creating enormous databases tracking the world’s electronic communication. The NSA appeared to make a distinction between “data,” i.e. what was said in the phone call, and “metadata,” i.e. who called whom and when they called. The NSA kept an enormous amount of metadata around. And also a lot of things that are just “data,” like emails. It’s basically a meaningless distinction. Metadata is a social construct.

In August 2014, there was “the Fappening,” when images of nude female celebrities taken with mobile phones were obtained by hackers and released into the world. (To “fap” is slang for “masturbate.”) There was a massive range of responses to this event, and many people suggested that it was morally wrong to look at the images; Jennifer Lawrence, whose images were released, said those who looked at the images were “perpetuating a sexual offense.” The messageboard Reddit became a major distribution point for the images, although later it shut them down over copyright issues.

In November 2014, Sony had its internal servers hacked, which resulted in a number of stories about Sony’s internal politics, a great deal of celebrity gossip, a notable article about the vaginal hygiene regimen used by the head of Sony, and a huge amount of soul-searching about whether it was ethical for publishers to use the publicly released hacked files in their stories. Some places said yes, some said no, and there was a great deal of talk about “the fruit of a poisoned tree”; i.e. is it ethical to use stolen, but publicly released, information in reporting? Many organizations used the Sony data in their reporting. Gawker, notably, went all-in on the Sony hack, and also produced interesting media criticism as a result.

In July 2015, Hacking Team, an Italian company that sold hacks, was hacked and its private information, which apparently had been poorly secured by weak passwords, was released. This company sells the ability to break into machines and spy on Internet services to countries like Sudan and Kazakhstan.

Also in July 2015, Gawker published an article about an executive at Condé Nast; this male executive tried to enter into a contract with a male sex worker, and for complex reasons that deal did not work out and the sex worker went public with the relationship. After much criticism about the ethics of outing, the newsworthiness of the subject, and the trustworthiness of the source, Gawker’s publisher, Nick Denton, pulled the story. Later, Tommy Craggs, Gawker Media’s Executive Editor, and Max Read, Gawker’s editor, resigned in protest.

Nick Denton’s note pointed out that the article was pulled not because of factual issues but because of changing mores:

We are proud of running stories that others shy away from, often to preserve relationships or access. But the line has moved.

“The line has moved” is a really intense thing to say when you are in the business of public information disclosure. The issue was not that the story was wrong, but that, as Denton said:

The point of this story was not in my view sufficient to offset the embarrassment to the subject and his family.

Hacked information released into the public is fair game for some publishers; naked pictures were fair game for giant messageboards like Reddit, but pulled for copyright (a little like Al Capone getting busted for tax evasion). A tip about a media executive who texted a sex worker results in a pulled story, but now, this morning, we learn there could be 37 million stories exactly like the one that was pulled.

I’m curious to see how news organizations react to those millions of potential affairs, should they come to light. But then again the reality is you don’t really need to “do journalism” with that data. You could automatically generate stories from the data and publish them, and let Google searches take care of the rest. Just write a program that creates salacious narratives using AshleyMadison data.

John Doe of 100 Mulberry Lane must have had enough of his marriage because at 10:30 PM on November 15, 2014, he joined AshleyMadison and he had some very hot and heavy interactions with a number of women, including Jane Doe and Susan Doe.

That’s the natural endgame on events like this — to just cut out the journalistic middleman and turn the scandal into new content. Hack ‘em all and let Google sort ‘em out.

A few years ago I was at a small technology conference and people were talking about privacy, and wondering whether privacy really matters or not. And the worst-case scenario that people kept bringing up was “being caught having an affair.” Except for one woman in the back of the room who finally got angry and said, “guys, I have a stalker.” And proceeded to talk about how an unhinged person had repeatedly violated her privacy and threatened her partner and herself, for years.

The point there is that for some reason related to modern human social dynamics, we tend to pay special attention to (especially men’s) sexual privacy and limited attention to other kinds of privacy. And so things that violate men’s sexual privacy are tremendously newsworthy. Other things not as much.

Thirty-seven million people is a pretty big line to move. That’s a bit more than 10 percent of the population of the United States and Canada (a number that seems awfully high, but who am I to doubt the word of the hackers). If we find out that ~10% of North America has signed up with a credit card to digitally do its dirt, that’s a lot of lines moving at once. Marriage counselors are about to make bank.

Since large companies create central points of access, privacy advocates often advocate for decentralization. Citibank, Facebook, and Google are centralized; the Web and BitTorrent and email are decentralized. Instead of going to one big virtualized mega-machine at Facebook.com, you would bounce around a big open web. This is a very long and difficult discussion. Decentralized and anonymized services are great and much better about preserving human autonomy and protecting freedoms, but also absolutely used for doing dirt (cf. Silk Road) and for sharing illegal materials like child pornography. Centralized services are incredibly convenient and generate great riches for the people who create them, but are absolutely at risk for every kind of hacking attack, and primed for interference by meddling governments (and not just our own government), and, basically, do a terrible job of keeping users’ private data private.

No one should offer up their private information to a giant centralized service that helps them achieve secret sex goals. But tens of millions of people apparently did. Because they were told, and believed, that their information would be handled securely. As a result, there are a lot of dudes looking in the mirror today and practicing the words, “I was just curious! I was just poking around!”

More than a decade ago I read a very solid, short book called Translucent Databases. It’s about avoiding exactly the sort of situation that Ashley Madison finds itself in. I wish more people would read it.

The basic idea of the book is not revolutionary, nor does it require any advanced understanding of cryptography. The book simply describes a set of best practices for encrypting personal information inside of databases so that, even if someone downloads everything in your database, all they have is a big mess of encrypted data. This doesn’t mean you can’t know anything at all about the data in the database. You can do all kinds of data analysis; you just won’t have any private information. You let people control their own passwords and you never even know their names, addresses, or any other information; instead, you use hashes to identify them. Here’s a description of how it works, from a 2002 article by Simson Garfinkel published by O’Reilly:

In Translucent Databases, Wayner extends this concept of hashing in new and important ways. For example, what if a police department needs to build a database of sexual-assault victims that lets them identify trends but hides personal information? You could use a translucent database where the first column is the hash of the victim’s name, and the second column is a hash of their full address, and the third column is a hash of their block and street. You can now group incidents together by grouping entries with identical block hashes; you can see if the incidents refer to the same person by checking to see if those hashes are different.
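To make that a little more concrete, here is a minimal sketch in Python of the three-column table Garfinkel describes. It is my own illustration, not code from Wayner's book or from the article. The names, addresses, `PEPPER`, `blind`, `make_row`, and `group_by_block` are all made up for the example. I have also swapped in a keyed hash (HMAC-SHA-256 with a secret kept outside the database) instead of a plain hash, on the assumption that plain hashes of guessable things like names are too easy to reverse by brute force.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret stored outside the database (an assumption of this sketch).
PEPPER = b"example-secret-kept-outside-the-database"


def blind(value: str) -> str:
    """Return a keyed SHA-256 digest of a value, after light normalization."""
    # Lowercase and collapse whitespace so "Jane Doe" and "jane  doe" match.
    normalized = " ".join(value.lower().split())
    return hmac.new(PEPPER, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def make_row(name: str, address: str, block: str) -> dict:
    """Build one database row that holds only digests, never the plain text."""
    return {
        "name_hash": blind(name),
        "address_hash": blind(address),
        "block_hash": blind(block),
    }


def group_by_block(rows: list) -> dict:
    """Group rows that share a block hash, without learning which block it is."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["block_hash"]].append(row)
    return dict(groups)


# Made-up incident reports; only the digests would ever be stored.
rows = [
    make_row("Jane Doe", "100 Mulberry Lane", "100 block of Mulberry Lane"),
    make_row("Susan Doe", "140 Mulberry Lane", "100 block of Mulberry Lane"),
    make_row("jane  doe", "100 Mulberry Lane", "100 block of Mulberry Lane"),
    make_row("Ann Roe", "7 Elm Street", "0 block of Elm Street"),
]

for block_hash, incidents in group_by_block(rows).items():
    print(block_hash[:12], "incidents:", len(incidents))

# Two rows refer to the same person exactly when their name hashes are equal.
print("rows 0 and 2 same person:", rows[0]["name_hash"] == rows[2]["name_hash"])
```

Someone who copies this table gets digests, not names or addresses, but you can still count incidents per block and spot repeat victims. The catch is that everything now depends on keeping that one secret somewhere the database thief can't reach.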

If you look at this article on Bloom filters, published here on Medium a few days ago, you’ll learn what hashing is and how it works. It’s not a crazy subject. The tools to do all this stuff are basically at hand. So why don’t more people do things this way? You’d think they would. It’s not expensive! It just takes planning.

Hashing is not a silver bullet; nothing is. People on Twitter have been taking me to task — for example:

I’ll be honest, I don’t know! It’s obviously hard to do things right and Wayner’s book (I’m going on memory and don’t have it on hand) takes this into account. But done carefully, that strategy would have made it far more difficult for hackers to extract tons and tons of sensitive information from the Ashley Madison database.

I’ve never built a translucent database-driven system because none of my clients have ever been the least bit interested. They want names, addresses, credit cards, and the like. But they don’t actually need a lot of that data to build a good web service. They need it for potential marketing purposes.

It’s not the users’ fault. Keeping your life secure is basically a part-time job at this point and even those of us who take it seriously are screwed in the long run. The whole infrastructure of web-service privacy is broken, as broken as the broken record that keeps saying “the whole infrastructure of web-service privacy is broken” over and over again.

So if getting hacked and doxxed is inevitable, and if the sex workers with whom we contract will leak our identities to Gawker, and if the line is moving so fast that it’s kind of…quivering…and if we share something private about ourselves, it could end up belonging to the world, and if access to our Skype calls can be sold for profit to the Sudanese government or what have you, then giving up any information that could materially harm your life, your marriage, or your career is basically insane, but we do it again and again because we don’t really have a choice. We do it because we want to put our money somewhere, participate in culture, share recipes, and have sex. We’re going to keep doing it. But we don’t actually have to be completely insecure about it. Nor do we have to go to a fully decentralized model. We could demand that large centralized services encrypt our stuff at the database level, and know that while there are still points of failure, one password won’t unlock tens of millions of others by default — which is how it works now.

In the meantime, people who make big web services can’t actually claim that they protect their users. They could do so, and give up some of their ability to market to those users, and it would serve everyone’s best interests in the long run. But the users don’t know there are better ways, the advertisers won’t like it, and no one likes to change how they do things. It sucks, but it’s also who we are, and it’s the web we, the people who make web services, choose to build.