Boy-o, is all of this embarrassing. People having affairs, databases getting pwned, governments griping, and also getting pwned, and a blame game that looks like the gun stand-off at the end of a Tarantino film.
We can’t know if something like translucent databases could have helped AshleyMadison.com without knowing how the hack happened. We can know that it would have helped with some hacks, and not with others. Database hashing (which is what a translucent database is) almost certainly would have helped a lot with the OPM data hack, which impacted something north of 21 million records. It wouldn’t have helped the NSA at all, since Edward Snowden walked out with material that was meant to be inside the security perimeter. On-the-fly disk encryption of NSA material might have helped, if the NSA felt they could function with more security perimeters than they did at the time. My guess is they do now.
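To make the translucent-database idea concrete, here is a minimal sketch (names and the salt value are hypothetical, not from any real system): the site stores a salted hash of the sensitive field instead of the field itself, so it can still answer "is this record here?" while a thief who dumps the table gets nothing readable.

```python
import hashlib

# Toy sketch of a translucent database: store a salted hash of the
# sensitive field (here, an email address) instead of the field itself.
SITE_SALT = b"per-deployment secret salt"  # hypothetical value

def opaque(email: str) -> str:
    """One-way token standing in for the email in the database."""
    return hashlib.sha256(SITE_SALT + email.lower().encode()).hexdigest()

records = {opaque("alice@example.com"), opaque("bob@example.com")}

# Membership checks work by hashing the query the same way...
assert opaque("alice@example.com") in records
# ...but the raw address never appears in the stored data.
assert "alice@example.com" not in records
```

A real deployment would use a deliberately slow or keyed hash, since plain salted SHA-256 over low-entropy data like email addresses can be brute-forced; this only shows the shape of the technique.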
A couple terms: hashing comes out of cryptography, but isn’t really crypto itself. There’s no way to decrypt a hash, per se, just a way to see if the data input matches the data on file, which is how passwords are (supposed to be) stored. It’s righteous good magic for the network, and everyone should know what it is, at least conceptually.
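Here's what "how passwords are supposed to be stored" looks like in practice, as a minimal sketch using Python's standard library: store a random salt plus a slow hash, never the password itself, and verify by recomputing.

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Store a random salt and a slow, salted hash -- never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Hash the attempt the same way and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("correct horse battery staple")
assert check_password("correct horse battery staple", salt, digest)
assert not check_password("hunter2", salt, digest)
```

The point of the slow hash (PBKDF2 here) is that even if the whole table leaks, guessing passwords against it is expensive.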
To explain what hashing is, and why it’s so vital to creating a safer net, I’m going to quote myself:
…it is a mathematical operation one can perform over a number of any size that gives exactly the same output with the same input, but wildly different output with any change in the input. A hash is a bit like applying a math puzzle to something. Imagine this: take any number n, multiply by 12345, cross off the last three digits, add 1, multiply by 4, dance a jig, subtract 543, drop any sign, square it, and print the first 5 digits. Doing all that to the humble 1 gives you 24108. Doing all of that to 2 gives you 19624. You get these figures every time reliably if you put in 1 or 2, but it’s hard to figure out what number anyone started with just by looking at the final number. It is much easier for either a human or a computer to do all those things to a number than to look at the output number and figure out what number was originally used. This means that by keeping a record of hashes, we can always check if something matches what we expect, or if it’s changed. This is used all over the network, to find things, to encrypt things and keep them safe, to create much of the veritas of our network experience.
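The math puzzle above translates directly into code, step for step, and you can check the two worked figures yourself:

```python
def jig_hash(n: int) -> int:
    """The toy 'math puzzle' hash from the quote, step by step."""
    x = n * 12345
    x = x // 1000           # cross off the last three digits
    x = x + 1               # add 1
    x = x * 4               # multiply by 4
    # (dance a jig -- no effect on the math)
    x = x - 543             # subtract 543
    x = abs(x)              # drop any sign
    x = x * x               # square it
    return int(str(x)[:5])  # print the first 5 digits

print(jig_hash(1))  # 24108
print(jig_hash(2))  # 19624
```

Same input, same output, every time, and yet nothing about 24108 tells you it started life as a 1. Real hash functions like SHA-256 are built on the same one-way principle, just with vastly more scrambling.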
Disk encryption uses reversible math to hide your data when it’s not in use; there’s a key that will decode the data so you can read it in plaintext. Data written to your hard drive or your camera’s SD card, or sitting idle on your phone, or sitting on Amazon’s servers doing nothing, is called data at rest. On-the-fly disk or database encryption decrypts only what you need to work on at the time, so that if someone grabs everything, they’ve only got a little bit that’s readable. There are schemes for working on encrypted data while it’s in use (homomorphic encryption), but it’s kind of castles in the sky for now. Instead, we encrypt data to transmit it, then decrypt it on either end to work with it. That encrypted transmission is called data in motion, and in the real world it’s mostly done with TLS/SSL. This form of security is a more solved problem, and there’s no excuse not to use it anymore, except that we have no other way of signing onto networks at airports.
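The key difference from hashing is that encryption is reversible with the key. A toy illustration, a one-time pad built from Python's standard library (real disk encryption uses ciphers like AES, this only shows the reversibility):

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the corresponding byte of the key."""
    return bytes(d ^ k for d, k in zip(data, key))

record = b"private data at rest"
key = secrets.token_bytes(len(record))  # one-time pad: random key as long as the data

ciphertext = xor_bytes(record, key)          # this is what sits on disk
assert ciphertext != record                  # unreadable without the key
assert xor_bytes(ciphertext, key) == record  # reversible: the key decrypts
```

A hash, by contrast, has no key and no way back; that one-way-ness is exactly why it suits passwords, and reversibility is exactly why encryption suits data you still need to read.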
Remember these terms, think of them when you use your screens: Data in Motion; Data at Rest.
The Cloud generally refers to any centralized server system storing other people’s private data. The cloud will never be as safe as decentralization, given good updating of individual machines, meaning better patching and timely application of those patches. Good updating is a big given, though, because right now people don’t patch their computers enough to keep their data very safe on them. Any individual server is probably not as secure as Google’s, but there’s a reason centralized services get robbed: that’s where the data is.
Everyone would be helped by more encryption, centralized and on the edge. Encryption gets you a lot in terms of security. It not only makes data unreadable, but tamper-resistant and verifiable as well. Why don’t we use it more? Because, Reasons.
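The "tamper-resistant and verifiable" part deserves a concrete sketch. An HMAC, here from Python's standard library with a made-up key and message, lets a receiver verify that a message came from someone holding the key and wasn't altered in transit:

```python
import hashlib
import hmac

key = b"shared secret key"  # hypothetical shared secret
message = b"transfer $10 to alice"

# Sender computes a tag over the message with the shared key.
tag = hmac.new(key, message, hashlib.sha256).digest()

# Receiver recomputes the tag; a match verifies origin and integrity.
assert hmac.compare_digest(
    hmac.new(key, message, hashlib.sha256).digest(), tag)

# Any tampering with the message changes the tag, so the check fails.
tampered = b"transfer $10 to mallory"
assert not hmac.compare_digest(
    hmac.new(key, tampered, hashlib.sha256).digest(), tag)
```

This is part of what TLS does for every connection: not just hiding the bytes, but proving nobody rewrote them along the way.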
Let’s delve into the crossover of technology and social mores that’s constantly polluting our news feeds.
1. Looking at leaked data: “Do Less Harm”
As Paul noted, it’s an interesting time for acts of journalism. Is looking at leaked data the same as endorsing the leak? That’s a personal call in many cases. The difficulties of judgement aren’t restricted to leaked data, given how much personal data is publicly available. If you can find someone’s home address in a public database, should you put it in an article? It’s almost a silly question, the answer is so obviously no. But shy of home addresses, things get murky fast. Data, leaked or not, needs to be assessed with journalistic discretion. Our social contract as journalists has a measure of quality to it: we don’t write stories merely to be clicked on. The difference is a conception of information integrity and public interest, which is decided, piece by piece, on a landscape of subtle grays. There are few hard and fast and universally agreeable lines on what data serves the public interest, but we can converge on some ethics.
Call it a Do Less Harm approach to information: emphasizing abuses of power over abuses by small actors, balancing the harm to the subjects against the needs of an informed public. Redaction gets involved here: we don’t need to know the names of individual agents to know the NSA is being abusive and make a call for accountability, but the names of Michael Hayden and Keith Alexander belong in the news as leadership of the abusive agency. Source protection is often a matter of the threats to sources. If my source is somewhere where laws are sane and followed, I don’t feel as strong a need to protect, redact, and monitor as I might if someone was at risk of American or Saudi Arabian forms of punishment.
The ethics of information in the public interest has never been easy, and always been contextual. An affair by a political enemy is always more relevant than an affair by your guy. I think personal lives are only relevant insofar as they describe some kind of secondary secret social system. A legislator doing a thing that is, or that she wishes to make, illegal under the presumption that the law about that thing should not be applied to a law maker is a clear case of public interest. Universality of the rule of law requires we all be under the same law, but I literally repeat myself.
That people will read about or want to know something isn’t a case of public interest, and even the people who want to read about something know that. When you’re clicking on the celebrity gossip link, it’s a guilty pleasure, because you know this whole thing is stupid and shouldn’t be happening.
Examining leaked data such as the Snowden files, Hacking Team, and HBGary was clearly in the public interest — all disclosed powerful entities and governments lying and abusing their power. The places where those leaks exposed the idiosyncrasies of the people doing the lying and abuse are harder.
Ashley Madison is more clear cut to me as a journalist: as no one serious is trying to make adultery illegal, I don’t need to know if anyone is cheating on their spouse. Whether their spouse should be able to search a database for their partner’s name is harder to nail down. No one needs a media org to host this data any more to expose it to the public, mooting our power as amplifiers to a degree. There’s no good answer to public access to leaked data. Making searching a public database illegal on the internet is as impracticable as turning jaywalking into a felony. On the other hand, in the words of the illimitable Stewart Brand, “we are as gods and might as well get good at it.” Perhaps we don’t search the database for our beloved not because we can’t, but because we shouldn’t. Maybe if I have to be an ethical and careful journalist, you have to be an ethical and careful spouse. With great power, etc. etc. etc..
So that leaves the sorry state of security, about which I am as much a broken record as anyone else. But I see it as a frustrated process of birthing an amazing new world and new humanity, linked together and wise in ways never possible before — that whole as gods thing. To have this amazing world of soft & hardware miracles that rewire societies and make us ever more knowledgeable and powerful creatures, we’re going to need to be able to count on our devices and our internet. Until then, we are building our cathedrals out of Jenga bricks. On a network as social as ours, security is a piece of reliability.
In a way, as a security journalist interested in government and corporate accountability, I have reasons to be conflicted about better security. People like Hacking Team’s customers lose from better security if Hacking Team can’t compromise their customers’ targets. I am pretty ok with that, given how often HT targets journalists, activists, and people who really shouldn’t be getting hacked by their governments. But we also lose the chance to catch them at it when they do. It’s as hard for governments to hide corruption these days as hiding cell phone nudes is for a teenager, which is giving us a rare window into the inner workings of power that usually takes decades after the fact to get. It’s a disruptive and unstable time for the autocrats and oligarchs who have flown under history’s radar, and we’re only beginning to suss out what that means. To maximize the effectiveness of this window into the inner workings of power, we must educate ourselves and each other as to how power gets abused. But to do anything about it, we need to be able to build things ourselves. If we are to build lasting institutions in a networked world, reliability must eventually trump being able to compromise bad actors.
So: Back to the Sisyphean Security Discussion
2. Any solution to the security problems we face will be multifaceted
Legal requirements, cultural education, and technical progress are going to all be part of a functional networked world, and its digital security.
On the legal front, we need many more measures like those in California’s SB 1386. This law was the start of hacking disclosures, and led to the hilarious prospect of California residents being told their data had been leaked while no one else would hear a thing. This situation became embarrassing for everyone involved, and thusly disclosure got better for the whole world — though not yet good enough. Detection and enforcement need to come in line with the concept of disclosure. I’d wager that to this day most intrusions are never detected, and most that are, never get disclosed.
We need some level of liability for software development or deployment. This is going to be a touchy and difficult process, but delaying it more doesn’t make it easier. The fact that a car maker can’t sell a car with faulty brakes but can sell a car where a remote hacker can disable the brakes entirely is insane. It is obvious that the answer to harmonizing such a situation is not going back to carmakers being allowed to sell cars with faulty brakes. As software becomes more vital infrastructure, we’re going to want it to be regulated more. No one wants fly-by-night companies contracting their dams and airplanes, and we’re all pretty happy with the idea that you can’t do heart surgery without being a doctor first. As software eats the world, it’s time legislators got more serious about software’s role in consumer safety and protection.
That’s never going to be easy with something as crazy and fantastic as the soft-hardware world of technological products and services, but then, it was never easy in the first place. But we were able to put safety and quality standards on cars, planes, drugs, medical services, the electrical grid, satellites, space programs, laboratories, factories, restaurants, construction materials, and body piercers, and we not only still have all those things, but pretty good versions of them. Good regulation would probably vastly improve banks, hedge funds, mines, deep water drilling, and of course, databases full of private information.
That none of this regulation should, as is being pushed now, require systems to become less secure with schemes like backdoors, golden keys, or magical key escrow shouldn’t even have to be said. But here we are again — if powerful people want to stop getting pwned, they’re not going to be as able to pwn other people. Nobody gets their own tech ecosystem.
We’re all also going to need to change our consumer behavior. Social media and John Oliver may not like it, but educating and cautioning people against online violations is not the same as telling women how to not get raped or blaming people for having their house broken into. It’s more akin to telling someone that if they don’t want to keep getting their bike stolen, they should stop storing it in a crackhouse.
This is how I want people to think of their online data: as stored in piles in the basement of a crackhouse run by a drug cartel for crack addicts to use. If you want to put your bike in that basement, you’re going to want some hella amazing locks on that bike.
Right now, the only thing protecting your bike in that crackhouse is the overwhelming stack of bikes on top of it.
There are real benefits to centralized services, and they aren’t going to go away, but neither are decentralized ones. We all need to make conscious decisions about the services we use and the risks they entail. We already do this all over our lives, we’re educated to it, it’s part of the stream of adult life all over the world. The problem is that, while we take 12 years to teach our kids regular literacy, we generally give them about two hours of digital literacy education.
Education has always been a vital part of consumer protection, and the internet is no different. The first line for digital literacy right now needs to be our schools. Kids should be learning about networks from a young age, and the basics of how computers work. This means teachers need to learn about these things, need to make it their business, if their business is still preparing children to be functional 21st century people. From there, kids will know how to demand a better network as consumers and political actors when they grow up.
There are some serious technical problems to overcome as well: creating secure building blocks, and emphasizing to the next generation of developers that usability is part of reliability and security. Why don’t more companies currently take advantage of encryption and hashing of their data at rest? Most of it comes down to two reasons: most programmers don’t really understand them, and they are expensive in compute time. Why don’t companies contribute to the code they use and make it better, preventing things like the multiple OpenSSL disasters? Because as institutions, they are idiots with the attention span of mayflies. Programmers, you are in a great place to demand more from your employers, to make your code something to be proud of. Yes, you need to learn more about security, but you also need help. Demand code audits, security reviews, and user testing, as well as all the tests you write now. Demand them, but don’t watch them. That will drive you crazy, believe me.
That working-on-data-without-decrypting-it thing would be pretty cool, but one of the reasons it hasn’t gotten further is that no one is paying anyone to build it for them. The biggest technical hurdle turns out to be: convincing companies that they need to lose a few percent of profit to creating a quality product. This is doable, but generally gets done through some combination of consumer revolt and regulation.
Let’s all stay embarrassed for now, and then push our slice of the problem, whether as journalists, consumers, programmers, governments, or corporations, in a better direction.