What can your ISP see? Facts to keep in mind as you consider the FCC’s new privacy proposal.

Recent debates about privacy have focused on iPhones and government surveillance. But earlier today, Chairman Tom Wheeler of the Federal Communications Commission (FCC) opened a new chapter: He released an outline of proposed privacy rules for Internet service providers (ISPs).

The proposal, which comes as the FCC implements its net neutrality order, makes this an opportune moment to step back and ask, “What can your ISP see when you go online anyway?”

ISPs — whether Comcast, Verizon, AT&T, T-Mobile, or another company — handle every bit of information that you send or receive on the Internet.

Your ISP can often see where you go on the Web, and the contents of your Internet traffic.

My colleagues and I at Upturn recently published a new paper, What ISPs Can See, that explains some of the technical nuances behind what ISPs can (and can’t) see. Of course, we don’t know exactly what ISPs are or aren’t doing, or what they might plan to do in the future. But we do feel that it’s important for everyone to understand what’s possible.

If you plan to read the FCC’s proposal, here are a few key technical takeaways to keep in mind.

Some web sites encrypt their traffic by default. But plenty of web sites still don’t — and it’s hard for many of them to make the switch.

When a web site enables encryption, you’ll see that comforting green lock icon in your address bar.

This means that your ISP can see far less about what you’re doing online — not nothing, but far less.

However, lots of websites still don’t use HTTPS at all, or don’t use it by default. For example, here’s how top sites in the areas of “health,” “news,” and “shopping” (as ranked by Alexa) stack up:

ISPs can easily see all of the information in the right column.

It will be difficult for many of these web sites to make the switch to HTTPS. One reason is that lots of them rely on “third party” companies to provide them with advertisements, analytics, embedded streaming video, and other useful services. For a web site to fully switch to HTTPS, every one of its third party partners must also already use HTTPS. But many third party services (which load in the background as you browse the Web) have not yet adopted HTTPS themselves.
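To make the dependency concrete, here is a minimal sketch in Python of the "every partner must already use HTTPS" check. The function names and the resource URLs are hypothetical, made up purely for illustration; a real audit would crawl the page and inspect everything it actually loads.

```python
from urllib.parse import urlparse

def blocking_third_parties(resource_urls):
    """Return the hosts of embedded resources that are not served over HTTPS."""
    return sorted({
        urlparse(url).hostname
        for url in resource_urls
        if urlparse(url).scheme != "https"
    })

def can_fully_switch(resource_urls):
    """A page can go fully HTTPS only if no embedded resource is plain HTTP."""
    return not blocking_third_parties(resource_urls)

# Hypothetical resources embedded in a news page (illustrative only).
resources = [
    "https://cdn.example-news.test/app.js",
    "http://ads.example-adnetwork.test/banner.js",
    "https://video.example-player.test/embed.js",
    "http://stats.example-analytics.test/pixel.gif",
]

blockers = blocking_third_parties(resources)
print(blockers)                     # the partners holding the site back
print(can_fully_switch(resources))  # False while any blocker remains
```

In this made-up case, a single ad network and a single analytics provider are enough to keep the whole page from being fully encrypted.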

Here’s a graph of the top 100 news websites, and their third-party partners:

Ad tracker HTTPS support rates on the Alexa top 100 news sites (via Citizen Lab)

Until those red bars become green, it’ll be tough for these web sites to fully encrypt, at least without dropping partners and potentially taking a hit to ad revenue.

But it’s not just popular websites that are behind the curve: Home devices can be too. Many Internet of Things (IoT) devices, such as smart thermostats, home voice integration systems, and other appliances, fail to encrypt at least some of the traffic that they send and receive.

There are some happier statistics. For example, Professor Peter Swire and his colleagues highlighted in a recent paper that about 70% of all downstream Internet traffic will be encrypted by the end of 2016. (Of course, this figure includes data hogs like Netflix and YouTube, which together account for almost half of all downstream Internet traffic in North America.)

But the total amount of encrypted data is a poor metric for privacy exposure.

Here’s why: watching the full Ultra HD stream of The Amazing Spider-Man could generate more than 40GB of traffic, while retrieving a single WebMD page about cancer generates less than 2MB. In this case, the WebMD page involves roughly 20,000 times less data than the movie, but it is likely far more sensitive.
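The arithmetic can be worked through in a few lines of Python. The 40GB and 2MB figures are the rough bounds mentioned above (using decimal units); the scenario in which the movie is encrypted and the page is not is an assumption added only to illustrate the point.

```python
# Rough sizes from the comparison above, in bytes (decimal units assumed).
movie_bytes = 40 * 10**9   # about 40 GB for the Ultra HD stream
page_bytes = 2 * 10**6     # about 2 MB for the single web page

# How many times larger the movie is than the page.
ratio = movie_bytes // page_bytes
print(ratio)

# Hypothetical scenario: the movie is encrypted, the health page is not.
# Measured by volume, almost everything looks "encrypted" ...
encrypted_share = movie_bytes / (movie_bytes + page_bytes)
print(f"{encrypted_share:.5%} of bytes encrypted")

# ... yet the one sensitive page view is fully exposed.
sensitive_views_exposed = 1
```

Under these assumptions, a user whose traffic is more than 99.99% encrypted by volume has still exposed the only page that really mattered.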

In short, today, ISPs can see lots of unencrypted data. This will likely be true for some time to come.

Furthermore, even when you see that green lock icon, your ISP can usually still see which sites you visit.

Even when you see that green lock icon, your ISP usually still knows where you’re going, at least at a high level. For example, if you visit Facebook (which encrypts traffic by default), your ISP will probably still know you’ve requested the domain “facebook.com,” even though it can’t see what you’re doing on the site. (It’s possible to hide the fact that you’re visiting Facebook entirely from your ISP by using a Virtual Private Network. But most people don’t use VPNs.)
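As a rough illustration, the sketch below splits a URL into the parts an ISP could plausibly observe and the parts encryption hides. It is a deliberately simplified model, not a description of what any ISP does: it assumes no VPN, assumes the domain leaks (for example, through a DNS lookup or the start of the encrypted connection), and ignores traffic analysis. The function name `isp_view` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit

def isp_view(url):
    """Simplified model of what a network operator can read for one request."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # Encrypted: assume only the domain is observable.
        return {
            "visible": {"host": parts.hostname},
            "hidden": {"path": parts.path, "query": parts.query},
        }
    # Unencrypted: the full address (and the page contents) are observable.
    return {
        "visible": {
            "host": parts.hostname,
            "path": parts.path,
            "query": parts.query,
        },
        "hidden": {},
    }

encrypted = isp_view("https://facebook.com/some/profile?ref=1")
unencrypted = isp_view("http://health.example.test/conditions/cancer?stage=2")

print(encrypted["visible"])    # just the domain
print(unencrypted["visible"])  # domain, page, and query string
```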

This is a lot like getting to see the phone numbers someone dials. When President Obama created a special advisory group to review government surveillance, the group found that “the record of every telephone call an individual makes or receives over the course of several years can reveal an enormous amount about that individual’s private life.” The same is true — and perhaps even more so — for the web sites that you visit.

A short string of domains can itself be very revealing. For example:

Now imagine that your ISP could collect every domain that you visit. It’s technically feasible.

More surprisingly, computer science researchers have shown that a network operator can learn quite a bit about the contents of encrypted traffic without decrypting that data. By examining the features of the traffic — like the size, timing, and destination of the encrypted data — it is possible to uniquely identify certain web page visits or other information about what that traffic contains. (I would be very surprised if ISPs are engaging in such practices today. But they could, if they tried.)
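To give a feel for how this kind of inference works, here is a toy sketch. It compares the sizes of encrypted responses in an observed visit against previously recorded "fingerprints" and picks the closest match. Everything in it is hypothetical: the page labels, the byte counts, and the simple distance measure are invented for illustration. Published research relies on richer features (including timing) and on trained classifiers rather than this naive comparison.

```python
def distance(trace_a, trace_b):
    """Sum of size differences between two traces, compared in sorted order."""
    a, b = sorted(trace_a), sorted(trace_b)
    length = max(len(a), len(b))
    # Pad the shorter trace with zeros so both have the same length.
    a = [0] * (length - len(a)) + a
    b = [0] * (length - len(b)) + b
    return sum(abs(x - y) for x, y in zip(a, b))

def guess_page(observed, fingerprints):
    """Return the label of the recorded fingerprint closest to the observed trace."""
    return min(fingerprints, key=lambda page: distance(observed, fingerprints[page]))

# Hypothetical fingerprints: sizes in bytes of the encrypted responses
# seen when loading each page once (made-up numbers).
FINGERPRINTS = {
    "page-a": [1200, 5400, 30100, 880],
    "page-b": [1500, 7600, 2100],
    "page-c": [640, 640, 91000, 12000, 3300],
}

# An observed, still-encrypted visit with slightly different sizes.
observed = [1210, 5390, 29950, 900]
guess = guess_page(observed, FINGERPRINTS)
print(guess)
```

Nothing is decrypted at any point here; the guess rests entirely on the shape of the traffic.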

The bottom line is this: ISPs can see quite a bit about what we do online. As the FCC, industry groups, and privacy advocates consider the new broadband privacy rules, they should remember this basic fact.