Analyzing 20 Million FCC Net Neutrality Comments

Matt Miller
Sep 22, 2017

The FCC closed its request for comments on Docket 17–108, “Restoring Internet Freedom,” a few weeks ago. This proposal is basically the current administration’s attempt to end Title 2 protection of the internet as a common carrier utility, and it will likely lead to diminished net neutrality. With over 20 million comments submitted, I was curious whether there were any patterns in such a large dataset, particularly whether support for the repeal could be seen geographically (commenters can include their physical address). For example, are pro-Title 2 repeal comments coming from traditionally Republican areas of the country?

The vast majority of comments came from a form or bot submission. An organic comment here is one whose text is repeated fewer than 100 times in the corpus.

To start with, these comments are a mess. The FCC allows bulk and API comment submissions, and that is what they got. The vast majority of the comments came from a bot or a form submission. I’m basing this on the comment text itself. Only a little over 1 million comments were textually unique, meaning they were written out by a real live thinking human being. The rest are just the same comment submitted with a different name and address attached. In fact, over 50% of the total comments have text that is repeated over a million times.
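The duplicate counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this analysis: the function names, the whitespace and case normalization, and the sample comments are all assumptions. Only the 100-repeat cutoff comes from the definition of an organic comment used here.

```python
from collections import Counter

ORGANIC_THRESHOLD = 100  # "organic" = text repeated fewer than 100 times


def normalize(text):
    """Collapse whitespace and case so trivial variations count as one text (an assumption)."""
    return " ".join(text.split()).lower()


def count_texts(comments):
    """Return a Counter mapping each normalized comment text to its number of occurrences."""
    return Counter(normalize(c) for c in comments)


def split_organic(comments, threshold=ORGANIC_THRESHOLD):
    """Split comments into (organic, form_or_bot) lists by how often their text repeats."""
    counts = count_texts(comments)
    organic, form_or_bot = [], []
    for c in comments:
        if counts[normalize(c)] < threshold:
            organic.append(c)
        else:
            form_or_bot.append(c)
    return organic, form_or_bot


if __name__ == "__main__":
    # Hypothetical sample: one form letter submitted 150 times plus two one-off comments.
    sample = ["Please repeal Title II."] * 150 + [
        "As a librarian, I rely on an open internet.",
        "Net neutrality keeps my small business online.",
    ]
    organic, form_or_bot = split_organic(sample)
    print(len(organic), len(form_or_bot))
```

Matching on exact (or lightly normalized) text is deliberately crude: a form letter with a personalized sentence appended would count as unique, so the organic count is best read as an upper bound.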

Almost all of the comments submitted had text that was duplicated thousands to millions of times.

I really did not want to get into the validation game; trying to figure out whether a comment is pro-repeal or against is difficult enough. So I took the data as is, with one exception. There were over 7.5 million comments with the exact same text: “I am in favor of strong net neutrality under Title II of the Telecommunications Act.” They all had fake email addresses and fake physical addresses, all non-existent street addresses. They were all submitted in huge batches at the exact same time over a couple of weeks. They were painfully fake, so I removed them, which also removed my confidence that I would get anything interesting from this data.
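As a rough sketch of that one exception, the snippet below drops every record with that exact text and counts how many of them share each submission timestamp, which is one way to see the batching. The record layout (dicts with `text` and `submitted` keys) and the function names are hypothetical, not the format of the FCC data or the code actually used.

```python
from collections import Counter

FAKE_TEXT = (
    "I am in favor of strong net neutrality under Title II "
    "of the Telecommunications Act."
)


def drop_exact_text(records, text=FAKE_TEXT):
    """Return the records whose comment text does not exactly match `text`."""
    return [r for r in records if r["text"].strip() != text]


def batch_sizes(records, text=FAKE_TEXT):
    """Count how many matching comments share each submission timestamp."""
    return Counter(r["submitted"] for r in records if r["text"].strip() == text)


if __name__ == "__main__":
    # Hypothetical records; the timestamps are made up for illustration.
    records = [
        {"text": FAKE_TEXT, "submitted": "2017-07-19T12:00:00"},
        {"text": FAKE_TEXT, "submitted": "2017-07-19T12:00:00"},
        {"text": "Please keep Title II in place.", "submitted": "2017-07-20T09:30:00"},
    ]
    print(len(drop_exact_text(records)))
    print(batch_sizes(records).most_common(1))
```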

I needed to classify the comments as pro- or anti-repeal. This would be difficult, except that there were literally no pro-repeal comments that were not from a bot or form submission. Out of the million unique comments (those occurring only once in the entire corpus), I could not find a single pro-Title 2 repeal comment that looked like it was written by a person. The pro-repeal folks had their form/bot game on point, submitting millions of comments that said the same thing but had a real person’s credentials attached. I compiled some examples of these comments. My favorite, submitted only one hundred thousand times by different people:

Rapacious Silicon Valley monopolies like Amazon, Twitter and Netflix are now openly partnering with neo-Marxists like Free Press and Fight for the Future to launch phony astroturf campaigns to prevent the rollback of President Obama's 2015 internet takeover. What frightens Americans isn't the rollback of already outdated rules aimed at Silicon Valley's competitors, but rather the complete takeover of the internet by this same handful of leftist companies and their radical leftist allies. These companies are not only censoring our viewpoints, blocking users and competitors online, prioritizing their own services, and destroying our online privacy, they are now even using their unrivaled corporate influence and greed to destroy our news media and free expression. It's time to rollback Obama's disastrous rules designed only to give Silicon Valley free reign over our internet and bolster their monopoly gatekeeper status. If any business sector in America today needs rules, it's Silicon Valley's gluttonous monopolies that are destroying our internet. Please rollback Obama's government takeover of the internet before our free and open internet becomes Amazon, Facebook and Google's private property.Also we are sick and tired of all the CONTROL FREAK elite loons that want to to dominate the people of the UNITED STATES, and for the record, know that the masses WILL NOT TOLERATE this treason against us. Be smart, and go with the people you are supposed to be serving, WE THE PEOPLE.

The anti-repeal folks also had some bot/form submissions, but far fewer, and the difference is that these were not the only type of comment submitted. Reading some of the 1 million unique anti-repeal comments written by teachers, small business owners, librarians, students and others was the best aspect of this project.

Fully acknowledging that this is a problematic dataset, I returned to my original question of geography. Out of all the data, I was able to pull out about 7.5M pro-repeal and 4M anti-repeal comments that had a valid US zip code. Orange is pro-repeal, blue is anti-repeal:

Density of pro-repeal (orange) and anti-repeal (blue) comments by zip code

The more comments that originated from a zip code, the darker it is. You’ll notice that around metropolitan areas, for example New York, there is more anti-repeal blue filling in. Otherwise the maps are fairly similar: both are densest around population centers.
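A minimal sketch of the per-zip tally behind a map like this might look as follows. Everything here is an assumption for illustration: the record keys (`zip`, `position`), the `pro`/`anti` labels, and the log-scaled shading are not taken from the original analysis. The check is also only a format check (five digits, optionally ZIP+4); it does not confirm that the zip code actually exists.

```python
import math
import re
from collections import Counter

ZIP_RE = re.compile(r"(\d{5})(?:-\d{4})?")


def clean_zip(raw):
    """Return the 5-digit zip if `raw` looks like a ZIP or ZIP+4, else None."""
    if raw is None:
        return None
    m = ZIP_RE.fullmatch(str(raw).strip())
    return m.group(1) if m else None


def counts_by_zip(records):
    """Tally comments per zip code, separately for each position."""
    counts = {"pro": Counter(), "anti": Counter()}
    for r in records:
        z = clean_zip(r.get("zip"))
        if z and r.get("position") in counts:
            counts[r["position"]][z] += 1
    return counts


def shade(count, max_count):
    """Map a count to a 0..1 darkness on a log scale (the log scale is an assumption)."""
    if count <= 0 or max_count <= 0:
        return 0.0
    return math.log1p(count) / math.log1p(max_count)


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    records = [
        {"zip": "10001", "position": "anti"},
        {"zip": "10001-1234", "position": "anti"},
        {"zip": "9021", "position": "pro"},  # malformed, skipped
        {"zip": "73301", "position": "pro"},
    ]
    counts = counts_by_zip(records)
    busiest = max(counts["anti"].values())
    for z, n in counts["anti"].items():
        print(z, n, round(shade(n, busiest), 2))
```

A log scale is one way to keep a handful of very dense zip codes from washing out everything else; a linear scale would work the same way with `count / max_count`.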

I also looked at the state level, for comments that had valid state codes:

Likewise I looked at email providers, to see if there might be anything interesting there:

For both, nothing really intriguing jumps out: a lot of people use Gmail and live in California.
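For anyone who wants to reproduce these two tallies, a rough sketch is below. The record keys (`state`, `email`), the function names, and the decision to count the 50 states plus DC are assumptions, not details from the original analysis. The email check is intentionally loose: it only pulls out the part after the `@`.

```python
from collections import Counter

# 50 states plus DC; territories are left out here (an assumption).
US_STATES = set(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS MO "
    "MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY DC".split()
)


def email_provider(address):
    """Return the lowercased domain of an email address, or None if it looks malformed."""
    if not address or address.count("@") != 1:
        return None
    local, domain = address.strip().lower().split("@")
    if not local or "." not in domain:
        return None
    return domain


def tally(records):
    """Return (states, providers) Counters for records with usable values."""
    states, providers = Counter(), Counter()
    for r in records:
        state = (r.get("state") or "").strip().upper()
        if state in US_STATES:
            states[state] += 1
        provider = email_provider(r.get("email"))
        if provider:
            providers[provider] += 1
    return states, providers


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    records = [
        {"state": "ca", "email": "someone@gmail.com"},
        {"state": "CA", "email": "another@Gmail.com"},
        {"state": "ZZ", "email": "not-an-email"},
    ]
    states, providers = tally(records)
    print(states.most_common(3))
    print(providers.most_common(3))
```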

As I mentioned earlier, the best comments are those unique one-offs written by real people. There are some written by librarians, and some that got pretty creative with emojis. You can download all of these unique, anonymized anti-repeal comments here (70 MB). And you can download all the classified comments here (100 MB).

It seems likely that, regardless of these public comments, Title 2 protection will be removed by this administration. But it is not over yet. Hopefully net neutrality will prevail and we can keep Chet happy: