On People and Service in Turbulent Environments
While I am currently Executive Director for the People-Centered Internet coalition, previously for four years I served as digital diplomat and “human flak jacket” as CIO for the FCC, first under Chairman Wheeler and then briefly under Chairman Pai before Vint Cerf invited me to join PCI. This service was in a non-partisan capacity.
Apparently today (05 June) a Gizmodo reporter wrote an article asserting that claims of a denial of service at the FCC back in May 2017 were untrue. I found out about this not because the reporter had contacted me but because a friend let me know. As of this writing the reporter has not contacted me for comment, nor did he before he wrote the story, which is disappointing. Today I was having lunch with a former FCC friend (a Democrat who also served under Chairman Wheeler), who shared the article when it popped up in his feed.
First, let me acknowledge there is a lot of angst about the future of Net Neutrality (or lack thereof) and I understand the frustration. I joined Vint Cerf and the Team at the People-Centered Internet coalition because I am concerned about the future of the Internet and ensuring it is more people-centered.
Second, I support freedom of information requests and am glad to live in a country where these can be made. At the same time, it is disappointing to read an article that misinterprets emails and says I have not “responded to requests for comment” when, as of typing this, I haven’t received an email or phone request from the reporter who wrote the article. Again, this first popped up in the middle of lunch with a former FCC friend today.
Third, whether the correct phrase is denial of service or “bot swarm” or “something hammering the Application Programming Interface” (API) of the commenting system, the fact is something odd was happening in May 2017. As background, the commenting system has an API that allows automated queries of previously submitted comments as well as automated submission of new comments. This was a feature requested by outside stakeholders in 2015 in lieu of web scraping the system. Also, though the Notice and Comment process is supposed to raise new, novel concerns and issues that the Commission must respond to (it is not a vote), it is troubling to know about the large number of fake comments. When I saw the former New York Attorney General’s press release in late autumn 2017 airing concerns that people’s identities may have been misused, I called that same day to offer what I could do to assist as a private citizen, because by then I had left the FCC. I made three different calls to help give background because the NY AG had not provided a direct number for the group investigating, yet eventually I did get a call back and we discussed the concerns. Where possible, and in a non-political way, I have done my best to help.
2017: Back in May 2017, my biggest concern was that actual people might not get to leave a comment if what appeared to be an influx of automated submissions precluded their ability to do so. In April 2017, an outside entity had told the FCC’s Help Desk they had 250,000 comments a day already prepared to submit for multiple days, which was quite out of the ordinary. As a reference, 250,000 was about as much as the highest amount we ever saw in *total* back in 2014 for an entire day of all comments when everyone was commenting on a different high-profile proceeding. This information made me highly concerned that if the comments the outside entity said they had already prepared were not submitted in bulk, such automated submissions using the commenting system’s API might deny system resources to legitimate individuals also trying to submit their own comments at the same time. I suggested to the FCC Help Desk that the outside entity submit the 250,000 comments/day they had prepared on a CD or through some other bulk mechanism, though later research by the IT Team suggested the entity still opted to submit them individually, using the API extensively.
Separately, a brief flood of 20,000 RSS queries in just 45 minutes was reported by the Team to have overwhelmed the commenting system on Monday, 01 May. This too was quite out of the ordinary and appeared to me to be another case of automated requests attempting to overwhelm the system, relative to the more normal patterns of human use of the comment system. Whether this was a non-traditional denial of service or an RSS flood that kept system resources from others seems to be a matter of semantics.
When the events of 08 May 2017 happened, my quick analysis compared the 35,000 API requests per minute we were receiving with the roughly 90,000 comments filed in the first half of the day (as of around 2pm), and that ratio was extraordinarily high and lopsided. The Team also relayed that the API requests were continuing to increase, so we were seeing at least 2 million API requests per hour around the middle of the day, yet not a similar number of comments being received. If folks want evidence, there were emails sent by the Team as their analysis of the logs observed the API requests per minute continuing to spike. As background, the normal API level observed by the Team was about 400 requests per minute. The very early morning of Monday 08 May saw that grow to 9,600 requests/min. It rapidly continued to grow to 14,000 requests/min. API servers were added multiple times, and it grew to 35,000 requests/minute (more than 2,000,000 API requests/hour).
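For readers who want to check the arithmetic behind these figures, here is a minimal Python sketch. It uses only the numbers quoted above; the variable and function names are illustrative assumptions and do not come from any FCC system or log format. The last line is a deliberately rough comparison, setting a single hour of requests at the 35,000/minute level against all comments filed in the first half of the day.

```python
# Illustrative arithmetic only; the numbers are the ones quoted in this post,
# and every name below is hypothetical (nothing here comes from an FCC system).
BASELINE_PER_MIN = 400                       # normal API level reported by the Team
OBSERVED_PER_MIN = [9_600, 14_000, 35_000]   # successive levels reported on 08 May 2017
COMMENTS_BY_MIDDAY = 90_000                  # comments filed in the first half of the day


def per_hour(requests_per_minute: int) -> int:
    """Convert a per-minute request rate into a per-hour rate."""
    return requests_per_minute * 60


def multiple_of_baseline(requests_per_minute: int, baseline: int = BASELINE_PER_MIN) -> float:
    """How many times larger a rate is than the normal baseline."""
    return requests_per_minute / baseline


for rate in OBSERVED_PER_MIN:
    print(f"{rate:,} req/min = {per_hour(rate):,} req/hour, "
          f"{multiple_of_baseline(rate):.1f}x the baseline")

# One hour of requests at the 35,000/min level, per comment filed in the first half of the day.
requests_per_comment = per_hour(OBSERVED_PER_MIN[-1]) / COMMENTS_BY_MIDDAY
print(f"about {requests_per_comment:.0f} API requests in one hour per comment filed by midday")
```

At 35,000 requests/minute this works out to 2,100,000 requests/hour, which is 87.5 times the 400 requests/minute baseline.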
Separate from actual people wanting to comment, I was concerned we were also being spammed by something automated performing the massive API requests. If this continued, it might deny system resources, at the application layer vs. the network layer, to actual people wanting to comment on the high-profile issue. This was my biggest concern.
Some have said the FCC lacked an analysis of that day. If they are expecting a full-blown report, then yes, there was none, because the entire Team was focused foremost on adding additional cloud-based API servers so that actual people could comment. I considered the quick assessment of the API requests per minute relative to actual comments received, combined with concerns that at least one entity had said they had 250,000 comments a day prepared for a proceeding and the RSS flood exactly one week prior, to be the analysis I did in the fluid situation. My entire focus throughout the entire turbulent time was on ensuring actual people could continue to comment. At the same time, we had been told we had to accept everything being submitted to the commenting system even if it might be spam traffic.
That Monday, 08 May, about 353,000 comments in total were received and the Team reported the API spiked at 60,000 requests per minute. After the statement raising concerns about something other than actual people wanting to leave comments, two days later we successfully received a similar total number of comments on Thursday, 11 May *excluding the bulk upload option*, without any reported issues and with a *lower* average number of API requests/minute than we observed on 08 May (as in significantly lower). The bulk upload option provided another 2x comments for 11 May 2017.
The entire non-partisan Team and I wanted to ensure individuals could leave a comment and their voices could be heard. I do think something odd happened in 2017 with the API being flooded abnormally on 08 May, beyond just people wanting to comment, and if not stopped this could have prevented actual people from commenting. The sudden drop in API requests the day after does suggest something beyond actual people wanting to comment was occurring. Whether the term could have been bot swarm or API spam flood or something else, I do believe the reported observations from 08 May supported the analysis of something odd happening that looked like a denial of service to the commenting system. Also, the Team and I were not in a position to determine real vs. non-real comments, and the system’s “open by design” approach, which included an API to accept automated submissions and queries, precluded use of a CAPTCHA test. The 2017 commenting window was extended an extra two weeks to ensure every person who wanted to comment could, and the commenting system was up and available an estimated 99.4% of the time during the entire Notice and Comment process. We were able to handle receiving more comments in the first 10 days of commenting in 2017 than in the first 110 days in 2014.
2014: The same article from 05 June that lacks the context of what occurred in May 2017 also dives into what happened in 2014, where again we were seeing an abnormally high number of record locks on the database, preventing new connections from being established. Back then, the security team reported abnormal HTTP requests, suggestive of web scrapers or “bots” hitting the website. Though the Team reported no security indicators that suggested a traditional DDoS on the network on 02 June 2014, the Team did report “Aggregate combination of excessive RSS requests that have Date Parameter Violations in their GET request + The increase in Net Neutrality comment uploads to the ECFS that are using unauthorized methods (GET instead of POST) are the possible cause of the perceived slowness”, which suggested automated web scrapers might be consuming excessive amounts of database resources and denying them to actual humans also wanting to comment using the database. Whether denial of service or poorly formatted web scrapers or something hammering the system, the primary concern I had was that this wasn’t just a technical issue on our end or a flood of lots of people. Rather, I was concerned that one or more automated processes were excessively locking up the database and preventing actual people from commenting.
The Team implemented various fixes on the fly throughout 2014, and the record deadlocks that prevented new connections from being established would periodically spike throughout the commenting period, including once in July at a time when most of the U.S. would be asleep. Also, there was another time in September when an IP address, located outside the U.S. and listed on Ars Technica’s top 10 spam addresses, flooded the commenting system with abnormally high traffic that coincided with record deadlock spikes, which returned immediately even after resets. That was clearly not normal and not just a technical issue on our side. The record deadlock spikes coincided with massive bot activity consuming the commenting system’s resources in abnormally high percentages. If folks want evidence that this happened, records from the summer of 2014 should exist in email form.
These events appeared to be denying service to actual people wanting to comment at the application layer vs. the network layer and could have been a web scraper run amok or something else. A Deputy CIO detailed from DHS to the FCC concurred that this would be considered an attack, though possibly unintentional. As for the Gizmodo article, it lacks the context of us observing the record deadlocks, web scrapers, and periodic spikes in high traffic (from outside the U.S.) that coincided with database denial of service effects. It also references an email where I used inaccurate shorthand, saying “the Chairman” for what in full context were a series of discussions between IT leadership and the Office of Media Relations under the Chairman at the time. Concerns about the risk of copycats came from the Deputy CIO and me at the time. During the discussions with the Office of Media Relations about what to share in 2014, the actual Chairman was not involved, and I take responsibility for the inaccuracy in the email.
To address the record deadlocks, we did implement a batch upload solution in 2014 for the commenting system, working with FCC’s Gigi Sohn, and later an API under Chairman Wheeler, to discourage web scrapers from consuming too many resources and potentially denying computing resources to actual people wanting to comment. Also, in 2014, in response to the high-profile proceeding, an Internet group proposed “Operation Netstorm” against the FCC, and we saw incorrect news reports claiming the FCC’s phones were flooded with too many calls about the proceeding, yet when we checked we could not find evidence this was occurring. I do still think there were odd events that denied resources to actual people wanting to access the commenting system in 2014 at the application layer, and there should be email records of these observations and concerning events. It was a turbulent time and we were trying to do our best to make sense of what was happening. The non-partisan Team members were simply trying to do the best possible given constrained circumstances to ensure actual people could comment.
Improving the Commenting Process: I was also told that back in the 1980s the FCC would occasionally be flooded with mimeographed comments for a high-profile proceeding. As the commenting process is “open by design”, perhaps the biggest lesson from 2017 is that we now need to update the original Notice and Comment process to operate better on the Internet and, hopefully, to be more people-centered in how people debate high-profile national issues of public concern online. The original process, which was based on the postal system, no longer fits our digital era. One simple step would be to turn off receiving automated and batch submissions using the API and to actively block web scrapers that consume excessive resources from the database or the system as a whole.
As noted, in the past outside third-party groups pushed the FCC to allow automated submissions, which prevented CAPTCHA tests of whether the submitter is an actual human (or not). If there is a common theme to 2017 and 2014, it is that the automation of submissions seems to be a problem if done excessively and can appear to deny service to actual humans. So now may be the time to revisit this, and other verification procedures may be needed, as long as such procedures remain inclusive and accessible to every actual person wanting to make a comment. Even CAPTCHA tests nowadays can be overcome by skilled image recognition techniques, and we have already seen examples this year where machines can sound like a human making a phone call.
One last closing comment: recognizing the importance of a more People-Centered Internet and the open process of Freedom of Information Act requests, I want to note there was a draft FCC blog post I wrote raising concerns about bots potentially crowding out humans on 09 May 2017. There were multiple times that summer, before I joined the People-Centered Internet coalition, where I made an effort to have this published by the Commission because of my concern that if the bot activity became too intense, it would deny service to actual people wanting to comment. I wish the reporter had included that draft blog post in his article today and/or actually emailed me to ask for a comment. For those of us who join public service as non-partisans, most of us do so out of a sense of service to the public that transcends the politics of each party. My focus throughout my time at the Commission was on serving people as best we could with the resources available in the turbulent environments we faced.
Hope this helps.