Why the NSA’s Incidental Collection under Its Section 702 Upstream Internet Program May Well Be Bulk Collection, Even If The Program Engages In Targeted Surveillance

May 7, 2016

Section 702 surveillance, which comprises the Prism and Upstream 702 programs, is still shrouded in mystery, despite the Intelligence Community’s post-Snowden pledge to provide more and better transparency.[i] A lengthy 2014 study by the Privacy and Civil Liberties Oversight Board (PCLOB), declassified legal documents and three DNI transparency reports reveal nothing about the total number of communications the NSA acquires through the Section 702 program, nor about how many American and foreign bystanders might get swept up in the 702 surveillance net. The government consistently has presented Section 702 of the FISA Amendments Act, which targets selected foreigners outside the United States, as a model of targeted, discriminate, selector-based surveillance, the exact opposite of indiscriminate bulk collection. The relatively small “estimated number of targets” affected by Section 702, listed in the DNI’s annual transparency reports (89,138 targets in 2013; 92,707 in 2014; 94,368 in 2015), is cited as proof that the NSA doesn’t broadly access customers’ data, but only collects “a minuscule fraction of the over 3 billion Internet users worldwide.” On this account, over-collection of domestic communications happens rarely, is merely incidental, and hence lawful, as ODNI General Counsel Robert Litt has argued.

Yet, despite the alleged relative smallness and contained nature of Section 702 surveillance, the NSA seems unable to indicate how many United States persons are affected by the program. The NSA and ODNI have invoked technical challenges and privacy concerns to ward off information requests by Senators Ron Wyden and Mark Udall, as well as by a coalition of civil liberties organizations, who charge that the program provides loopholes for the warrantless searches and seizures of the full content of U.S. communications, in violation of Americans’ Fourth Amendment rights. With Section 702 surveillance up for reauthorization in 2017, 14 members of the House Judiciary Committee, fed up with the stonewalling, on April 22, 2016 finally set the ODNI a deadline, demanding that it provide a rough public estimate by May 6 of the “number of communications or transactions involving United States persons subject to Section 702 surveillance on an annual basis.”

There’s reason to believe that the NSA may well not know exactly how many communications of American citizens and residents it incidentally acquires, at least not under the Upstream Internet collection program of Section 702 surveillance, which is so unwieldy and broad that the agency can’t keep tabs on it. A declassified FISC opinion of October 3, 2011 and other evidence suggest that the incidental collection of unauthorized and irrelevant communications of American and foreign non-targets under the Internet section of Upstream 702 is so expansive that it in all likelihood amounts to “bulk collection through targeting.” The NSA has hidden the scale and scope of the substantial over-collection in plain sight, through the use of a euphemism: MCT, or multi-communication transaction, a term that the NSA, judging from the Snowden files released so far, doesn’t use internally. Instead, the term MCT shows up in a legal context, in the NSA’s interactions with the FISC, the secret court that oversees it, as the agency seeks to defend the legality of its considerable over-collection.

Section 702 of the FISA Amendments Act (FAA) authorizes the NSA to conduct the court-sanctioned full take of selected foreigners’ communications as these travel across the U.S. part of the Internet backbone (the 702 Upstream program), or are stored downstream on the U.S. servers of American Internet Service Providers (the Prism program). In keeping with the Fourth Amendment, no American citizens or residents, so-called United States persons (USPs), may be targeted, and the selected foreigners (non-USPs), who are targeted because they’re in contact with “foreign intelligence information,” may not reside on U.S. soil. While wholly domestic messages (domestic-to-domestic traffic) may not be intercepted, one-end-domestic communications are considered foreign and may be acquired to track the connections of selected foreign targets to the United States. As such, Section 702 surveillance targets foreign individuals, groups or entities about whom “an individualized determination has been made,” to quote the PCLOB report, but no individual warrants or probable cause are required to intercept the content of their communications, as foreigners off U.S. soil are not protected by the Fourth Amendment. American ISPs and telecommunication firms are compelled to cooperate with the government in providing access to the full content and metadata of the Internet and phone communications of selected foreign targets. These are targeted by means of NSA-approved “hard selectors,” such as phone numbers or email addresses, which are “tasked,” meaning that the NSA sends the identifiers to the data selection engines stationed at the ISPs and telecoms for the real-time (“live”) collection of traffic or the data mining of stored messages.

Under Prism, the NSA, with the technical help of the FBI, is said to acquire “to-and-from communications,” sent and received by a target, whose content can be as diverse as videos, chats, text documents, or VOIP, as a Prism slide shows. But the NSA uses a far wider surveillance net in Upstream Internet collection, where it also extracts “about the target” communications from the Internet backbone, using deep-packet inspection (DPI) techniques. According to the government’s routine account of this “about collection,” communications containing the tasked selector are intercepted when, for example, the target’s email address or phone number is detected in the body or subject line of a non-target’s email, including those of Americans communicating with foreign targets. However, both the PCLOB report and two FISC opinions indicate that the NSA engages in several different categories of “about collection,” some of which are more privacy-invasive than others, but whose technical details remain classified. The broadness of the “abouts collection” has been one of the program’s flashpoints for privacy advocates. At the 2014 public hearings on Section 702 surveillance, NSA General Counsel Rajesh De admitted that the “likelihood of implicating incidental U.S. person communication or inadvertently collecting wholly domestic communications that therefore must need to be purged” is greater with abouts interception. ODNI counsel Litt, by contrast, seemed to want to contain the over-collection issue by focusing mainly on one-end domestic communications, suggesting that the incidental interception of U.S. persons’ communications is to be expected and not unique to Section 702, given that it’s logical that the communicants of targets would get scooped up.

While criticism of loopholes affecting American privacy rights in Section 702 surveillance is crucial, the Section 702 program primarily has been viewed through the “Fourth Amendment lens,” so that other issues, such as the over-all size of the program’s over-collection, as it impacts Americans and foreigners alike, by and large have remained unexamined. The evidence in a 2011 FISC opinion, suggesting that the Upstream Internet program of Section 702 amounts to bulk collection, mostly has been neglected, and so has the NSA’s use of the acronym MCT, central to this Internet surveillance program. The 2014 public hearings on Section 702 never focused on the problematic MCTs, and the term subsequently received too little attention in the PCLOB report, which, in its support of Section 702’s contributions to counterterrorism, underplays some of its shortcomings. To get a sense of the scale and nature of over-collection under Upstream 702 Internet interception, one must comb through the fine print of Judge Bates’ dense October 3, 2011 FISC opinion and tease out the hidden numbers, something the PCLOB report omitted to do as it tried to establish that Section 702 by no means amounts to bulk collection.

Director of National Intelligence James Clapper released Bates’ opinion in August of 2013, at the height of the Snowden crisis, as part of a trove of declassified documents, hoping it would provide a “testament to the government’s strong commitment to detecting, correcting, and reporting mistakes, and to continually improving its oversight and compliance processes.” The declassified opinion reveals how, in 2011, FISC Judge John D. Bates found out that the government consistently had mischaracterized the scope and nature of its Upstream 702 Internet collection program to the Court. It was the third time in fewer than three years that the government “disclosed a substantial misrepresentation regarding the scope of a major collection program,” as Bates dryly notes in his opinion. The NSA’s hidden acquisition of Internet communications had started in 2006, well before Section 702 became law through the 2008 FISA Amendments Act, yet it took until May of 2011 for the government to reveal the over-collection problem to the Court.

In 2011, the NSA officially acquired more than 250 million Internet communications through its entire Section 702 Internet surveillance program. Only 9% of these, or approximately 22.5 million Internet communications, were acquired through Upstream 702, which is why Bates at first calls it a relatively small program, compared to Prism, responsible for 91% of the Internet communications collected. But these official numbers, Bates now learned, were deceptive in light of the new revelations. The official Upstream 702 Internet communications had in fact been collected through the acquisition of millions of “Internet transactions,” “the sheer volume” of which was such that “any meaningful review of the entire body of the transactions is not feasible.” In the first half of 2011, the NSA acquired more than 13.25 million Internet transactions through Upstream 702, or an estimated 26.5 million Internet transactions that year. The decisive word here is transaction, not communication. During Upstream Internet collection, the NSA’s interception engines don’t just catch single communications, as the government says is the case in Prism, but they capture Internet transactions, whenever these include a reference to the tasked selector of the foreign target. Many of these Internet transactions, dubbed MCTs, consist not of discrete communications, or single communication transactions (SCTs), but of bundled multiple communications.
Such MCTs “may contain data that is wholly unrelated to the tasked selector, including the full content of discrete communications that are not to, from, or about the facility tasked for collection.” Not only had the NSA been hiding its over-collection, but it now appeared the agency also knowingly intercepted Internet communications that weren’t even relevant to the authorized intelligence mission, in other words, communications of simple bystanders, people who neither are in contact with the target nor communicate about the target’s tasked selector. Among the multiple Internet communications snatched up through MCTs were wholly domestic communications, in violation of statutory and constitutional provisions, as well as Internet communications of non-targets wholly unrelated to the target.

The over-collection, the government explained, resulted from “technological limitations” or “challenges” with the NSA’s collection equipment, which “significantly affect the scope of the collection.” Technical details in Bates’ opinion have been redacted, but there’s enough material to suggest that the technical problems were at least two-fold. The first problem applies to U.S. persons and isn’t typical of Upstream 702, but relates to geo-blocking I.P. filters or code that the NSA deploys to prevent the ingestion of wholly domestic Internet communications. Given that the Internet is organized for messages to take the cheapest or fastest route, such I.P. filters or rules won’t block domestic-to-domestic traffic that is routed internationally. However, the second problem, specific to Upstream 702 collection, affecting U.S. persons and innocent foreigners alike, arises from the deep-packet inspection (DPI) technology implemented to intercept Internet packets that match tasked selectors. Rather than being precise and focused, the NSA’s Upstream 702 collection engines broadly capture clusters of Internet packets, not because all the packets entail tasked selectors, but because they happen to travel the same path as a communication that contains a tasked selector. Instead of selecting single communications matching tasked selectors, the NSA’s Upstream 702 collection devices, under certain redacted circumstances, ingest far too much, when the Internet transaction “contains a targeted selector anywhere within it.” Crucially, the broad capture of data doesn’t just take place during “about the target collection,” but happens even during to-and-from interception, in other words, across the entire range of Upstream 702 Internet collection activities. According to the NSA, technical malfunctions are to blame, as the agency’s equipment isn’t able to distinguish between SCTs and MCTs when packets are being snatched from the Internet backbone. 
The MCT problem doesn’t affect Prism, where other collection methods are deployed to acquire discrete communications, according to the NSA.

Bates wasn’t just troubled by the over-collection under Upstream 702 but also by the NSA’s tendency toward “over-retention.” The NSA was in the habit of hanging on to the by-catch, he noted, instead of purging U.S. persons’ communications or irrelevant communications, wholly unrelated to the foreign target and foreign intelligence mission. However, Bates only worried about how the over-collection might impact American citizens and residents, in keeping with U.S. jurisprudence, which traditionally only focuses on the rights of U.S. persons. When the NSA presented him with a manual review of a statistical sample of Internet transactions, all Bates wanted to know was how many wholly domestic messages might have gotten scooped up. Based upon a random sample of 50,440 transactions, taken from the 13.25 million Internet transactions acquired during the first half of 2011, Bates calculated that, under Upstream 702, the NSA in one year might have ingested between 2,000 and 10,000 MCTs that contained at least one wholly domestic communication, a number augmented by approximately 46,000 wholly domestic “about” SCTs, which the NSA intercepts when domestic communications are routed internationally, eluding geo-blocking filters. The number of wholly domestic communications was small in relative terms, Bates ruled, but large in absolute terms.

Bates accepted that the NSA was incapable of changing its interception technology. It was not the collection methods that needed to change, but the minimization procedures, which define how the agency is to handle raw data and disseminate intercepted communications to minimize privacy intrusions resulting from its collection activities. The NSA agreed to sequester MCTs that contained wholly domestic messages upon recognition, and Bates ruled that the retention period for the entire Upstream 702 acquisition would be reduced from five to two years, because of the heightened risk of privacy violations for USPs, showing once again no concern for the privacy of foreign bystanders entangled in the Upstream 702 surveillance net. The agency also pledged not to conduct U.S. person queries in the Upstream Internet collection, indicating that the number of U.S. persons’ communications in the Upstream 702 databases was large enough for the risk of mining unauthorized U.S. data to be real, not hypothetical. However, because of Fourth Amendment violations present in past Upstream 702 acquisitions, the NSA eventually decided to purge all Upstream 702 data it could reasonably recognize as having been collected prior to November 2011, when the new minimization procedures, meant to deal with the wholly domestic communications, came into effect. After the NSA had adjusted its minimization procedures, Judge Bates finally legalized the previously unlawful over-collection, turning it into the legally acceptable “incidental” collection of U.S. persons’ communications. (Since then, the NSA has placed limits on the analysts’ use of foreign non-target communications in certain MCTs, but this is a matter of policy, not of law, and the NSA is still in the process of devising rules for what it will do in the future with some of its Upstream 702 Internet by-catch.)

Crucially, Bates’ exclusive concern with the rights of U.S. persons allowed him to skirt the question of whether the over-collection in general, affecting Americans and innocent foreigners alike, was reasonable and proportionate. To get a sense of how large the Internet over-collection under Section 702 overall might have been in 2011, one must once again turn to the MCTs and unpack the hidden numbers. The NSA established that 90% of the statistical sample consisted of SCTs, while 10% were MCTs. Applied to the estimated 26.5 million Upstream Internet transactions of 2011, this means that a conjectured 2.65 million Upstream 702 transactions were MCTs, while 23.85 million were SCTs. On the face of it, the SCTs outnumber the MCTs, but everything depends on the potential size of MCTs. Do MCTs on average consist of 3, 5, 10, 20, 100 or more bundled Internet communications?
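
The arithmetic behind these figures can be retraced in a few lines. The sketch below is only an illustration: the variable names are mine, and doubling the half-year count assumes that collection in the second half of 2011 ran at the same pace as in the first, as the yearly estimate does.

```python
# Retracing the estimates drawn from the 2011 Bates opinion.
# Assumption: the second half of 2011 matched the first half.
half_year_transactions = 13_250_000
annual_transactions = 2 * half_year_transactions

# Shares found in the NSA's statistical sample, in percent.
sct_percent = 90
mct_percent = 10

# Integer arithmetic keeps the results exact.
sct_count = annual_transactions * sct_percent // 100
mct_count = annual_transactions * mct_percent // 100

print(f"Estimated transactions in 2011: {annual_transactions:,}")
print(f"Estimated SCTs: {sct_count:,}")
print(f"Estimated MCTs: {mct_count:,}")
```

The counts say nothing yet about how many communications each MCT holds, which is the open question.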

Unfortunately, Bates never ordered a study of the average size of MCTs and never wondered what the percentage of irrelevant communications, devoid of, or wholly unrelated to, a tasked selector, might amount to. All that exists as a practical example of an MCT is an account by ODNI counsel Litt. In the summer of 2013, during a background conversation with journalists about Bates’ opinion, Litt described what might happen to an HTTP-based webmail download, consisting of 15 communications.

“There is a certain kind of communication, MCT, where there are several communications bundled together. One example is a webmail email account. Like when you open your account, you will get a screenshot of some number of emails that are sitting in your inbox, date, sender, subject, line, size. You may get 15 at a time. Those are all transmitted across the Internet as one communication at one time, even though there are 15 separate emails mentioned in them. And for technological reasons, the NSA was not capable, and still is not capable, of breaking those down into their individual components. You have a situation, where one of those emails may have referenced your targeted email, but you’ve nonetheless collected the whole inbox list together. It’s like a screenshot. You don’t get the whole email; you get whatever is popping up on your screen at the time, that comes as one communication.”

During the download of webmail, as a user interacts with a server, the NSA may capture all the packets in the transaction, if one email references a tasked selector. While Litt here briefly references the technical imprecision in Upstream 702 Internet interception, as devices scoop up unrelated communications, he swiftly moves on to the case of U.S. persons and how they might get caught in the net, indirectly suggesting that this is primarily a factor of communicating with the target.

“On occasion, some of those [communications] might prove to be wholly domestic, for example, if you are targeting a foreign person and that person is in communication with a U.S. person, you are going to get, you can get all of that U.S. person’s screenshots. So there may be other communications, which are between U.S. persons which are wholly domestic communications that we are not allowed to collect under section 702.”

Litt’s explanation thus conflates two distinct issues. A recently declassified 2015 FISC opinion, by Judge Thomas F. Hogan, confirms that communications of U.S. persons and, by extension, of innocent foreign non-targets (Hogan, like Bates, is not concerned about them) can get swept up in two ways: because the non-target communicates with a target and/or references a tasked selector, or simply because a non-target communication “unavoidably” becomes part of an Internet transaction that references or contains a tasked selector. In other words, the second condition need not be dependent on the first.

Even if the glitch of the MCT happens only “on occasion,” to quote Litt, say 1 in 10 times, it results in sizeable over-collection. If one follows Bates’ method of conjecturing estimates by means of the statistical sample and multiplies the 2.65 million MCTs of 2011 by 15 (the number of communications in Litt’s webmail example), the result is that the MCTs contain 39.75 million communications. That sum isn’t only larger than the number of SCTs (23.85 million) but is also almost double the official number of 22.5 million single Upstream 702 Internet communications that the NSA initially declared to the FISC for calendar year 2011. To be sure, the average number of bundled Internet communications in such MCTs could be smaller or considerably larger than 15. And we regrettably have no sense of the ratio of relevant to irrelevant data, since Bates failed to order a study of the ratio of hits to non-hits, that is, of communications that match the tasked selector as opposed to irrelevant communications that don’t contain a tasked selector and/or don’t respond to a tasked selector.
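
How sensitive this estimate is to the unknown bundle size can be shown with a short sketch. The bundle sizes below are hypothetical: they are taken from the question raised earlier, plus Litt’s figure of 15, and none of them is a measured average. The function name is my own.

```python
# Sketch: communications implied by the estimated 2011 MCTs for
# several hypothetical average bundle sizes. Only 15 comes from
# Litt's webmail example; the other sizes are illustrative guesses.
MCT_COUNT = 2_650_000           # 10% of 26.5 million transactions
SCT_COUNT = 23_850_000          # 90% of 26.5 million transactions
OFFICIAL_UPSTREAM = 22_500_000  # communications reported for 2011


def communications_in_mcts(avg_bundle_size: int) -> int:
    """Total communications if every MCT held avg_bundle_size messages."""
    return MCT_COUNT * avg_bundle_size


for size in (3, 5, 10, 15, 20, 100):
    total = communications_in_mcts(size)
    ratio = total / OFFICIAL_UPSTREAM
    print(f"{size:>3} per MCT -> {total:>11,} communications "
          f"({ratio:.2f}x the official figure)")
```

With 15 communications per MCT the total is 39,750,000, the figure used above. On these assumptions, an average of 10 would already put the MCT contents above both the SCT count and the officially reported figure.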

But even without such conjectures about estimates, it seems clear that Bates’ surprise at the scope of the hidden Internet acquisition under Upstream 702 points to the substantial over-collection of unauthorized and irrelevant content data. The NSA’s explicit admission to Bates that technological limitations “significantly affect the scope of the collection” likewise points to significant numbers of technically induced over-collected communications, and, frankly, so does Litt’s own webmail example, which suggests that entire inboxes of non-targets, wholly unrelated to the target, may get gobbled up. In addition, webmail is just one example of a series of classified technical situations in which such over-collection occurs. Clearly, none of this matches the narrative of smart targeting under Section 702 that the NSA has been promoting publicly, nor does it fit with the PCLOB’s conviction that Section 702 doesn’t amount to bulk collection, simply because it targets individuals, groups or entities about whom an individualized determination has been made.

Although the PCLOB report criticizes shortcomings in Upstream 702 collection, the board fails to analyze the substantial over-collection through MCTs, having decided that it is satisfied with the status quo in light of the program’s counterterrorism benefits and NSA’s current technical capability. But the PCLOB’s over-all advocacy of the program for counterterrorism purposes at times results in muddled analysis. The watchdog board fails to explain fully the discrepancy between officially collected Internet communications and unofficially acquired Internet transactions, so much so that the report, which is cited as an authoritative text in surveillance court cases, has led to consequential misinterpretations of the program. One oft-cited passage in particular has caused confusion.

“To identify and acquire Internet transactions associated with the Section 702-tasked selectors on the Internet backbone, Internet transactions are first filtered to eliminate potential domestic transactions, and then are screened to capture only transactions containing a tasked selector. Unless transactions pass both these screens, they are not ingested into government databases.” (Italics added)

This passage fails to remind the reader, at this crucial point, that the I.P. filters don’t work for internationally routed domestic traffic. Worse, the final sentence, claiming that only transactions passing both screens make it into government databases, is nonsensical, if not disingenuous, given that MCTs contain plenty of non-target by-catch that should never make it into government databases in the first place. The confusing passage has been misquoted in two recent court rulings about Upstream 702, in which the judges, swayed by the government’s narrative that Upstream 702 is precise and targeted, advocate the factual falsehood that only communications not blocked by the double screen are retained, as they, ironically, object to the plaintiffs’ lack of knowledge of the program. In the Jewel case, Judge White simply alters the passage, replacing “transactions” with “communications,” to claim that “Internet communications are filtered in an effort to remove all purely domestic communications and are then scanned to capture only those communications containing the designated tasked selectors. … ‘Unless [communications] pass both these screens, they are not ingested into government databases.’” In the Wikimedia case, Judge Ellis III in his October 23, 2015 memorandum opinion rewrites another quote from the PCLOB report, leaving out the word “transactions,” to charge that “[o]nly those communications … that contain a tasked selector go into government databases.” The phrase is a misquotation of a confusing line in the PCLOB report, which reads: “Only those communications (or more precisely, ‘transactions’) that contain a tasked selector go into government databases.”

It is one of the myths mainstreamed by the NSA, but also by the GCHQ, that selector-based surveillance is the opposite of bulk collection. The binary logic promoted by the agencies assumes that because a selector is a discriminant, surely, collection activities based on selectors can’t be indiscriminate. In the UK, the GCHQ in particular has been successful in persuading the former Intelligence and Security Committee of Parliament (ISC) of this argument, but advocates of the Section 702 program have embraced a similar narrative. In reality, bulk collection via selectors may happen in Upstream 702 Internet collection as data in the physical vicinity of communications with a tasked selector are also scooped up. There doesn’t need to be an actual link to a target; digital proximity to a target can be enough. Typically, bulk collection refers to the indiscriminate bulk capture of raw data, when everything passing a fiber-optic cable, switch or router is blindly intercepted, without the use of discriminants. But this is only one possible definition of bulk collection, albeit the better known one. There exists a second definition of bulk collection (or bulk acquisition), which, ironically, shows up in a government report on bulk collection, though the report fails to spell out its implications for Upstream 702. (All it is allowed to provide are hypothetical examples.) According to this second definition, bulk collection occurs when “a significant portion of the data” in a database “pertains to identifiers not relevant to current targets.” Bulk collection, then, can occur even as a consequence of targeted collection activities, when sizeable numbers of communications of non-targets, wholly unrelated to the target, are intercepted, meaning that their communications neither reference the target nor are a reply to a target communication.
This second kind, which I would call “bulk collection through targeting,” happens when collection equipment broadly captures data beyond the target, as seems to be the case with the NSA’s imprecise Upstream 702 interception devices, or when tasked selectors are defined too broadly, as when I.P. ranges or I.P. addresses of computer servers rather than individual email addresses are tasked and substantial volumes of data are ingested. We know that the NSA, contradicting the routine examples it uses to suggest it only targets individualized selectors, targets broad selectors, as confirmed by Snowden files and a 2014 Washington Post article dedicated to a Section 702 trove of data, which the newspaper received from Edward Snowden. The Washington Post article doesn’t tackle the technical issue of MCTs, nor does it explain the ratio between Prism and Upstream data in the Hawaii 702 trove. However, it provides circumstantial evidence for the thesis that Section 702 doesn’t just map the contact chaining between targets and non-targets or “about the target” collection, as critics and counsel Litt were quick to charge when the article first appeared, but that it scoops up substantial chunks of communications of “digital bystanders.” Still, in its public communications about Section 702, the NSA successfully has deflected attention from the hidden bulk collection in Upstream Internet collection by promoting the stark opposition between targeted and raw bulk collection. True, under Section 702 the NSA doesn’t engage in the blanket dragnet collection of indiscriminate raw data, as it does under the expansive full-take bulk collection programs of foreign traffic that are sanctioned by the Transit Authority and Executive Order 12333. Nevertheless, a careful reading of the 2011 Bates opinion shows that the Upstream 702 Internet program amounts to bulk collection of the second type, bulk collection through targeting, affecting Americans and foreigners alike.
In public, the practice has virtually gone unnoticed, thanks to the under-examined euphemism MCT.

To this day, the NSA continues to use imprecise interception technology in Section 702 Upstream Internet targeting to catch more than it needs. Judge Hogan’s November 6, 2015 FISC opinion provides evidence that the “unavoidable” capture of non-targets through MCTs goes on, and so does a recent PCLOB “Recommendations Assessment Report,” published on February 5, 2016. Although Bates and the PCLOB have encouraged the NSA to improve its technology to ensure that 702 Internet acquisitions are limited to targeted communications, the NSA’s most recent internal review concluded that there exist no technological alternatives to its collection devices right now. Of course, Bates’ 2011 legalization of the over-collection under Upstream 702 practically means that there’s no incentive for the NSA to be more precise in its collection methods. The NSA continues to collect broadly; any restrictions are supposed to come from the minimization procedures, according to the FISC. Moreover, the 2011 Bates ruling also discloses an odd logic, according to which the NSA can first design interception technology that collects too broadly, and then subsequently justify its broad collection by invoking the “irresistible force,” or “force majeure,” of its technology. The NSA is at the mercy of its own “unavoidable” technology; it’s all it has got to work with for the time being. Added to this is the fact that the NSA conducts its own review to determine what its technological alternatives are, without external oversight, which is why whistleblower Bill Binney recently has called for the technical monitoring of the NSA’s technology and interception apparatus. Furthermore, a passage from the Washington Post’s article on Snowden’s Hawaii Section 702 trove indicates that the over-collection under Section 702 might not just result from technical imprecision in its Upstream Internet section or analysts’ prudence.
In the passage, an analyst manually scoops up the communications of every person posting in a chat room, “regardless of subject,” including “digital bystanders,” as well as passive lurkers and browsers of the URL. The practice reflects a policy in keeping with the over-collection and over-retention attitudes that Judge Bates criticized in his 2011 opinion, two years before the NSA’s “collect-it-all” philosophy was publicly exposed by the leaked Snowden files.

For all these reasons, it doesn’t seem prudent for the House Judiciary Committee to ask the ODNI for the number of “transactions” affecting U.S. persons in NSA 702 collection activities. Reacting to the Committee’s demand for a “rough estimate,” James Clapper on April 25, 2016 indicated that his office is “looking at several options right now,” but he persisted in the narrative that disclosures about the numbers of U.S. persons affected by Section 702 might involve investigative methods that are too privacy-invasive. To ask the ODNI, then, for a number of “transactions” might give the agency the opportunity to hide the real numbers of the over-collection of American and foreign communications behind the euphemism MCT. The first thing that the NSA may need to clarify is whether its Internet collection under Upstream 702 is a bulk collection program in disguise, for which the term “incidental collection” is a misnomer.

Postscript: The Numbers in Bates’ October 3, 2011 FISC Opinion

1/COMMUNICATIONS

Total collection of Internet Communications under Section 702 in 2011: more than 250,000,000 communications

91% of these are PRISM Internet Communications or 227,500,000

9% of these are UPSTREAM Internet communications or 22,500,000

2/TRANSACTIONS

Estimated collection of Internet transactions under Upstream 702 first half 2011: 13,250,000

Estimated Internet transactions under Upstream 702 for entire 2011: 26,500,000

3/SCTs AND MCTs (Single Communication Transactions and Multiple Communication Transactions)

Distribution of SCTs and MCTs among 26,500,000 Internet transactions based on NSA statistical sample is 90% SCTs and 10% MCTs

10% MCTs for 2011 equals 2,650,000 MCTs

90% SCTs for 2011 equals 23,850,000 SCTs

Unknown: the average number of communications ensconced in MCTs

4/ESTIMATED NUMBER OF ACTIVE USERS OF MCTs

Foreign target, outside U.S.A: 300,000–400,000 MCTs

U.S.P non-targets, inside U.S.A.: 7,000–8,000 MCTs

Foreign non-targets, outside U.S.A.: 1,300,000–1,400,000 MCTs

Unknown: 97,000–140,000 MCTs

[i] Several authors and privacy advocates have provided invaluable work that has helped to lift the veil on the legal technicalities of Section 702 surveillance, especially Laura K. Donohue, the lawyers and investigative teams of the EFF, Jameel Jaffer of the ACLU, Rachel Levinson-Waldman of the NYU Brennan Center, as well as Senators Wyden and Udall, to name but a few. I hope to contribute to the ongoing debate about Section 702 surveillance by showing that the Internet program under Section 702 amounts to bulk collection of the second kind, “bulk collection through targeting.” I focus on collection problems in the Internet part of Upstream 702 (not on the telephony section or on Prism) and, at this point, don’t discuss minimization issues or U.S. person queries. This article is the shortened version of a longer piece on Section 702 surveillance and forms part of a book-length study on privacy and surveillance. The article was completed on April 29, 2016, then edited and updated to reflect the release of the DNI’s third transparency report on May 2, 2016.

To contact the author, write to “bhanssen.author” at gmail

Written by B Hanssen