Last week the NSA’s bulk collection of Americans’ phone metadata was ruled illegal. In this latest development in the case ACLU v. Clapper, the Second U.S. Circuit Court of Appeals ruled that the National Security Agency’s program of bulk data collection about U.S. citizens’ telephone calls was unauthorized under Section 215 of the PATRIOT ACT. In other words, the Court found that the NSA exceeded the scope of what Congress authorized. This is, I think it’s safe to say, a pretty big deal.

Now, I’ll admit right up front that I’m not a lawyer. So I’m not really qualified to discuss the legal implications of this ruling. But I am an information scientist, and I’ve taught courses and written a book on metadata. So I think I’m reasonably qualified to discuss the information science implications of this ruling. Of course, these days, information science and the law are so intertwined that it’s difficult to talk about the former without talking about the latter. But I’ll try to stick to what I know.

If you’re interested in the NSA’s program of bulk metadata collection (and you must be or you wouldn’t be reading this), you should at least skim the Court’s ruling. It’s 97 pages long, but I have to tell you, it’s really worth reading. I don’t know about you, but I don’t usually think of court rulings for their shining prose. But I have to give credit where credit is due: the Second Circuit really got some doozies in and, the way I read it, spanked the government pretty hard.

I won’t go into a lot of background on the NSA’s data collection, since you’re probably already familiar with the story. The short version is that Edward Snowden met with journalists from The Guardian in Hong Kong in May of 2013, and handed over a large number of classified documents about the NSA’s surveillance program within the US. Among these documents was a Foreign Intelligence Surveillance Court (FISC) order that directed Verizon “to produce call detail records, every day, on all telephone calls made through its systems or using its services where one or both ends of the call are located in the United States,” and to provide these records daily to the NSA. At the time of Snowden’s meeting in Hong Kong, this order had been in place for years, since at least May 2006.

Needless to say, this was very big news when The Guardian published the story. It led to a lot of discussion about the legality of the NSA’s program, including two entirely divergent court opinions: one which stated that the NSA’s bulk metadata collection program is lawful, and the other that it is not.

The key to understanding this difference of judicial opinion is in understanding what exactly is being collected. The FISC order required Verizon to produce “telephony metadata,” but not, importantly, the content of the call itself.

Recording a call is considered wiretapping, and wiretapping has required a warrant since the 1967 US Supreme Court case Katz v. United States. If the NSA had ordered Verizon to record all calls “made through its systems or using its services,” first of all that would have been a much larger volume of data, being the full audio of millions of calls. But more importantly here, such an order would have to have been backed up with a warrant.

Capturing the metadata about the call, however, is perfectly legal. A pen register is defined in the US legal code as “a device or process which records or decodes dialing, routing, addressing, or signaling information transmitted by an instrument or facility from which a wire or electronic communication is transmitted.” In other words, a pen register collects metadata about phone calls — or any other form of electronic communications, be that telegraph messages, emails, text messages, or anything else. Importantly, pen registers do not collect “the contents of any communication,” which would be considered wiretapping. And by ignoring the contents of the communication, by only collecting the metadata about the communication, pen registers may be deployed without a warrant.

The legality of deploying a pen register without a warrant was established in the 1979 US Supreme Court case Smith v. Maryland. The police had deployed a pen register to record the phone numbers dialed from Smith’s home. In the trial, Smith’s lawyers moved to suppress the evidence from the pen register, but unsuccessfully. The Court “held that a person has no legitimate expectation of privacy in information he voluntarily turns over to third parties” — such as the telephone company. This “third-party doctrine” says that all of the data that you voluntarily turn over to a third party can be collected by law enforcement without a warrant, with no violation of the Fourth Amendment.

This distinction between “the contents of any communication” and the “dialing, routing, addressing, or signaling information” about that communication — the distinction between content and data about that content — is where metadata becomes a legal issue. Metadata is commonly defined as “data about data.” Which is actually not a very good definition (what is data? what does “about” mean?), but it’s good enough for government work, as they say. If the content of a phone call is the voice connection itself, then any data about that connection (the origin and recipient of the call, the duration of the call, the location of both phones if they’re cell phones) is metadata.

The problem with our good-enough definition of metadata comes when you realize that to Verizon, this isn’t metadata, it’s the operational data that enables the entire system to work. Verizon couldn’t bill you if it didn’t capture data about what phone numbers you called and how long those calls lasted. The cell network couldn’t route a call to your phone if it didn’t capture data about what cell tower your phone is closest to. To be fair, the definition of a pen register in the US Code specifically excludes “any device or process used by a provider… for cost accounting or other like purposes in the ordinary course of its business.” So whatever device Verizon uses to capture this data, it’s not called a pen register. But it’s the same data, whether captured by Verizon and called operational data or by law enforcement and called telephony metadata.
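To make that concrete, here is a minimal sketch in Python of a single call record that gets read two ways. Everything in it is my own invention for illustration: the `CallRecord` fields, the `billing_line` and `telephony_metadata` functions, the phone numbers, and the five-cents-a-minute rate. I have no idea what Verizon's systems actually look like. The point is only that the billing view and the surveillance view are built from the very same fields, and that neither one touches the audio of the call.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CallRecord:
    """A hypothetical call detail record; the field names are assumptions."""
    caller: str        # originating number
    callee: str        # receiving number
    start: str         # ISO 8601 timestamp of call start
    duration_s: int    # call length in seconds
    cell_tower: str    # tower that handled the caller's phone


def billing_line(rec: CallRecord, cents_per_minute: int = 5) -> dict:
    """The carrier's view: operational data needed to bill the customer."""
    minutes = -(-rec.duration_s // 60)  # round up to whole minutes
    return {
        "account": rec.caller,
        "minutes": minutes,
        "charge_cents": minutes * cents_per_minute,
    }


def telephony_metadata(rec: CallRecord) -> dict:
    """The investigator's view: the same fields, now called metadata."""
    return asdict(rec)


# Made-up example record; note there is no field for the call's audio.
rec = CallRecord("+1-555-0100", "+1-555-0199", "2013-05-01T09:30:00", 125, "tower-17")
print(billing_line(rec))
print(telephony_metadata(rec))
```

Same record, two names for it, depending on who is holding it.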

In other words, what’s data and what’s metadata depends on where you’re standing. And it only gets worse, since the very same stuff that’s metadata in one context can be used as data in another context. Footnote #1 in the Second Circuit’s ruling refers to the study Unique in the shopping mall: On the reidentifiability of credit card metadata, which shows just how easy it is to infer personally sensitive data from credit card transaction metadata. And this is only one of a recent spate of studies on this very subject, including Unique in the crowd: The privacy bounds of human mobility, and MetaPhone: The sensitivity of telephone metadata. The collective takeaway from these studies is that it’s possible to infer a really remarkable amount of personal information from “only” metadata.
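If you want a feel for how that kind of inference works, here is a toy sketch in Python. It is not the method of any of the studies above, just the general idea as I understand it: the more time-and-place points you know about someone, the fewer people in a dataset could possibly be that person. The three users, their traces, and the `matching_users` function are all made up for illustration.

```python
def matching_users(traces, known_points):
    """Return the users whose trace contains every known (time, place) point."""
    known = set(known_points)
    return sorted(user for user, points in traces.items() if known <= points)


# Hypothetical "anonymized" traces: each point is (time slot, cell tower).
traces = {
    "user_a": {("mon-09", "tower-1"), ("mon-18", "tower-4"), ("tue-09", "tower-1")},
    "user_b": {("mon-09", "tower-1"), ("mon-18", "tower-7"), ("tue-09", "tower-2")},
    "user_c": {("mon-09", "tower-1"), ("mon-18", "tower-4"), ("tue-09", "tower-3")},
}

# Each extra point an observer happens to know narrows the candidates.
one_point = matching_users(traces, [("mon-09", "tower-1")])
two_points = matching_users(traces, [("mon-09", "tower-1"), ("mon-18", "tower-4")])
three_points = matching_users(
    traces,
    [("mon-09", "tower-1"), ("mon-18", "tower-4"), ("tue-09", "tower-1")],
)

print(len(one_point), len(two_points), len(three_points))
```

In this made-up dataset one known point fits everybody, two points fit two people, and three points fit exactly one. No names and no call contents are involved anywhere, only metadata.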

The Second Circuit’s ruling states that “The government argues that § 215 of [the PATRIOT ACT] authorizes the telephone metadata program.” Section 215 of the PATRIOT ACT (“Access to records and other items under the Foreign Intelligence Surveillance Act”) states that the Director of the FBI may apply for a court order “requiring the production of any tangible things (including books, records, papers, documents, and other items) for an investigation.” The word “warrant” appears nowhere in Section 215, so “requiring the production” clearly means upon request, and not upon presentation of a warrant.

To be fair, the third-party doctrine, which is what allows for “production of any tangible things,” makes perfect sense. In order to set up an account with the telephone company, you must provide the telephone company with some personal information: at a minimum, your name and address, and payment information. In order for the telephone network to route your call, you must provide it with a number to route that call to. These pieces of information are necessary for the mere operation of the telephone service. If you want a personalized service, such as your own phone number, you must give up some personal privacy to the organization providing that service. That’s the deal Smith made with the phone company in 1979, and it’s the deal we make today.

The problem is, of course, that we now provide a lot more personal information to a lot more third parties than Smith ever did in 1979. And another problem is that in this era of digital communication, it’s difficult bordering on impossible to say where “the contents of any communication” ends and “dialing, routing, addressing, or signaling information” about that communication begins.

Yet deploying a pen register — or some technologically evolved version of a pen register — to collect that data is still legal without a warrant.

Supreme Court Justice Sonia Sotomayor has written that the third-party doctrine is “ill-suited to the digital age, in which people reveal a great deal of information about themselves to third parties in the course of carrying out mundane tasks.” Sotomayor is only one of many who believe that it is time to revise the third-party doctrine in light of current technology.

And it seems that the Second Circuit’s ruling may be a significant step in that direction. Again, I’m not a lawyer, so I’m not qualified to speculate about what may or may not prove to be precedent-setting. But as an information scientist, it seems to me that the Second Circuit’s ruling changes the legal status of metadata. If the bulk collection of metadata is authorized under Section 215 of the PATRIOT ACT, then metadata has the same legal status as “books, records, papers, documents, and other items” that do not require law enforcement to get a warrant to collect. On the other hand, if the bulk collection of metadata is not authorized under Section 215, then metadata has the same legal status as “the contents of any communication,” and collecting metadata has the same legal implications as wiretapping. Seems to me, as a non-lawyer, that the Second Circuit’s ruling may have set legal precedent, providing metadata with equal status to data.

In other words, what’s data and what’s metadata depends on where you’re standing. The law may be catching up to what information scientists have known all along.