Data are Beautiful
Data’s story in grammar
A datum is a single piece of information. There are two plural forms for datum. The lesser-known form, datums, is used exclusively in surveying and geodesy. The other plural form surrounds us both physically and figuratively. It can be big or small, right or wrong, new or old, dull or interesting. In the end it’s just a long line of 1s and 0s, stored right here or way over there. I’m talking about data.
Because data is a plural noun, it's technically more correct in English to say “data are”. But in the real world, “data is” is fine, especially because data is widely treated as a mass noun. Outside of the real world, there is some debate between “data is” and “data are” (here, here or here, for example). But languages evolve, and I’m cool with that. I don’t even care if you say datas, as long as those datas are good.
According to Google's Ngram Viewer, "data is" and "data are" now occur about equally often, after a strong decline of "data are" in the 80s and 90s. And this data ain't from chatspeak: it's text that was published and, hopefully, edited.
Who is more likely to still use the proper phrasing? It's those pesky British English speakers and writers, in red, who are holding on. (As compared to American English, in blue.)
In all forms of English, an interesting observation stands out: When starting a sentence, the trend reverses. "The data are" and "Data are" are approximately twice as common as "The data is" and "Data is".
I have no idea why, but I'll list any good guesses here:
Idea 1. People tend to be more careful with their words at the beginning of a statement...
Idea 2. ________________________
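If you'd like to test a guess on your own text, here is a minimal Python sketch that counts "data is" versus "data are" and separates sentence-initial uses from the rest. It is not how the Ngram Viewer works; the function name `count_data_verbs`, the naive sentence splitter, and the sample text are all my own assumptions for illustration.

```python
import re
from collections import Counter

# Matches "data is" / "data are", optionally preceded by "the".
PATTERN = re.compile(r"\b(?:the\s+)?data\s+(is|are)\b", re.IGNORECASE)

def count_data_verbs(text):
    """Count "data is" vs "data are", split by sentence-initial or not."""
    counts = Counter()
    # Naive sentence split (an assumption): break after ., ! or ? plus whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for match in PATTERN.finditer(sentence):
            # A match at offset 0 means the phrase opens the sentence.
            position = "start" if match.start() == 0 else "middle"
            counts[(position, match.group(1).lower())] += 1
    return counts

sample = (
    "The data are clear. I think the data is fine. "
    "Data is everywhere! We know data are messy."
)
for key, n in sorted(count_data_verbs(sample).items()):
    print(key, n)
```

The splitter is deliberately crude: abbreviations like "e.g." will cut sentences short, and a sentence that opens with a quotation mark won't count as a "start". For a rough look at a blog archive or a pile of abstracts, that may be good enough.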
Whatever you prefer, we can agree that Data was, were, is, are, and will be Beautiful. (And sometimes ugly.)
Data - it’s still more popular than sex, drugs, and rock & roll.
But only in books.
Across the entire internet,
sex still wins.
Made it this far? Why not say Hi! on Twitter? @philshem