Twitter and the Oxford English Dictionary

Richard Holden explores the history of the hashtag

Although Twitter (maximum 140 characters) and the Oxford English Dictionary (OED) (over 350 million characters) may seem like strange bedfellows, the former has recently become an integral part of the latter: for the first time, the OED has included individual Twitter posts as part of its quotation evidence.

Twitter as historical evidence

In recent OED updates the Twitter-associated senses of tweet (as a verb and a noun) and the word hashtag have been added. (The separate Oxford Dictionaries site has more, including retweet and tweetup.)

As with all OED entries, the definitions of these new words and senses were based on actual usage: the OED is commonly called descriptive, not prescriptive, because its entries rest on the available evidence of how words are used rather than on an editor’s sense of how they should be used.

Moreover, the OED is historical in nature. Unlike many dictionaries, which record only current meanings, the OED traces the history of a word in English as far back as possible: sometimes more than a thousand years, but for tweet and hashtag less than ten. (The main difference between the OED and Oxford Dictionaries is that the former is historical and the latter current.)

In adding these new words and senses, then, OED editors have tried to find the earliest evidence they can, which, unsurprisingly in this case, came from Twitter itself. So, at the moment, the earliest known evidence for hashtag comes from this post:

I support the hash tag convention: #hashtag #factoryjoe #twitter

— Stowe Boyd (@stoweboyd) August 25, 2007

For the verb tweet, these tweets are currently the earliest evidence for its transitive (with object) and intransitive (no object) senses:

is happy that he’ll be able to tweet more often with Twitterrific. ☺

— Louie (@mantia) January 15, 2007

And, for the noun:

Digital decay

You’ll notice that I was able to link directly to the original for two of the four tweets above; for the other two I have had to use screenshots from the OED’s quotation paragraphs because, as far as I can tell, the Twitter posts have been deleted since the entries were edited.

This exemplifies a significant problem with using web-based sources as evidence: there is no guarantee that what is online one day will still be online the next (hence the concept of link rot).

This is why the quotations above are marked ‘OED Archive’: antiquated as it may seem, we still store printouts of all such quotations to make sure that the evidence on which the OED is built is not lost over time. (Projects such as the Library of Congress’s Twitter collection may at some point obviate this issue for Twitter.)

Despite this problem, in some ways Twitter posts are a good potential source of evidence for the OED. Unlike other sources on the web, tweets cannot be edited after they are posted, and they are also permanently timestamped: for a historical dictionary, it is important to know reliably that a particular usage occurred at a specific time.

With other websites, blogs, etc., dating can be more of a problem. Publication dates are often unavailable or unverifiable, and edits can conceivably be made silently at any time, which does not make for reliable evidence. (Archival sites like the Wayback Machine are useful, but not a complete solution.)

Young media

What is also notable about the quotations above is that they are very self-referential: they are evidence for terms relating to Twitter, taken from Twitter itself. However, in OED terms Twitter is still a very young medium, having launched only in 2006, so it seems likely that as time goes on more early uses of words will be found on the site, independent of the site itself.

Twitter is by no means the only digital source of OED quotations. Usenet newsgroups provide the earliest evidence for such well-known terms as email address, computer geek, and bum cleavage, and quotations from blogs and other web pages are the first for entries such as blog and selfie.

There is even an instance of an email forming the earliest evidence for something: the first quotation at lashed (meaning ‘drunk’) is from a 1996 email sent by one OED editor to another, pointing out this new sense they’ve heard is popular with students.

The informal or spontaneous language used by people on Twitter, and online generally, is something very different from the more formal, mediated version of English typically used in books and newspapers: as such, it is a welcome addition to the OED.

For more language articles by the Oxford Dictionaries team, please visit the OxfordWords blog.