A (Very) Brief History of Natural Language and Its Processing

Paul Kruger
Nov 6 · 3 min read

As Natural Language Processing (NLP) is now ubiquitous in our digitally dominated daily lives, let's take a breezy journey through the history of “natural language” and the efforts to process it scientifically:

~100,000 B.C. The birth of natural human language, or at least the current best estimate of its date. Because the spoken word is ephemeral and elusive (especially in the days before written language and recording devices), no one can be sure exactly when it began. The prevailing scientific opinion is that language emerged with, or soon after, Homo sapiens as a species.

~3200 B.C. Cuneiform, the first known writing system, is used in Mesopotamia.

~700 B.C. The birth of Latin, probably the most influential language in history, especially within the Indo-European family of languages, whose members are spoken today by roughly 46% of the world's population.

~900 A.D. Latin has been replaced by the Romance languages that descended from it and is no longer spoken amongst the general population, rendering it a “dead” language. R.I.P., old friend (Requiesce in pace).

1906 Ferdinand de Saussure, a linguistics professor at the University of Geneva, introduces the concept of “Language as a Science” and begins teaching courses that treat languages as “systems.”

1950 Alan Turing publishes his ground-breaking paper introducing the idea of what would come to be known as the “Turing test” in computer science. Specifically, Turing proposed that if a machine could carry on a conversation with a person over a teleprinter, mimicking a human so well that no differences were noticeable, then the machine should be considered capable of thinking.

1954 Scientists from Georgetown University and IBM collaborate to successfully translate more than sixty sentences from Russian to English using a hand-crafted, rule-based system (not machine learning as we know it today). Presumably filled with an abundance of post-war American optimism, the researchers stated that machine translation would be a solved problem within five years. Even with all the incredible progress made in NLP since then, today I don’t think anyone would consider machine translation “solved.”

1957 Noam Chomsky publishes his book, Syntactic Structures, theorizing that for a computer to understand natural language, a new grammar structure would be necessary. Chomsky created a style of grammar called Phrase-Structure Grammar, which systematically translated natural language sentences into a format usable by computers, with the goal of achieving artificial intelligence.

1980s The decade sees a great leap forward for Machine Learning, coinciding with a shift to statistical models rather than “hand-written” rule-based ones.

2001 The introduction of the first neural “language” model, built on a feed-forward neural network. This type of network pushes data in only one direction, with no cycles or loops, which distinguishes it from recurrent neural networks.
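To make the distinction concrete, here is a minimal sketch of a feed-forward neural language model in the spirit of that 2001 work. All sizes, weights, and the function name `next_word_probs` are illustrative assumptions, not the original architecture's values; the point is simply that data flows strictly input → hidden → output, with no recurrent loop.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 10   # number of word types (toy value)
embed_dim = 4     # size of each word embedding
context = 2       # predict the next word from the previous 2 words
hidden_dim = 8

# Parameters: embedding table, hidden layer, output layer.
E = rng.normal(size=(vocab_size, embed_dim))
W1 = rng.normal(size=(context * embed_dim, hidden_dim))
W2 = rng.normal(size=(hidden_dim, vocab_size))

def next_word_probs(context_ids):
    """One forward pass: data moves in one direction only,
    unlike a recurrent network, which feeds outputs back in."""
    x = E[context_ids].reshape(-1)       # look up and concatenate embeddings
    h = np.tanh(x @ W1)                  # hidden layer
    logits = h @ W2                      # one score per vocabulary word
    exp = np.exp(logits - logits.max())  # softmax over the vocabulary
    return exp / exp.sum()

probs = next_word_probs([3, 7])          # a probability for each of the 10 words
```

With random weights the output is meaningless, of course; training would adjust `E`, `W1`, and `W2` so that real next words receive high probability.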

2011 Apple launches Siri, the world’s first NLP/AI assistant to be widely used by general consumers. Its automated speech recognition module translates the owner’s spoken words into digitally interpretable concepts.
