The evolution of languages

Razib Khan
6 min read · Mar 18, 2019


Map of language families of the world today

The story in the Bible about the “Tower of Babel” was the explanation that the ancient Hebrews gave for why there was so much linguistic diversity in the world around them. Ancient people were curious and observant enough to notice that their neighbors did not speak like them. The word “barbarian” comes from the ancient Greek perception of what non-Greeks sounded like to Greeks.

Sometimes linguistic differences can be more subtle, but still a matter of life and death. The term shibboleth comes from a context in which different ancient Israelite groups pronounced the “sh” sound differently, and one group used that difference to identify members of an enemy tribe. The limits of your language are often the limits of your tribe.

But evolutionary genetics tells us humans share a common ancestor. That we are one tribe in our genealogy. In fact, the most recent common ancestors of all human populations lived within the last 200,000 years. Outside of Africa, they lived within the last 50,000 years. And, in North and South America it is within the last 15,000 years. We are a young species.

And yet you have a situation such as in the highlands of New Guinea where people who live in different valleys positioned next to each other speak two totally different languages. In North America, Europeans encountered thousands of languages and many language families. And yet we know that most of the ancestors of the natives of North America arrived within the last 15,000 years!

The situation in the Americas may have been the norm in the recent past. Today 40% of the world’s population speak Indo-European languages, but 6,000 years ago it is likely that very few Europeans or Indians spoke Indo-European languages. The spread of English, Arabic, and Chinese occurred in historical time. Their rise to dominance is due to social and political realities of the last 2,000 years.

The ancient world points to incredible linguistic diversity which faded with the rise of the “empires of the word.” Over four thousand years ago in Mesopotamia, in what is now Iraq, many of the people spoke Semitic dialects related to Arabic and Hebrew. But Sumerian flourished at the time in the south, a language unrelated to any we know of today. In the far north, the people spoke Hurrian, again a language unrelated to any which flourish today. In the mountains to the east there lived the Guti and Kassites, who seem to have spoken languages unrelated to any spoken today as well.

Etruscans spoke a non-Indo-European language, but influenced the Romans

The Romans record the presence of Etruscans, who influenced their culture and spoke a language which was not Indo-European. Further north, there were Ligurians, hugging the coast around modern Genoa, while in the hills there lived tribal Samnites and Oscans. To the south, there were Greek cities and obscure native peoples such as the Sicels. The island of Sardinia was inhabited by speakers of what we now term “Paleo-Sardinian,” perhaps related to Basque. The ancient world was one great Babel.

What this highlights is that while genetic evolution proceeds slowly, gradually, and continuously, linguistic evolution can be riotous and rapid, proliferating at light speed toward unintelligibility.

Just by physical inspection, one can tell that Finns and Swedes share common ancestors. That they are genetically related. But linguistically they are as different as can be. Finnish is no closer to Swedish phylogenetically than it is to Bantu or Chinese! Swedish as a language is most definitely closer to Bengali, Spanish, or ancient Hittite, than it is to Finnish.

Evolution simply describes a change in characteristics which can be defined on a phylogenetic tree. This can be biological, as with genetic evolution, or it can be cultural. But clearly, the mechanisms matter here. Mendel’s laws impose constraints and regularity on biological evolution which culture lacks. Half of your genetic material comes from each parent. There is no such constraint with culture. In fact, your cultural inheritance may come from someone who is not your biological parent.

Whereas genetic evolution can be traced through modern scientific methods to billions of years in the past, elements of cultural evolution shift so fast that most researchers are skeptical of the possibility of going more than ten thousand years in the past. We have a Neanderthal genome, but it is unlikely we will ever be able to reconstruct the Neanderthal languages (there were certainly many!).

The diversity of languages of North and South America illustrates how a small number of people, perhaps a few thousand genetically, can give rise to thousands of languages hundreds of generations later. The diversity we see around us today in the modern era is but a shadow of what was likely the human norm for most of our species’ history. It is as if a massive process of selection has winnowed the languages spoken down to a few huge families.

And yet we can still discern similarities across many languages separated by history and large geographical distances. This is most famously illustrated by the “Indo-European” languages.

The affinities between Indian languages and those of Europe were discerned by Sir William Jones in the 18th century. After the fact, the similarities are clear to native speakers. A focus on core words that were more likely to be preserved gave rise to “Swadesh lists.”

Here is the number “nine” in various languages:

Finnish: yhdeksän, Hungarian: kilenc, Basque: bederatzi, Swedish: nio, Czech: devět, French: neuf, Armenian: inn, Bengali: naẏa, Arabic: tis3a, Turkish: dokuz

Even if you are not a linguist or philologist, peculiar similarities may jump out at you (as well as discordances). This is because a large number of languages in that list are Indo-European, and share a common origin within the last ~5,000 years. Paired with them are nearby languages which are non-Indo-European.
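For readers who like to tinker, here is one rough way to put a number on "similarities may jump out at you." It is only a sketch, and it rests on a big assumption: that edit distance between spellings is a usable stand-in for linguistic similarity. Real historical linguistics works with regular sound correspondences, not letter counts, so treat this as a toy. The function names and the choice of four words (all taken from the list above) are mine, purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep on a match)
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to 0..1 by the longer word's length."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# A hand-picked subset of the words for "nine" from the list above.
NINE = {
    "Swedish": "nio",
    "French": "neuf",
    "Finnish": "yhdeksän",
    "Turkish": "dokuz",
}

if __name__ == "__main__":
    names = list(NINE)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            d = normalized_distance(NINE[x], NINE[y])
            print(f"{x:8s} vs {y:8s}: {d:.2f}")
```

On these four words the Swedish and French forms come out closer to each other than either does to the Finnish form, which matches the family tree. A measure this crude will also miss real cognates whose spellings have drifted apart, and that is the point of the paragraphs that follow.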

It is almost certainly the case that most of those languages above are spoken by people who share ancestors within the last ~50,000 years…but the evolution of vocabulary is fast enough that the signal of shared ancestry is lost much faster than in genetic evolution.

This is why many historical linguists focus on grammar, rather than vocabulary. Going just by a count of the words in its lexicon, you might conclude that English is a Latinate language, like French, Spanish, or Italian. But if you look at grammar, it is clear that English is a Germanic language. Vocabulary is something that is easily shared, and quite protean. Consider how quickly different generations develop their own slang and preferred terms.

Grammar is much more conservative, and non-standard speech is often indicative that someone learned English as an adult, and retained the grammar of the language in which they were raised.

Vocabulary evolves fast and responds to selection. People who live in a forested environment may have many ways to describe types of trees. Those who live on a grassland may not. But grammar is part of the deep structure of any language and is evolutionarily conserved. If Noam Chomsky is correct, all grammar is a local expression of “universal grammar,” which is hardwired into our species on the deepest levels.

And yet all of this fascinating research and knowledge is constrained by the fact that most of the world’s languages are disappearing. This mass extinction is happening due to globalization, trade, and the advantages of speaking an ‘international’ language. Of the world’s 7,000 living languages, nearly half are in danger of going extinct.

With the extinction of a language, a people’s whole memory fades into oblivion, as well as the record of human diversity from which we can make inferences about the power and range of evolutionary processes in culture.
