The Origin of Language

Musings and Speculations

David Rosson
Linguistic Curiosities
Feb 18, 2019

--

Millennia of Silence

November 7, 2012

It may be that internal language (FLN) evolved first, enhancing cognition and thereby fitness; it was then selected, and exapted, for interpersonal use. For millennia, language was the silent imagination in a person’s head.

December 23, 2012

For something to be an adaptation:

  • It must correspond to phenotypically expressed traits;
  • It must result in reproductive advantage.

I can imagine three explanations:

1. Cognitive: the ability to conceptualise and model the world with abstraction and grammar gives the individual an advantage in navigating the natural and social environment.

  • Silent: this explains how the evolution could have been incremental, since even primitive versions of the grammar faculty enhanced cognition.
  • Vocalised: later it evolved to become more “expressed” and outward, and at one point leapt onto symbolic transmission.

2. Sexual: for both “pro-sociality” and “brain on display” (G. Miller).

3. Group: selection by indirect fitness, in two ways:

  • Cooperative: cooperative breeding (“it takes a village”), pro-sociality, “pushing the grey ceiling”, intentional teaching, precursor to culture.
  • Competitive: groups that were better coordinated supplanted the others (how language is deeply intertwined with sentiments and in-group/out-group differentiation).

There are more complex interactions in the co-evolution of genetics and culture, and in hot areas such as “evo-devo-evo”, as well as the task of retracing the phylogenetic history of the lowering of the larynx (at the risk of choking), lateralisation, the emergence of developmental stages, etc.

October 3, 2014

A summary based on Arbib’s talk.

1.1 Extracting meaning: example of visual processing, from edge detection to thematic analysis — feature extraction and contextual probabilities — snapped onto a schema of recognition.

1.2 Central coherence: from features to themes, with flexibility and tolerance for variations and noise => robust reduction.

1.3 Abstract representations: ability to generalise => robust induction.

2. The repertoire of manual operations: “reach -> grip -> retrieve” => a mental store of available options: sequential actions towards proximal and ultimate goals. See: Alstermark et al. (1981).

3. Mirror neurons: registering operations without performing them, i.e. a mental representation of actions/movement/gestures in others.

4. Implications for fitness: imitation, transmission of skills; competitive advantage in anticipating others’ moves; empathy or theory of mind.

5. Ritualisation: the evolution and emergence of bodily signals — the ability to achieve a function (e.g. determine hierarchy) without performing the full sequence of available actions (e.g. fighting to death).

6. Now the picture is almost complete:

  • Linking actions to meaning -> performable actions serving a goal.
  • Registering actions (gestures) -> mirrored recognition.
  • From meaning to gesture -> ritualisation.
  • Robustness in recognition -> allows abstraction.

7. Now the gesture or symbol referring to a meaning or idea can be far removed from the original sequence.

For example, when you pull out your smartphone, you “dial” a number by touching the screen. The gestures with which you communicate with the computer are many steps away from the etymology: there is no dial and you are not really dialling anything, except that you are performing an action signified by such a word.

And that is essentially what lexis allows you to do: represent ideas using abstract symbols that are far removed from the original action sequence or quality or thing or even its associated pantomimes.

But the above only goes to the level of bonobos on lexigrams; that’s only about one third of the story. The second step is to explain how speech is basically “audible gestures”, and how a combinatorial encoding system takes over alongside the expansion of the lexicon (Acredolo & Goodwyn, 1985; Capirci et al., 1996; Butcher, 2000; Iverson & Goldin-Meadow, 2005), where it goes from one-word to one-word-plus (gesture) to two-word utterances. See also:

Anisfeld, M., Rosenberg, E. S., Hoberman, M. J., & Gasparini, D. (1998). Lexical acceleration coincides with the onset of combinatorial speech. First Language, 18(53), 165–184.

Then the third part is explaining the emergence of generative grammar… a rule-based system for planning and executing sequences. Perhaps see:

Fitch, W. T. (2011). The evolution of syntax: an exaptationist perspective. Frontiers in Evolutionary Neuroscience, 3.

[Video]: in slow motion, you can see the cat modifying the “tactical positioning” of its footholds as well as various “action components” with high precision in executing a well-coordinated leap sequence.

[Continued]

Saying “Ahhh” can be just another gesture; it’s no more “removed” or abstract than clapping hands (a gesture that happens to be audible too), only that you are “clapping” your vocal folds to make the sound.

Consider these “units of meaning” that have no sonorant components and seemingly do not conform to how English phonology would define a word:

  • “Psst!”
  • “Pff…”
  • “Tsk tsk…”
  • “Shhh!”

They are closer to “audible gestures” than to lexical items with a re-combinatorial encoding scheme (that is, made up by combining and rearranging phonemes).

This difference (or threshold) between the two is what I alluded to as the “switch” from referential gestures to linguistic phonology. I have two speculations about this:

1. This “phonology module” emerged at some point in the evolutionary course (though the module may be psycholinguistically but not neurologically real, i.e. it’s an interplay of various exapted, rather than de novo, sub-systems, as Arbib would say). And it gave its bearers (our common ancestors) an advantage because of the vastly expanded lexical capacity of a combinatorial system.

2. This “module” matures at some point in the developmental course, roughly corresponding to the sharp inflection point you see in the vocabulary curve. The child would move from controlled gestures and gesture-like utterances to multiple gestures and expanded one-word vocabulary and coordinated word-plus-gesture uses, and eventually to a switch onto a phonologically based model.
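
To make the “vastly expanded lexical capacity” in the first speculation a little more concrete, here is a rough sketch in Python. It is only an illustration under loud assumptions: the inventory size of 10 and the maximum length of 4 are made-up numbers, the function names are hypothetical, and the count ignores any phonotactic constraints a real phonology would impose.

```python
def holistic_capacity(n_signals: int) -> int:
    """Each distinct gesture or call is its own unit of meaning,
    so the number of possible 'words' equals the number of signals."""
    return n_signals


def combinatorial_capacity(n_units: int, max_length: int) -> int:
    """Number of distinct sequences of length 1..max_length that can be
    built from n_units recombinable units (an upper bound: no sequence
    is ruled out as unpronounceable)."""
    return sum(n_units ** k for k in range(1, max_length + 1))


# Illustrative numbers only (assumed, not taken from any data):
# 10 distinct sounds used as whole signals versus as recombinable units.
print(holistic_capacity(10))          # 10
print(combinatorial_capacity(10, 4))  # 10 + 100 + 1000 + 10000 = 11110
```

The same ten sounds yield ten meanings as holistic “audible gestures” but over ten thousand candidate forms once they are recombined, which is the kind of jump the speculation relies on.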

February 26, 2019

Reading list:

MacNeilage, P. F. (1998). The frame/content theory of evolution of speech production. Behavioral and Brain Sciences, 21(4), 499–511.

November 27, 2014

Phylogenetic Components of Language

  • Ritualised reference (lexicon)
  • Combinatorial encoding (phonology)
  • Central coherence (semantics: Gestalt meaning)
  • Procedural coordination (syntax)
