The following is a brief, non-scientific and very eclectic outline of what I consider the most promising ideas in the evolution of language. It is rather like a hypothetical ride in a time machine, taking you further and further back in language evolution until you reach the beginnings of human language. Dedicated to my wife and son.
The creative ape: Homo sapiens – novelty being sexy 90ka ago
The field of language evolution, or “evolutionary linguistics”, is still largely speculative and marked by many controversies. Some twenty years ago, when I started getting interested in language evolution, the topic was mostly considered taboo for serious linguists. The most common assumption in academia was that language was hardly older than 50,000 years, as it was considered the driving force behind the Upper Palaeolithic Revolution (the proliferation of new technologies, art and ornaments). Cave paintings in particular were taken to represent symbolic communication tied to symbolic language (even though the resemblance between painting and language is anything but strong).
What is more, language was not considered to have been shaped by natural selection: natural selection was not considered “powerful” enough, and language was regarded as a by-product or “spandrel” of a large brain or of complex thought processes. In particular, the position of Noam Chomsky, the leading linguist at that time, was rather puzzling to biologists: he postulated an innate language acquisition device which had, however, not been brought about by natural selection. As any biologist knows, simple features can arise purely by chance (e.g. through genetic drift), whereas complex ones must have a strong driving force behind them, generally known as “natural selection”, as Pinker and Bloom pointed out in Natural language and natural selection (1990).
One problem with the 50,000-year story is that, since human language is universal, it must be at least as old as the most recent common ancestor of all humans, currently estimated at 60ka-90ka (“Y-chromosomal Adam”). Despite new findings, such as the recent discovery in genetic research that Neanderthals shared with humans the FOXP2 gene, the only gene so far identified as being involved in human speech, many scholars cling to a version of the 50,000-year story in which full-blown human language does not appear until 90,000 years ago at the earliest.
Indeed, there is good reason to believe that the extreme complexity of human language is a rather recent phenomenon. It has a lot in common with the “peacock’s tail” in being “superfluous plumage” (e.g. the existence of up to dozens of synonyms for a single word). This indicates that, once it had come into existence, human language was shaped to a considerable degree by sexual selection rather than natural selection, in particular its more creative and entertaining features; cf. Geoffrey Miller: The Mating Mind (2001).
The entertainment ape: Neanderthals – singers and jokers 200ka ago
Pinpointing language evolution to 90ka is, however, no less problematic than pinpointing it to 50ka. Anatomically modern humans evolved around 200,000 years ago in Africa, and it seems somewhat strange that they should have waited more than 100,000 years to start speaking. Furthermore, there is the above-mentioned recent discovery that Neanderthals already had the same version of the FOXP2 gene as humans. This does not necessarily mean that they had full-blown human language, but it definitely suggests that they had linguistic abilities. How far developed those were is, of course, a matter of speculation. Philip Lieberman suggested as early as the 1980s that Neanderthals had reduced speech abilities due to a supposedly high position of their larynx. This idea, however, has not stood the test of time. One hypothesis that is still around, however, is that the demise of the Neanderthals was due to superior human language abilities.
Using a time machine and going back 100ka, you would most probably find Neanderthals sitting around a campfire, cooking and eating meat, chatting about the latest gossip, as well as joking and singing for entertainment, not too different from modern Homo sapiens. There is some archaeological evidence that Neanderthals even made musical instruments such as flutes and drums. Steven Mithen proposes in The Singing Neanderthals (2005) that Neanderthals used music as a primitive form of communication. It is, however, more likely that they had both music and language, with music mostly in the “entertainment function” it has in humans. Assuming that Homo neanderthalensis shared at least moderate linguistic skills with humans, one has to push human language back to our most recent common ancestor, which according to current genetic estimates lived around 800ka. One then risks very little in assuming that a primitive protolanguage was already in place in Homo erectus by 1 million years ago.
The gossiping ape: Homo erectus chatting away in the savannah 1ma ago
Pushing back human language from 50ka to 1ma might seem a rather bold and unfounded step. As language is a cognitive function, we might expect it to correlate with brain size. In this respect Homo erectus (~900cc cranial capacity) is much closer to Homo sapiens (~1400cc) than to the australopithecines, which are very close to chimpanzees (~450cc). Even adjusting for body size, we can hypothesize that the cognitive capacity of Homo erectus went far beyond that of the australopithecines.
However, in evolutionary reasoning the most important factors are the so-called “selective pressures”. Without selective pressures there is simply no sense in speculating about language origins. It is a common mistake to assume that human language evolved simply because it is useful. On the contrary, Krebs and Dawkins showed in “Animal signals: mind-reading and manipulation” (1984) that, in order to arise, communication has to benefit the sender and not the receiver of information. Otherwise communication cannot be an evolutionarily stable strategy. This serious constraint on the evolution of communication, rather than the lack of complex nervous systems, is probably the reason for the absence of complex communication systems in the animal world. Bees and ants, despite having rather primitive nervous systems, can overcome this constraint because they are very closely related to each other and hyper-social.
One of the most common assumptions in evolutionary linguistics has been that the need for hunting and foraging represented the most important selective pressure in language evolution, given the “obvious” usefulness of communication for these pursuits (as in social insects). However, no social hunting animals make use of complex communication when hunting, and they get by perfectly well without it (e.g. wolves). During the last 20 years attention has shifted more and more towards human sociability in language evolution. Robin Dunbar discusses the correlation between group size and brain size in primates in “Neocortex size as a constraint on group size in primates” (1992) and extrapolates a group size of about 150 individuals for human beings. His idea is that at some stage in human evolution language must have superseded grooming as the main means of socialization, as grooming becomes a less and less economical way of making friends when groups grow. With language it became possible to groom more than one individual at a time. Given the enormous brain size of Homo erectus, it does not seem unlikely that this primate engaged in something which manifests itself in modern humans as sipping coffee and gossiping with friends (the so-called “gossiping hypothesis”).
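Why one-to-one grooming becomes uneconomical as groups grow can be illustrated with some simple arithmetic. The sketch below is only an illustration, not Dunbar's own model: it counts how many potential partners each individual has and how many distinct pairs exist in a group. The function names are made up for this example, and the group size of 50 is an arbitrary smaller comparison value; only the figure of 150 comes from the text above.

```python
def potential_partners(group_size: int) -> int:
    """Number of other individuals one member could groom one-to-one."""
    return group_size - 1


def pairwise_relationships(group_size: int) -> int:
    """Number of distinct pairs in the group: n * (n - 1) / 2."""
    return group_size * (group_size - 1) // 2


# 50 is an illustrative smaller group; 150 is the human group size
# Dunbar extrapolates in the text above.
for n in (50, 150):
    print(f"group of {n}: {potential_partners(n)} partners per individual, "
          f"{pairwise_relationships(n)} distinct pairs")
```

Tripling the group size roughly triples the number of partners each individual has to keep up with, while the number of pairs in the group as a whole grows about ninefold. Grooming is strictly one-to-one, so it cannot keep pace; talking to several listeners at once can.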
Amusing and odd as it may sound, Dunbar’s hypothesis provides one of the very few evolutionarily sound selective pressures proposed so far in the evolution of language. Working out the details may still be a long way off. A rough sketch might look something like this:
About 2.5ma ago the Earth’s climate became drier and humans found themselves in a savannah habitat. This habitat often fosters big groups for increased social protection, as the absence of trees etc. leaves the individual more exposed to predators. Thus, savannah-living animals usually live in larger groups than their relatives (e.g. baboons, which live in groups of 50+ individuals, considerably larger than those of chimpanzees or gorillas).
Living in large groups has disadvantages as well. Each individual has to make sure not to end up at the bottom of the hierarchy, and large groups are therefore a strong selective pressure for brain size and intelligence. Animals in large groups moreover tend to be the most communicative ones. Once a critical group size was passed, grooming would no longer have been a viable means of communication between friends.
Hominids were driven to increasingly larger groups, requiring larger brains and more complex communicative skills. Larger brains in turn made earlier births necessary, which prolonged infant dependency on parenting and probably led to monogamous pair bonding for the sake of the offspring’s survival. Monogamous pair bonding in large groups is rather unique to humans and evolutionarily rather unstable, and it probably required further intelligence and communicative skills, as anthropologist Terrence Deacon noted in The Symbolic Species (1997). Monogamous pair bonding, thus, might have had a catalytic effect on language evolution.
The singing ape: Australopithecus – lullabies on the beach 4 ma ago
This might seem the end of the story. Speculating beyond 2.5ma certainly must be meaningless, as we will probably never know much about that period with any degree of certainty. What is more, Australopithecus does not at first glance seem to be a promising candidate for any kind of novel complex form of communication, as its brain size is very much the same as that of chimpanzees. The incipient stages of language evolution might just not go further back than 2.5ma.
There are several interesting hypotheses about the incipient stages of language, the two most prominent being the “gestural” and the “musical” hypotheses. The gestural hypothesis has had rather strong support from neuroscience since the discovery of “mirror neurons” for language production and has probably become the favored one among researchers. The musical hypothesis was first proposed by Charles Darwin himself in The Descent of Man. It is based on the observation that language and music are intertwined (intonation, tones, paralinguistic vocalizations) and has had a couple of revivals since then. One of these is by Dean Falk, who to my knowledge is the only one who also provides selective pressures, in Finding Our Tongues: Mothers, Infants, and the Origins of Language (2009). Starting from the observation that baby-directed talk (motherese) resembles singing more than talking, as well as from the universality of lullabies in human culture, she hypothesizes that singing or humming was an important means of communication (staying in touch) between mothers and infants once humans had lost their hair. Chimpanzee babies cling to the mother’s fur and stay with her all the time. This is not possible for human babies. So mothers would have indicated their presence with reassuring sounds when they put them down. Babies who wake their parents with their crying (asking: are you here to protect me?) typically fall asleep again quickly upon hearing their parents’ soft, humming voice. Falk’s hypothesis invokes one of evolutionary theory’s strongest selective forces: kin selection, which has contributed strongly e.g. to penguin vocalizations.
If Falk’s hypothesis is correct, a lot hinges on the timing of human hair loss. Falk herself, like the majority of anthropologists, assumes that this did not happen before the genus Homo (Homo habilis). The most common explanation for human hair loss has been thermoregulation in a savannah environment (even though no other animals have ever lost their hair in the savannah because of heat). In recent years it has become clear that the environment of early hominids was anything but savannah. It was most likely flooded rainforest. This discovery lends support to one of the most discredited hypotheses in evolution: Elaine Morgan’s aquatic ape hypothesis. According to Morgan, many of humans’ exceptional features can be explained by a semi-aquatic environment, for instance erect gait by wading through water. More importantly here, Morgan associates hairlessness with water living (compare elephants with mammoths, or water buffalos with buffalos), which means that selective pressures on mother-infant communication would have started to work as early as 5ma ago. Australopithecines surprised anthropologists once, when it was discovered that they were bipedal; I would not be surprised if they did so again and turned out to be hairless as well as musical. One interesting detail might be noted here: Morgan also links voluntary breathing with an aquatic environment, and with it the subsequent step of voluntary voice control. Proto-music might have been only a tiny step away.
In conclusion I summarize the selective forces for the individual stages of language evolution.
Kin selection > mother infant communication (proto-musical): Australopithecines (or earlier!)
Social selection > social grooming function (gestural, vocal, verbal): Homo (habilis?, erectus)
Sexual selection > being selected by an attractive mate (hyper-verbal): Homo sapiens
Andreas Hofer (23.01.2011)