The Pleasure Instinct: Why We Crave Adventure, Chocolate, Pheromones, and Music - Gene Wallenstein (2008)

Part II. The Pleasures of the Sensory World

Chapter 7. The Evolution of the Lullaby

Is it not strange that sheep’s guts should hale souls out of men’s bodies?

—William Shakespeare, Much Ado about Nothing

As neither the enjoyment nor the capacity of producing musical notes are faculties of the least use to man in reference to daily habits of life, they must be ranked among the most mysterious with which he is endowed.

—Charles Darwin, The Descent of Man

Several years ago I volunteered at a behavioral clinic and worked with a group of fourteen adolescents who were diagnosed with attention-deficit-hyperactivity disorder (ADHD). It was an amazing thing to see such large differences in symptoms among teens with the same diagnosis and even within the same individual from day to day. We had our share of students who were chock full of what most would consider normal childhood energy, along with others who were clearly different in that their activity levels seemed unending in practically every context.

At our clinical staff meeting one Monday morning, a new intern raised the possibility of adding music therapy to our group sessions. While some of us, I’m sure, were picturing a guitar and perhaps a few harmonicas, she went on to tell us about her friend who teaches African drumming. She argued passionately that kids with ADHD respond well to drum sessions because they promote group cooperation and turn-taking. Within two weeks, we had fifteen beautiful instruments—ten kpanlogos, with their warm, earthy bass tones, and five djembes, offering a high, snappy timbre.

The drums came complete with a colorful music therapist fresh out of UC Berkeley, who visited us every Thursday afternoon. Joachin began a typical drum session by gathering all the students into a circle and starting a very simple rhythm consisting of one beat sounded roughly every second—a “heartbeat.” In the first few weeks, just getting all fourteen students to sit in their chairs at the same time was a genuine accomplishment, but by the end of the first month they began to look forward to each session, and the changes in our little community were palpable. At some point during the fifth session, I remember feeling a deep sense of pride at how fast our circle had joined into a synchronized beat that particular day.

Joachin gradually introduced more complicated rhythms that were to be played on top of the heartbeat. Each student had a rhythm to maintain in concert with the entire circle—some had to play menjani, while others had to play aconcon or something else, and this varied from session to session. Above all, there was no room for solo performances in the circle, and whenever the group’s overall rhythm broke down, we would start over amid sighs of frustration. Toward the end of the third month a noticeable improvement in our collective sound became obvious. The sessions began to take on a true communal feel, and at times the group’s two or three main rhythms were so synchronized that the sound became almost hypnotic. During these periods I often lost myself in the moment, imagining the millions of other drum circles that have played across time and culture—humans of all kinds joining and celebrating nature’s periodicity. One minute I’m drumming with students in a therapeutic residence, the next I’m part of a tribal clan of early hominids living along the Rift Valley. Perhaps we’re drumming in preparation for a hunt or to celebrate the arrival of a newborn or a marriage.

Ancient drums have been discovered in almost every part of the world. Their earliest appearance in the archaeological record dates back to about 6000 B.C., excavated from Neolithic Era sites in northern Africa, the Middle East, and South America. Ceremonial drums have been found in these regions, along with wall markings depicting their use in various aspects of social and religious life. Other percussive and even flute-like instruments have been unearthed at Homo sapiens sites throughout Europe and Asia dating as far back as a hundred thousand years. And the music wasn’t limited to modern humans—it appears that Neanderthals made music as well. Archaeologists excavating a cave near Idrija in northwestern Slovenia recently found a bear’s polished thighbone with four artificial holes drilled into it that were aligned in a straight line on one side. Although we don’t have any way of knowing if this sixty-thousand-year-old object was ever used to make sounds or even music, very similar “bone flutes” have been discovered at Homo sapiens sites and estimated to be forty thousand to eighty thousand years old.

The rule is very simple in modern cultures across the world and throughout all of recorded history: wherever there are humans, there is music. No recorded human culture—whether extinct or extant—has ever been without music production. Although what passed for a melody in ancient China undoubtedly differs from, say, what a twenty-first-century European might find entertaining, all humans have a faculty for producing and enjoying music. Indeed, given the omnipresence of music production and enjoyment across human civilizations, some researchers consider musicality to be an evolutionary adaptation, perhaps akin to language. But unlike language, which is used to communicate our thoughts to others, music has no clear-cut survival or reproductive consequences. So the question remains: What is the adaptive function of music?

There are generally three schools of thought on the origins of music. The first group views music as an interesting, albeit evolutionarily irrelevant artifact of our sophisticated brains—a form of “auditory cheesecake.” The basic idea is that humans evolved a set of sophisticated cognitive, motor, and perceptual skills that have clear survival and/or reproductive value, and the expression of these skills led naturally to the emergence of other by-product abilities, such as art appreciation and musicality. The cheesecake view is overwhelmingly the most popular in mainstream psychological thinking today.

A second view is that music has real survival value and has been forged by the same principles of natural selection that have shaped other cognitive abilities such as binocularity, color vision, sound localization, and so forth. A wide range of suggestions for the function of music has been made, most having to do with its ability to bond the social group through coordinating action and ritual. Undoubtedly, music can have a profound influence on the behaviors and emotions of large groups of people—if you need to be convinced, simply visit a local nightclub or ballpark. This view, however, has its problems because it depends on the rather untenable position of invoking group selection to account for the evolution of musicality—a mechanism that has never proven convincing to scholars of mammalian evolution.

Finally, a third school of thought argues that music evolved primarily through mechanisms of sexual selection rather than natural selection. The chief difference between the two, of course, is that natural selection fosters adaptations that increase an organism’s likelihood of survival, while sexual selection fosters adaptations that increase the likelihood of successful mating and reproduction. Both mechanisms impact the ultimate scorecard of evolution—how well an organism passes on its genes—yet the adaptations that emerge can often be at odds with one another. For instance, the size and color vibrancy of the peacock’s tail is an important variable in reproductive success. Peahens are attracted to males with the largest and most colorful displays. At the same time, however, this conspicuousness puts “handsome” peacocks at a survival disadvantage from a natural selection viewpoint because they are easier to spot by predators, and vibrant, cumbersome tails make them less able to evade an attack. Indeed, adaptations driven by sexual selection often emerge because they handicap an organism’s survival in some way that makes it easier to assess its true fitness. In this example, the fitness cost of having a large and colorful tail makes the peacock an easy target. Those males who have the most conspicuous tails are truly the fittest—the thinking goes—because they can afford both the metabolic cost of growing a large tail and the survival cost associated with attracting the attention of predators. In this context, music is seen as just another ornate animal display designed to get the attention of the opposite sex. The proponents of this view argue that music production is a reliable fitness indicator because it signals an ability to maintain a high degree of skill at the cost of diverting energy, attention, and time away from basic survival behaviors.

Each of these perspectives on the origins of human musicality is based on distinct mechanisms and therefore has unique implications for our relationship with music and why we find it pleasurable. In this chapter I will offer a fourth perspective: that our attraction to music results from a developmental requirement that we experience distinct classes of auditory stimulation for normal brain growth and maturation throughout life but particularly during the first two decades. We will find that there are innate constraints on musical sensitivity that transcend cultural differences and provide a core set of features common to all styles and genres. These features have a great deal in common with the singsong of motherese and will offer clues as to why we find pleasure in music as well as many other types of acoustic experiences.

A Universal Grammar

Music is said to be the universal language, but exactly what properties, if any, can be found that transcend culture, geography, and time? Most people prefer the musical genre they grew up with, and even the most casual observer must concede that there is tremendous variation in style from generation to generation. Clearly, learning plays a large role in shaping the specific musical idioms we prefer. Research throughout the past decade, however, has begun to show that certain sounds and note combinations have a virtually universal effect on the emotions of listeners, independent of the culture in which they were born, raised, and live. Moreover, most neurologically normal listeners, no matter where they are from, can agree on what is and is not musical, even when the sequence of tones is novel or drawn from a foreign scale. This has led some theorists to focus on the similarities between music and language development when speculating on the origins of musicality.

Decades ago, the linguist Noam Chomsky set out to understand why all normal children spontaneously speak and understand complex language. He pointed out that all mature speakers of a language can generate and interpret an infinite number of sentences, despite great variation in their levels of formal education. Moreover, in any given language, most native speakers can agree on whether a sentence seems grammatical. Since most speakers have these abilities despite varying levels of formal linguistic training, Chomsky argued that we are all born with an innate knowledge of language. The instinctual set of rules we unconsciously use to make grammatical judgments as well as to produce and interpret sentences is called the universal grammar. Chomsky argued that linguistic development involves the fine-tuning of this grammar toward settings appropriate to the indigenous language.

The composer Leonard Bernstein was the first to apply Chomsky’s ideas about language to music. He suggested that all the world’s musical idioms conform to a universal musical grammar. This theory was advanced more formally through the work of psychologist Ray Jackendoff and musicologist Fred Lerdahl. They viewed music as being built from a hierarchy of mental structures, all superimposed on the same sequence of notes and derived from a common set of rules. The discrete notes are the building blocks of a piece and differ in how stable they feel to a listener. Notes that are unstable induce a feeling of tension, while those that are stable create a sensation of finality or being settled. Musical styles differ in the emphasis placed on beat interval and pitch, but most genres use notes of fixed pitch.

Pitch is related to the frequency of the sound wave’s vibration that is emitted by an instrument, but is perceived musically relative to other notes and the interval separating them rather than in any absolute sense. When a guitar string is plucked it vibrates at several frequencies at once: a dominant frequency called the fundamental and integer multiples known as harmonics, which add fullness and timbre. For example, a note with a vibration of 64 times per second will have overtones at 128 cycles per second, 192 cycles per second, 256 cycles per second, and so forth. The lowest frequency—which is often the loudest—determines the pitch we hear. In this example, the fundamental frequency is 64 cycles per second and corresponds to the second C below middle C.
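The overtone arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `harmonic_series` is invented for the example, and it assumes an idealized string whose overtones are exact integer multiples of the fundamental.

```python
def harmonic_series(fundamental_hz, count):
    """Return the fundamental followed by its integer-multiple harmonics.

    Assumes an idealized string: real instruments deviate slightly
    from exact integer multiples.
    """
    return [fundamental_hz * n for n in range(1, count + 1)]


# The 64-cycles-per-second example from the text.
print(harmonic_series(64, 4))  # prints [64, 128, 192, 256]
```

The first entry in the list is the fundamental, which sets the pitch we hear; the remaining entries are the overtones that color the sound.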

When a sound wave vibrates faster, say at a fundamental frequency of 128 cycles per second, we perceive the tone as being higher. Since the fundamental frequency of this new tone at 128 cycles per second is related to the other tone at 64 cycles per second by an integer multiple (128 = 64 × 2), it will sound higher but with the same pitch quality (the C one octave below middle C). The interval that separates our two example tones at 64 and 128 cycles per second is called an octave. All primates perceive tones separated by an octave as having the same pitch quality. The pentatonic scale, common to most musical idioms across the globe, is built from five distinct pitches within an octave. Throw in two additional pitches per octave and you have the seven-tone diatonic scale that forms the foundation of all Western music, from Beethoven to the Beatles.
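The octave relationship described here amounts to repeated doubling, which a short Python sketch can make concrete. The helper names `octaves_apart` and `same_pitch_quality` are hypothetical, and treating "same pitch quality" as "separated by a whole number of octaves" is a simplifying assumption made for illustration, not a model of how hearing actually works.

```python
import math


def octaves_apart(f1_hz, f2_hz):
    """Number of octaves (doublings) from f1_hz up to f2_hz."""
    return math.log2(f2_hz / f1_hz)


def same_pitch_quality(f1_hz, f2_hz, tolerance=1e-9):
    """True when the two tones differ by a whole number of octaves.

    Simplifying assumption: 'same pitch quality' is modeled as an
    exact power-of-two frequency ratio.
    """
    n = octaves_apart(f1_hz, f2_hz)
    return abs(n - round(n)) < tolerance


print(octaves_apart(64, 128))        # prints 1.0
print(same_pitch_quality(64, 256))   # prints True (two octaves up)
print(same_pitch_quality(64, 192))   # prints False (ratio of 3 is not a doubling)
```

Note that 192 cycles per second is a harmonic of 64 but is not an octave above it, which is why the last check fails.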

Music is governed by a relatively small set of rules—like language—that can be used to generate an infinite variety of compositions. Music also employs recursion. In the same way that a sentence can be lengthened indefinitely by adding modifiers or additional words, so can a musical piece by inserting new or repeating phrasing. And just as language emerges naturally in children without a need for formal linguistic training, so too does music. Indeed, the only requirement for the development of musicality in babies is exposure to music.

As we have seen in earlier chapters, human newborns are far from being blank slates. With regard to the sensations of touch, motion, smell, and taste, they have clear preferences for certain stimulation patterns that are optimally tuned for regulating brain growth and development. The same is true for hearing. Newborns are attracted to music from birth and are sensitive to acoustic properties that are common to all music systems across cultures. By the time an infant is two months old, it will have roughly the same ability to distinguish pitch and timing differences in musical structure as that of listeners with decades of exposure to music. From the very beginning of life, newborns are attracted to specific features of music that are also preferred by adults the world over.

Babies as young as four months old show a stable preference for music containing consonant rather than dissonant intervals (an interval is a sequence of two tones). They also discriminate two melodies apart more easily if both have a consonant interval structure rather than a dissonant structure. Consonant intervals are those where the pitches (the fundamental frequency) of the constituent tones are related by small integer ratios. For example, intervals such as the “perfect fifth,” with a pitch difference of seven semitones, or the “perfect fourth,” with a pitch difference of five semitones, have very small integer ratios of 3:2 and 4:3, respectively. Adult listeners from all cultures find these intervals pleasant-sounding, and babies love them. Both adults and four-month-olds prefer these consonant intervals to dissonant intervals such as the tritone, with a pitch difference of six semitones and a large pitch ratio of 45:32. Infants listen contentedly to melodies composed of consonant intervals but show signs of distress when some of the intervals are replaced by dissonant intervals. This effect has been observed in many cultures and in infants with varying levels of music exposure. Hence it appears to result from an innate predisposition toward certain acoustic features that are pleasurable and indeed seem to be shared by most systems of music.
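The link between semitone counts and frequency ratios can be sketched in Python. Two assumptions go beyond the text: semitones are taken to be the equal-tempered kind, where each semitone multiplies frequency by the twelfth root of 2, and `ratio_complexity` (numerator plus denominator) is a crude, made-up stand-in for how "small" an integer ratio is, not an established measure of consonance.

```python
from fractions import Fraction

# Semitone counts and integer ratios as given in the text.
INTERVALS = {
    "perfect fourth": (5, Fraction(4, 3)),
    "tritone": (6, Fraction(45, 32)),
    "perfect fifth": (7, Fraction(3, 2)),
}


def equal_tempered_ratio(semitones):
    """Frequency ratio for a number of equal-tempered semitones (assumed tuning)."""
    return 2 ** (semitones / 12)


def ratio_complexity(ratio):
    """Hypothetical proxy for ratio size: numerator plus denominator."""
    return ratio.numerator + ratio.denominator


for name, (semitones, ratio) in INTERVALS.items():
    approx = equal_tempered_ratio(semitones)
    print(f"{name}: {ratio} = {float(ratio):.4f}, "
          f"{semitones} semitones = {approx:.4f}, "
          f"complexity {ratio_complexity(ratio)}")
```

Under this proxy the fifth (3:2) and fourth (4:3) score 5 and 7, while the tritone (45:32) scores 77, which mirrors the consonant versus dissonant split described above.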

Another interesting feature of auditory processing that is present in infants is their ability to detect transpositions of diatonic melodies across pitch and tempo. Both infants and adults can recognize a tune based on the diatonic scale as the same when it is transposed across pitch, but fail to do so when the melody comes from a nondiatonic scale. Since all primates perceive tones separated by an octave as having the same pitch quality, one might predict that the ability to detect transposition of diatonic melodies is also present in our hairier cousins. To date, this experiment has only been performed with rhesus monkeys and, as expected, they exhibit the same effect as human adults and infants. The presence of similar auditory preferences and perceptual abilities among adult listeners and infants from different cultures suggests that certain features that are critical components of music competence exist at birth.


That certain auditory biases exist at birth is probably not news to parents. Even those who are uninitiated to this phenomenon learn quickly that their prelinguistic newborn is a capable communicator. Infants communicate with emotional expression, and parents use this to gauge what their child needs. Few stimuli calm an infant and get their attention more effectively than the lullaby sung in a soothing voice. As we saw in earlier chapters, infants recognize their mother’s voice from birth, and are calmed when they hear it. Experiments have shown that newborns and infants are highly sensitive to the prosodic cues of speech, which tend to convey the emotional tone of the message. Prosody is exaggerated even more so in the typical singsong style of motherese that dominates parent-infant dialogue during the first year of life. The infant trains its parents in motherese by responding positively to certain acoustic features they provide over others. Motherese and lullabies have so many acoustic properties in common—such as simple pitch contours, broad pitch range, and syllable repetition—that theorists have argued them to be of the same music genre.

Just as motherese shows up with the same acoustic properties in virtually every culture, so too does the lullaby. Practically everyone agrees on what is and is not a lullaby. Naive listeners can distinguish foreign lullabies from nonlullabies that stem from the same culture of origin and use the same tempo. Of course, infants make the distinction quite readily. Even neonates prefer the lullaby rendition of a song to the nonlullaby rendition performed by the same singer. Although it is tempting to attribute such preferences to experience, studies have shown that hearing infants raised by deaf parents who communicate only by sign language show comparable biases. It appears, then, that from our very first breath, we carry a set of inborn predispositions that make us seek out specific auditory stimuli. These stimuli are common across cultures and appear in many forms of music but are exemplified in the lullaby. Why should this be the case? One might argue that these acoustic features help foster mother-infant communication, but this just passes the question along without really answering it. Why do these specific acoustic properties show up in motherese and the lullaby? They arise because the infant trains his or her parents to provide these stimuli through feedback in the form of emotional expressions of approval and calm. The real question is why these types of auditory experiences pacify and bring pleasure to infants (and adults).

In April 2003, scientists from the University of California at San Francisco discovered that newborn rats fail to develop a normal auditory cortex when reared in an environment that consists of continuous white noise. The hallmark of white noise is that it has no structured sound—every sound wave frequency is represented equally. Neurobiologist Michael Merzenich and his student Edward Chang wanted to understand how the environmental noise that we experience every day influences the development of hearing disorders in children. They speculated that perhaps the increase in noise in urban centers over the past several decades might be responsible for the concomitant increase in language impairment and auditory developmental disorders observed in children over the same period.

Their experiment began by raising rats in an environment of continuous white noise that was loud enough to mask any other sound sources, but not loud enough to produce any peripheral damage to the rats’ ears or auditory nerves. After several months, the scientists tested how well the auditory cortex of the rats responded to a variety of sounds. They found significant structural and physiological abnormalities in the auditory cortex of the noise-reared rats when compared to rats raised in a normal acoustic environment. Interestingly, the abnormalities persisted long after the experiment ended, but when the noise-reared rats were later exposed to repetitious and highly structured sounds—such as music—their auditory cortex rewired and they regained most of the anatomical and physiological markers that were observed in normal rats.

This finding created a wave of excitement throughout the scientific community because it clearly showed the importance of experience in influencing normal brain development. The developing auditory cortex of all mammals is an experience-expectant organ, requiring specific acoustic experiences to ensure that it is wired properly. As Chang summarized, “It’s like the brain is waiting for some clearly patterned sounds in order to continue its development. And when it finally gets them, it is heavily influenced by them, even when the animal is physically older.”

The auditory cortex of rats and humans—indeed, all mammals—progresses through a very specific set of timed developmental changes. As we have seen in the other sensory systems, this development depends on genes to program the overall structure, but requires the organism to experience environmentally relevant stimuli at specific times to fine-tune the system and trigger the continued developmental progression. Genes don’t just magically turn on. In most cases they wait for an internal or environmental promoter to trigger their expression. And the details of development are not in the genes but rather in the patterns of gene expression.

The primate auditory system develops a bit differently from the sensory systems of touch, smell, and taste that we have considered thus far. The peripheral anatomical structures of the auditory system begin to form very early in development, yet the system matures rather slowly as a whole. For instance, by the time Kai had been in Melissa’s womb for about four weeks, he already had the beginnings of ears on either side of his embryonic head. Cells were also forming in what will become his cochleae, the shell-shaped organs in each ear that transduce acoustic sound waves into electrical impulses that the brain uses to communicate. By about the twenty-fifth week of gestation, Kai had most auditory brain-stem nuclei in place that will be used to process features of acoustic information such as sound localization and pitch discrimination. But these cells depend on stimulation for continued growth, maturation, and the ability to form synaptic connections with their higher cortical target sites.

It is probably no surprise to readers by now that it is precisely at this time—when the brain most needs auditory stimulation—that fetuses begin to hear their first sounds. We know this for two reasons. First, it is at this age that fetuses first show signs of what is called an auditory-evoked potential. Preterm babies are given a battery of tests. Among these is a painless test that involves placing a small headphone over their ears and attaching three electrodes to their scalp to measure their brain’s response to auditory stimuli. When a brief clicking noise is played, preterm babies younger than about twenty-seven weeks show little or no electrical response following the stimulus—their brain is not mature enough to register the sound, and they show no sign of hearing it. It’s not until after twenty-seven weeks or so that preterm infants show the first evidence of a brain response to auditory stimuli, and not so coincidentally, the first signs of actually hearing sounds.

These results are consistent with observations using ultrasound technology to monitor fetal movements in response to tones played on their mother’s stomach. At Kai’s sixteen-week ultrasound, he showed no response to auditory stimulation in the form of tones played near Melissa’s stomach, or either of our voices. The story had changed by his thirty-week ultrasound. Not only did he appear less embryonic, he also altered his movements whenever we made a loud noise. The most reliable change was a complete halt of his ongoing movement when his mother spoke. My paternal observations are consistent with real experiments showing that fetuses start and stop moving in response to auditory stimuli, and even blink their eyes in reaction to loud sounds heard in the womb.

Throughout the last trimester, Kai’s brain was taking in sounds and using them to stabilize and fine-tune his developing auditory system. Although many sounds can pass through to the womb, he was especially sensitive to those that changed with dramatic pitch contours. This is because even fetal brains show adaptation to unchanging stimuli. A tone that is repeatedly played at the same pitch and amplification is responded to fully at first, but becomes less and less interesting over time. This is mirrored by physiological responses measured from the brain such as auditory-evoked potentials. Evoked potentials become smaller and smaller in preterm babies if the same old boring stimulus is played over and over again. The brain simply begins to habituate, and the stimulus becomes less salient.

Continuous and slowly changing sounds—those that exhibit exaggerated pitch contour and wide pitch variation (exactly like those heard in motherese and in lullabies)—keep the baby and its brain in an attendant state. Fetuses show far less behavioral habituation to music that sounds like motherese than to repetitive tones of the same exact pitch. Likewise, preterm infants older than thirty weeks do not exhibit a decline in their auditory-evoked potential if they are stimulated with sounds that change slightly in pitch rather than stay the same. The sounds of motherese and lullabies are born from acoustic features that are the perfect forms of stimulation to ensure that a fetus’s experience-expectant brain will continue to develop normal auditory circuitry and perceptual skills that will help it survive after birth.

The auditory system is not the only part of the brain that benefits from sound stimulation. Research has shown that fetuses older than thirty weeks can distinguish different phonemes such as ba versus bi, suggesting that prenatal experience may be critical to the development of language areas. There is also evidence that auditory stimulation while still in the womb promotes the development of limbic structures such as the hippocampus and the amygdala that support memory and emotional development. Indeed, it is now clear that the sounds a fetus hears in its third trimester can be remembered years later and even influence behavior as late as two years after birth. One researcher, for example, found that infants whose mothers watched a particular soap opera during pregnancy were calmed when they heard the show’s theme song, whereas babies whose mothers did not watch the show had no reaction to the song.

Now that Kai is finally born, he is awash in a sea of acoustic information, but not all of these sounds are novel. He is certainly familiar with his mother’s voice and to a lesser extent my own. Many of the sounds that Melissa experienced in her final trimester were likely heard by Kai, and although most were not repeated enough to consolidate into long-term memories, they undoubtedly had a significant impact on his auditory development thus far. Kai, like all primates, will continue to need auditory stimulation for decades to come. Normal development of auditory circuitry continues well into the late teens, resulting in steady improvement in many functions such as pitch discrimination and sound localization. Mammals that are denied this stimulation suffer from a range of abnormalities. For example, rats that are raised in an acoustic environment with a restricted frequency range are unable to hear outside this range as adults. This impacts their ability to discriminate sounds that have pitch variation that overlaps with this frequency range. Deprivation also disrupts their ability to localize sounds—an impairment that could prove costly if approached by a predator.

The fact that all primates have auditory perceptual skills that are facilitated by diatonic scale structure, while not true for all mammals, gives us a rough idea of when our faculty for music may have emerged in our evolutionary lineage. Some Old World primates may have evolved auditory circuitry that had improved function relative to competing primates—such as increased pitch discrimination and sound localization—that gave them a distinct survival advantage. As we’ve seen with modern experimental studies, the successful development of this circuitry depended on the organism experiencing certain forms of auditory stimulation. Clearly, not all primate species have satisfied this demand in the same way. In hominids, natural selection has forged this adaptation by linking these optimal forms of auditory stimulation to the activation of evolutionarily ancient pleasure circuits that are seen in all mammals. These circuits were most likely an earlier adaptation that fostered reproduction. Natural selection produces incremental change in structure and function that is always built on top of earlier adaptations. Structures are co-opted from others not in a design sense, but through a process that unevenly results in the survival of some genes over others. Hominids’ new fondness for wide swings in pitch variation and loudness in combination with exaggerated intonation may have created the initial conditions that ultimately led to the evolution of musicality and motherese in our species. These human technologies, in turn, became very effective tools in promoting brain development. In the next chapter, we will find that a similar story has occurred for vision.