The Language Instinct: How the Mind Creates Language

The air leaves the lungs through the trachea (windpipe), which opens into the larynx (the voice-box, visible on the outside as the Adam’s apple). The larynx is a valve consisting of an opening (the glottis) covered by two flaps of retractable muscular tissue called the vocal folds (they are also called “vocal cords” because of an early anatomist’s error; they are not cords at all). The vocal folds can close off the glottis tightly, sealing the lungs. This is useful when we want to stiffen our upper body, which is a floppy bag of air. Get up from your chair without using your arms; you will feel your larynx tighten. The larynx is also closed off in physiological functions like coughing and defecation. The grunt of the weightlifter or tennis player is a reminder that we use the same organ to seal the lungs and to produce sound.

The vocal folds can also be partly stretched over the glottis to produce a buzz as the air rushes past. This happens because the high-pressure air pushes the vocal folds open, at which point they spring back and get sucked together, closing the glottis until air pressure builds up and pushes them open again, starting a new cycle. Breath is thus broken into a series of puffs of air, which we perceive as a buzz, called “voicing.” You can hear and feel the buzz by making the sounds ssssssss, which lacks voicing, and zzzzzzzz, which has it.

The frequency of the vocal folds’ opening and closing determines the pitch of the voice. By changing the tension and position of the vocal folds, we can control the frequency and hence the pitch. This is most obvious in humming or singing, but we also change pitch continuously over the course of a sentence, a process called intonation. Normal intonation is what makes natural speech sound different from the speech of robots in old science fiction movies and of the Coneheads on Saturday Night Live. Intonation is also controlled in sarcasm, emphasis, and an emotional tone of voice such as anger or cheeriness. In “tone languages” like Chinese, rising or falling tones distinguish certain vowels from others.

Though voicing creates a sound wave with a dominant frequency of vibration, it is not like a tuning fork or a test of the Emergency Broadcasting System, a pure tone with that frequency alone. Voicing is a rich, buzzy sound with many “harmonics.” A male voice is a wave with vibrations not only at 100 cycles per second but also at 200 cps, 300 cps, 400 cps, 500 cps, 600 cps, 700 cps, and so on, all the way up to 4000 cps and beyond. A female voice has vibrations at 200 cps, 400 cps, 600 cps, and so on. The richness of the sound source is crucial—it is the raw material that the rest of the vocal tract sculpts into vowels and consonants.
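The harmonic series described above can be written out as a small sketch. The function name `harmonics` and the 4000 cps cutoff are illustrative choices, and a real voice has no sharp upper limit; the fundamentals of 100 and 200 cps are the ones given in the text.

```python
def harmonics(fundamental_cps, max_cps=4000):
    """Return the whole-number multiples of the fundamental, up to max_cps."""
    return list(range(fundamental_cps, max_cps + 1, fundamental_cps))


male_voice = harmonics(100)    # 100, 200, 300, ... cps
female_voice = harmonics(200)  # 200, 400, 600, ... cps

print(male_voice[:7])
print(female_voice[:3])
```

The higher-pitched voice has its harmonics spaced twice as far apart, so it offers the rest of the vocal tract fewer frequencies to work with in any given range.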

If for some reason we cannot produce a hum from the larynx, any rich source of sound will do. When we whisper, we spread the vocal folds, causing the air stream to break apart chaotically at the edges of the folds and creating a turbulence or noise that sounds like hissing or radio static. A hissing noise is not a neatly repeating wave consisting of a sequence of harmonics, as we find in the periodic sound of a speaking voice, but a jagged, spiky wave consisting of a hodgepodge of constantly changing frequencies. This mixture, though, is all that the rest of the vocal tract needs for intelligible whispering. Some laryngectomy patients are taught “esophageal speech,” or controlled burping, which provides the necessary noise. Others place a vibrator against their necks. In the 1970s the guitarist Peter Frampton funneled the amplified sound of his electric guitar through a tube into his mouth, allowing him to articulate his twangings. The effect was good for a couple of hit records before he sank into rock-and-roll oblivion.

The richly vibrating air then runs through a gantlet of chambers before leaving the head: the throat or “pharynx” behind the tongue, the mouth region between the tongue and palate, the opening between the lips, and an alternative route to the external world through the nose. Each chamber has a particular length and shape, which affects the sound passing through by the phenomenon called “resonance.” Sounds of different frequencies have different wavelengths (the distance between the crests of the sound wave); higher pitches have shorter wavelengths. A sound wave moving down the length of a tube bounces back when it reaches the opening at the other end. If the length of the tube is a certain fraction of the wavelength of the sound, each reflected wave will reinforce the next incoming one; if it is of a different length, they will interfere with one another. (This is similar to how you get the best effect pushing a child on a swing if you synchronize each push with the top of the arc.) Thus a tube of a particular length amplifies some sound frequencies and filters out others. You can hear the effect when you fill a bottle. The noise of the sloshing water gets filtered by the chamber of air between the surface and the opening: the more water, the smaller the chamber, the higher the resonant frequency of the chamber, and the tinnier the gurgle.
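The link between chamber length and resonant frequency can be sketched with the textbook idealization of a tube closed at one end and open at the other, which resonates when its length is an odd number of quarter wavelengths. This is an assumption made for illustration: a real bottle or vocal tract is not a uniform tube, and the speed of sound used here (343 meters per second, roughly room-temperature air) is likewise an assumed value.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at roughly room temperature


def tube_resonances(length_m, count=3):
    """Resonant frequencies (cps) of an idealized tube closed at one end.

    Resonance occurs when the tube length equals 1/4, 3/4, 5/4, ... of the
    wavelength, so the frequencies are odd multiples of c / (4 * length).
    """
    return [
        (2 * n - 1) * SPEED_OF_SOUND_M_PER_S / (4 * length_m)
        for n in range(1, count + 1)
    ]


# As the bottle fills, the column of air above the water gets shorter.
for air_column_m in (0.20, 0.10, 0.05):
    lowest = tube_resonances(air_column_m)[0]
    print(f"{air_column_m:.2f} m of air: lowest resonance near {lowest:.0f} cps")
```

Halving the air column doubles every resonant frequency in this model, which matches the rising pitch of the gurgle as the bottle fills.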

What we hear as different vowels are the different combinations of amplifications and filtering of the sound coming up from the larynx. These combinations are produced by moving five speech organs around in the mouth to change the shapes and lengths of the resonant cavities that the sound passes through. For example, ee is defined by two resonances, one from 200 to 350 cps produced mainly by the throat cavity, and the other from 2100 to 3000 cps produced mainly by the mouth cavity. The range of frequencies that a chamber filters is independent of the particular mixture of frequencies that enters it, so we can hear an ee as an ee whether it is spoken, whispered, sung high, sung low, burped, or twanged.
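The independence of the filter from its source can be illustrated with a toy sketch. The two ee bands come from the text; treating them as hard cutoffs is a simplification made here, since a real resonance boosts frequencies gradually rather than passing or blocking them outright. The helper names `harmonics` and `amplified` are hypothetical.

```python
EE_BANDS = [(200, 350), (2100, 3000)]  # resonance ranges for "ee", in cps


def harmonics(fundamental_cps, max_cps=4000):
    """Whole-number multiples of the fundamental, up to max_cps."""
    return list(range(fundamental_cps, max_cps + 1, fundamental_cps))


def amplified(source_cps, bands):
    """Keep only the source frequencies that fall inside a resonance band."""
    return [f for f in source_cps if any(lo <= f <= hi for lo, hi in bands)]


low_voice = amplified(harmonics(100), EE_BANDS)
high_voice = amplified(harmonics(200), EE_BANDS)

print(low_voice)
print(high_voice)
```

The two sources contribute different individual frequencies, but the surviving ones cluster in the same two regions, and it is those regions that identify the vowel.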

The tongue is the most important of the speech organs, making language truly the “gift of tongues.” Actually, the tongue is three organs in one: the hump or body, the tip, and the root (the muscles that anchor it to the jaw). Pronounce the vowels in bet and butt repeatedly, e-uh, e-uh, e-uh. You should feel the body of your tongue moving forwards and backwards (if you put a finger between your teeth, you can feel it with the finger). When your tongue is in the front of your mouth, it lengthens the air chamber behind it in your throat and shortens the one in front of it in your mouth, altering one of the resonances: for the bet vowel, the mouth amplifies sounds near 600 and 1800 cps; for the butt vowel, it amplifies sounds near 600 and 1200. Now pronounce the vowels in beet and bat alternately. The body of your tongue will jump up and down, at right angles to the bet-butt motion; you can even feel your jaw move to help it. This, too, alters the shapes of the throat and mouth chambers, and hence their resonances. The brain interprets the different patterns of amplification and filtering as different vowels.
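As a rough illustration of how a pair of resonances can pick out a vowel, here is a toy nearest-match sketch. It is not a model of what the brain actually does. The bet and butt values are the ones quoted above; the beet entry uses the midpoints of the ranges given earlier for ee, which is an assumption of this sketch, and the names `VOWEL_TEMPLATES` and `nearest_vowel` are hypothetical.

```python
import math

# Resonance pairs in cps. The "beet" entry is the midpoint of the quoted
# ranges (200-350 and 2100-3000), an assumption made for this sketch.
VOWEL_TEMPLATES = {
    "beet": (275, 2550),
    "bet": (600, 1800),
    "butt": (600, 1200),
}


def nearest_vowel(first_cps, second_cps):
    """Return the template vowel whose two resonances lie closest."""
    def distance(name):
        f1, f2 = VOWEL_TEMPLATES[name]
        return math.hypot(first_cps - f1, second_cps - f2)

    return min(VOWEL_TEMPLATES, key=distance)


print(nearest_vowel(620, 1750))
print(nearest_vowel(580, 1250))
print(nearest_vowel(300, 2400))
```

Note that bet and butt share their lower resonance and differ only in the upper one, which is the resonance that shifts as the tongue body moves forwards and backwards.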

The link between the postures of the tongue and the vowels it sculpts gives rise to a quaint curiosity of English and many other languages called phonetic symbolism. When the tongue is high and at the front of the mouth, it makes a small resonant cavity there that amplifies some higher frequencies, and the resulting vowels like ee and i (as in bit) remind people of little things. When the tongue is low and to the back, it makes a large resonant cavity that amplifies some lower frequencies, and the resulting vowels like a in father and o in core and in cot remind people of large things. Thus mice are teeny and squeak, but elephants are humongous and roar. Audio speakers have small tweeters for the high sounds and large woofers for the low ones. English speakers correctly guess that in Chinese ch’ing means light and ch’ung means heavy. (In controlled studies with large numbers of foreign words, the hit rate is statistically above chance, though just barely.) When I questioned our local computer wizard about what she meant when she said she was going to frob my workstation, she gave me this tutorial on hackerese. When you get a brand-new graphic equalizer for your stereo and aimlessly slide the knobs up and down to hear the effects, that is frobbing. When you move the knobs by medium-sized amounts to get the sound to your general liking, that is twiddling. When you make the final small adjustments to get it perfect, that is tweaking. The ob, id, and eak sounds perfectly follow the large-to-small continuum of phonetic symbolism.

And at the risk of sounding like Andy Rooney on Sixty Minutes, have you ever wondered why we say fiddle-faddle and not faddle-fiddle? Why is it ping-pong and pitter-patter rather than pong-ping and patter-pitter? Why dribs and drabs, rather than vice versa? Why can’t a kitchen be span and spic? Whence riff-raff, mish-mash, flim-flam, chit-chat, tit for tat, knick-knack, zig-zag, sing-song, ding-dong, King Kong, criss-cross, shilly-shally, see-saw, hee-haw, flip-flop, hippity-hop, tick-tock, tic-tac-toe, eeny-meeny-miney-moe, bric-a-brac, clickety-clack, hickory-dickory-dock, kit and kaboodle, and bibbity-bobbity-boo? The answer is that the vowels for which the tongue is high and in the front always come before the vowels for which the tongue is low and in the back. No one knows why they are aligned in this order, but it seems to be a kind of syllogism from two other oddities. The first is that words that connote me-here-now tend to have higher and fronter vowels than words that connote distance from “me”: me versus you, here versus there, this versus that. The second is that words that connote me-here-now tend to come before words that connote literal or metaphorical distance from “me” (or a prototypical generic speaker): here and there (not there and here), this and that, now and then, father and son, man and machine, friend or foe, the Harvard-Yale game (among Harvard students), the Yale-Harvard game (among Yalies), Serbo-Croatian (among Serbs), Croat-Serbian (among Croats). The syllogism seems to be: “me” = high front vowel; me first; therefore, high front vowel first. It is as if the mind just cannot bring itself to flip a coin in ordering words; if meaning does not determine the order, sound is brought to bear, and the rationale is based on how the tongue produces the vowels.

Let’s look at the other speech organs. Pay attention to your lips when you alternate between the vowels in boot and book. For boot, you round the lips and protrude them. This adds an air chamber, with its own resonances, to the front of the vocal tract, amplifying and filtering other sets of frequencies and thus defining other vowel contrasts. Because of the acoustic effects of the lips, when we talk to a happy person over the phone, we can literally hear the smile.

Remember your grade-school teacher telling you that the vowel sounds in bat, bet, bit, bottle, and butt were “short,” and the vowel sounds in bait, beet, bite, boat, and boot were “long”? And you didn’t know what she was talking about? Well, forget it; her information is five hundred years out of date. Older stages of English differentiated words by whether their vowels were pronounced quickly or were drawn out, a bit like the modern distinction between bad meaning “bad” and baaaad meaning “good.” But in the fifteenth century English pronunciation underwent a convulsion called the Great Vowel Shift. The vowels that had simply been pronounced longer now became “tense”: by advancing the tongue root (the muscles attaching the tongue to the jaw), the tongue becomes tense and humped rather than lax and flat, and the hump narrows the air chamber in the mouth above it, changing the resonances. Also, some tense vowels in modern English, like in bite and brow, are “diphthongs,” two vowels pronounced in quick succession as if they were one: ba-eet, bra-oh.

You can hear the effects of the fifth speech organ by drawing out the vowel in Sam and sat, postponing the final consonant indefinitely. In most dialects of English, the vowels will be different: the vowel in Sam will have a twangy, nasal sound. That is because the soft palate or velum (the fleshy flap at the back of the hard palate) is opened, allowing air to flow out through the nose as well as through the mouth. The nose is another resonant chamber, and when vibrating air flows through it, yet another set of frequencies gets amplified and filtered. English does not differentiate words by whether their vowels are nasal or not, but many languages, like French, Polish, and Portuguese, do. English speakers who open their soft palate even when pronouncing sat are said to have a “nasal” voice. When you have a cold and your nose is blocked, opening the soft palate makes no difference, and your voice is the opposite of nasal.
