Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man
Authors: Mark Changizi
Footsteps are, however, more subtle than a simple vertical thud on the ground, and there are multiple avenues in which Doppler shifts may occur. First, even if your footsteps stomp the ground without any forward speed, your body is still moving forward. When the footstep sound waves rise into the air, your big old body bumps into them and reflects them. The waves that your front runs into get reflected forward and are consequently Doppler upshifted, and the waves that reflect off your back get Doppler downshifted, as illustrated in Figure 36b (i). Second, footsteps aren’t the simple vertical ground bangers that Figure 36a would have us believe. For one thing, the dynamics of a footstep are complicated by the fact that the ground is often complicated. Throughout most of our evolutionary history the ground was not smooth as it is today, but covered in grass, brush, and pebbles. When your foot goes in for a landing, it has some forward velocity, and will undergo a sequence of microcollisions with the ground material. This sequence of collisions is a forward-moving sequence, and will sound higher in pitch (or compressed) to a listener toward whom the mover is directed. (See Figure 36b [ii].) Not only is the ground more complicated than the “stomping” in Figure 36a indicates, but the foot dynamics even on smooth ground are more subtle than I have let on. Our heel hits first, and the contact points move forward along the foot toward the toes (see Figure 36b [iii]). Whether on smooth ground or natural terrain, this sequence of microhits underlying a single step is moving in the direction of the mover, and thus will Doppler shift.
Figure 36.
(a) When a foot hits the ground, it is not moving forward or backward, and therefore has no Doppler shift. But as we’ll discuss, there’s more to the story.
(b) (i) Top: The footstep leads to sound waves going in all directions, all at the same frequency (indicated by the spacing between the wave fronts). Bottom: These waves hit the body and reflect off it. Because the mover is moving forward, the sound waves reflected forward will be Doppler shifted to a higher pitch; the waves hitting the mover’s rear will reflect at a lower pitch.
(b) (ii) When feet land they don’t simply move vertically downward for a thud. The surface of the ground very often has complex material on it, which the landing foot strikes as it is still moving forward. These complex sounds will have Doppler shifts.
(b) (iii) If our feet were like a pirate’s peg leg, then the single thud it makes when hitting the ground would have no Doppler shift. But our feet aren’t peg legs. Instead, our foot lands on its heel, and the point of contact tends to move forward along the bottom of the foot.
Footsteps can, then, Doppler shift, and these shifts are detectable. There is now a third difficulty that can be raised: if Doppler shifts for human movers are fairly meager, then why doesn’t musical melody have meager tessitura width (i.e., meager pitch range for the melody)? The actual tessitura in melody tends to be wider than that achievable by a human mover, corresponding to speeds faster than humans can achieve. Why, if melodic pitch contours are about Doppler pitches, would music exaggerate the speed of the depicted observer? Perhaps for the same reason that exaggeration is commonplace in other art forms. Facial expressions in cartoons, for example, tend to be hyperexaggerations of human facial expressions. Presumably such exaggerations serve as superstimuli, hyperactivating our brain’s mechanisms for detecting the characteristic (e.g., smile or speed), and something about this hyperactivation feels good to us (perhaps a bit like being on a roller coaster).
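To put a rough number on how meager these shifts are, here is a minimal sketch. It assumes the textbook Doppler formula for a source heading straight toward or straight away from a stationary listener, and a speed of sound of 343 meters per second. Neither the formula nor the function names come from this chapter; they are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def doppler_ratio(v, c=SPEED_OF_SOUND):
    """Heard/emitted frequency ratio for a source approaching a
    stationary listener at v m/s (negative v means receding)."""
    return c / (c - v)


def semitones(ratio):
    """Convert a frequency ratio to a musical interval in semitones."""
    return 12 * math.log2(ratio)


def doppler_range_semitones(v, c=SPEED_OF_SOUND):
    """Pitch span between heading straight at the listener and
    heading straight away, both at speed v."""
    return semitones(doppler_ratio(v, c) / doppler_ratio(-v, c))


# Illustrative speeds: a stroll, a jog, and a sprint.
for v in (1.0, 3.0, 10.0):
    print(f"{v:4.1f} m/s -> {doppler_range_semitones(v):.2f} semitones")
```

Under these assumptions, even a ten-meters-per-second sprint spans only about one semitone between approaching and receding, which is the sense in which the Doppler range of a human mover is meager.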
One final thought concerning the mismatch between the Doppler pitch range and the tessitura widths found in music: could I have been underestimating the size of the Doppler shifts humans are capable of? Although we may only move in the one- to ten-meters-per-second range, and our limbs may swing forward at a little more than twice our body’s speed, parts of us may be moving at faster speeds. Recall that your feet hit the ground from heel to toe. The sequence of microhits travels forward along the bottoms of your feet, and the entirety of sound the sequence makes will be Doppler shifted. An interesting characteristic of this kind of sound is that it can act like a sound-making object that is moving much faster than the object that actually generates it. As an example, when you close scissors, the actual objects—the two blades—are simply moving toward each other. But the point of contact between the blades moves outward along the blades. The sound of closing scissors is a sound whose principal source location is moving, even though no object is actually moving in that manner. This kind of faux-moving sound maker can go very fast. If two flattish surfaces hit each other with one end just ever so slightly ahead of the other, then the speed of the faux mover can be huge. For example, if you drop a yardstick so as to make it land flat, and one end hits the ground one millisecond before the other end, then the faux mover will have traveled between the yardstick and the ground from one end to the other at about one kilometer per second, or about two thousand miles per hour! The faux mover beneath our stepping feet may, in principle, be moving much faster than we are, and any scissor-like sound it makes will thus acquire a Doppler pitch range much wider than that due to our body’s natural speed.
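The yardstick arithmetic can be checked in a few lines. This is only a sketch of the calculation described above: the one-millisecond delay is the figure from the example, while the function name and the speed-of-sound value are assumptions added for illustration.

```python
YARD_M = 0.9144         # one yard in meters (exact by definition)
MILE_M = 1609.344       # one mile in meters (exact by definition)
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def contact_point_speed(length_m, delay_s):
    """Speed of the moving point of contact (the 'faux mover') when one
    end of a flat object lands delay_s seconds before the other end."""
    return length_m / delay_s


speed = contact_point_speed(YARD_M, 0.001)  # one-millisecond head start
mph = speed * 3600 / MILE_M
mach = speed / SPEED_OF_SOUND

print(f"{speed:.0f} m/s, about {mph:.0f} mph, about Mach {mach:.1f}")
```

The result is a little over nine hundred meters per second, consistent with "about one kilometer per second, or about two thousand miles per hour." Under the assumed speed of sound, that contact point outruns sound itself, so the simple Doppler formula for slower-than-sound sources would not apply to it as is; slower faux movers, such as the heel-to-toe contact point, are the cases where that formula is a reasonable guide.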
Human movers do make sounds that Doppler shift, and these shifts are detectable by our auditory system. And their exaggeration in music is sensible in light of the common role of exaggeration in artistic forms. Melodic contour, we have seen thus far, has many of the signature properties expected of Doppler shifts, lending credence to the idea that the role of melodic pitch contours is to tell the story of the sequence of directions in which a mover is headed. That’s a fundamental part of the kinematic information music imparts about the fictional mover. But that’s only half the story of “kinemusic.” It doesn’t tell us how far away the mover is, something more explicitly spatial. That is the role of loudness, the topic of the rest of this chapter.
Loud and in 3-D
Do you know why I love going to live shows like plays or musicals? Sure, the dialogue can be hilarious or touching, the songs a hoot, the action and suspense thrilling. But I go for another reason: the 3-D stereo experience. Long before movies were shot and viewed in 3-D, people were putting on real live performances, which provide a 3-D experience for all the two-eyeds watching. And theater performances don’t simply approximate the 3-D experience—they are the genuine article.
“But,” you might respond, “one goes to the theater for the dance, the dialogue, the humans—for the art. No one goes to live performances for the ‘3-D feel!’ What kind of lowbrow rube are you? And, at any rate, most audiences sit too far away to get much of a stereo 3-D effect.”
“Ah,” I respond, “but that’s why I sit right up front, or go to very small theater houses. I just love that 3-D popping-out feeling, I tell ya!”
At this point you’d walk away, muttering something about the gene pool. And you’d be right. That would be a dopey thing for me to say. We see people doing their thing in 3-D all the time. I just saw the waitress here at the coffee shop walk by. Wow, she was in 3-D! Now I’m looking at my coffee, and my mug’s handle appears directed toward me. Whoa, it’s 3-D!
No. We don’t go to the live theater for the 3-D experience. We get plenty of 3-D thrown at us every waking moment. But this leaves us with a mystery. Why do people like 3-D movies? If people are all 3-D’ed out in their regular lives, why do we jump at the chance to wear funny glasses at the movie house? Part of the attraction surely is that movies can show you places you have never been, whether real or imaginary, and so with 3-D you can more fully experience what it is like to have a Tyrannosaurus rex make a snout-reaching grab for you.
But there is more to it. Even when the movie is showing everyday things, there is considerable extra excitement when it is in 3-D. Watching a live performance in a tiny theater is still not the same as watching a 3-D movie version of that same performance. But what is the difference?
Have you ever been to one of those shows where actors come out into the audience? Specific audience members are sometimes targeted, or maybe even pulled up onstage. In such circumstances, if you’re not the person the actors target, you might find yourself thinking, “Oh, that person is having a blast!” If you’re the shy type, however, you might be thinking, “Thank God they didn’t target me because I’d have been terrified!” If you are the target, then, whether you liked it or not, your experience of the evening’s performance will be very different from that of everyone else in the audience. The show reached out into your space and grabbed you. While everyone else merely watched the show, you were part of it.
The key to understanding the “3-D movie” experience can be found in this targeting. 3-D movies differ from their real-life versions because everyone in the audience is a target, all at the same time. This is simply because the 3-D technology (projecting left- and right-eye images onto the screen, with glasses designed to let each eye see only the image intended for it) gives everyone in the audience the same 3-D effect. If the dragon’s flames appear to me to nearly singe my hair but spare everyone else’s, your experience at the other side of the theater is that the dragon’s flames nearly singe your hair and spare everyone else’s, including mine. If I experience a golf ball shooting over the audience to my left, then the audience to my left also experiences the golf ball going over their left. 3-D movies put on a show that is inextricably tied to each listener, and invades each listener’s space equally. Everyone’s experience is identical in the sense that they’re all treated to the same visual and auditory vantage point. But everyone’s experience is unique because each experiences himself as the target—each believes he has a specially targeted vantage point.
The difference, then, between a live show seen up close and a 3-D movie of the same show is that the former pulls just one or several audience members into the thick of the story, whereas 3-D movies have this effect on everyone. So the fun of 3-D movies is not that they are 3-D at all. We can have the same fun when we happen to be the target in a real live show. The fun is in being targeted. When the show doesn’t merely leap off the screen, but leaps at you, it fundamentally alters the emotional experience. It no longer feels like a story about others, but becomes a story that invades your space, perhaps threateningly, perhaps provocatively, perhaps joyously. You are immersed in the story, not an audience member at all.
What does all this have to do with music and the auditory sense? Imagine yourself again at a live show. You hear the performers’ rhythmic banging ganglies as they carry out behaviors onstage. And as they move onstage and vary their direction, the sounds they make will change pitch due to the Doppler effect. Sitting there in the audience, watching from a vantage point outside of the story, you get the rhythm and pitch modulations of human movers. You get the attitude (rhythm) and action (pitch). But you are not immersed in the story. You can more easily remain detached.
Now imagine that the performers suddenly begin to target you. Several just jumped off the stage, headed directly toward you. A minute later, there you are, grinning and red-faced, with tousled hair and the bright red lipstick mark of a mistress’s kiss on your forehead . . . and, for good measure, a pirate is in your face calling you “salty.” During all this targeting you hear the gait sounds and pitch modulations of the performers, but you also heard these sounds when you were still in detached, untargeted audience-member mode. The big auditory consequence of being targeted by the actors is not in the rhythm or pitch, but in the loudness. When the performers were onstage, most of the time they were more or less equidistant, and fairly far away—and so there was little loudness modulation as they carried on. But when the performers broke through the “screen,” they ramped up the volume. It is these high-loudness parts of music—the fortissimos, or ffs—that are often highly evocative and thrilling, as when the dinosaur reaches out of the 3-D theater’s screen to get you.
And that’s the final topic of this chapter: loudness, and its musical meaning. I will try to convince you that loudness modulations are used in music in the 3-D, invade-the-listener’s-space fashion I just described. In particular, this means that the loudness modulations in music tend to mimic loudness modulations due to changes in the proximity of a mover. Before getting into the evidence for this, let’s discuss why I don’t think loudness mimics something else.
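For a sense of how strongly proximity alone can modulate loudness, here is a minimal sketch. It assumes an idealized point source in open air, whose intensity falls off with the square of the distance; that idealization, the function name, and the sample distances are assumptions for illustration, not figures from this chapter.

```python
import math


def level_change_db(d_from, d_to):
    """Change in sound level, in decibels, when a point source moves
    from d_from to d_to meters away (inverse-square spreading assumed)."""
    return 20 * math.log10(d_from / d_to)


# A performer starting 10 meters away and closing in on the listener.
for d_to in (10.0, 5.0, 2.5, 1.0):
    print(f"10.0 m -> {d_to:4.1f} m: {level_change_db(10.0, d_to):+.1f} dB")
```

Under this idealization, each halving of the distance adds about 6 decibels, so a mover who comes from far across the room to right beside you produces a large loudness swing without changing how hard he steps. Real rooms, with their reflections, would soften these numbers.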
Nearness versus Stompiness
I will be suggesting that loudness in music is primarily driven by spatial proximity. Rather than musical pitch being a spatial indicator, as is commonly suggested (see the earlier section “Why Pitch Seems Spatial”), it is loudness in music that has the spatial meaning. As was the case with pitch, here, too, there are several stumbling blocks preventing us from seeing the spatial meaning of loudness. The first is the bias for pitch: if one mistakenly believes that pitch codes for space, then loudness must code for something else. A second stumbling block to interpreting loudness as spatial concerns musical notation, which codes loudness primarily via letters (pp, p, mf, f, ff, and so on), rather than as a spatial code (which is, confusingly, how it codes pitch, as we’ve seen). Musical instruments throw a third smokescreen over the spatial meaning of loudness, because most instruments modulate loudness not by spatial modulations of one’s body, but by hitting, bowing, plucking, or blowing harder.
Therefore, several factors are conspiring to obfuscate the spatial meaning of loudness. But, in addition, the third source of confusion I just mentioned suggests an alternative interpretation: that loudness concerns the energy level of the sound maker. A musician must use more energy to play more loudly, and this can’t help but suggest that louder music might be “trying” to sound like a more energetic mover. The energy with which a behavior is carried out is an obvious real-world source of loudness modulations. These energy modulations are, in addition, highly informative about the behavior and expressions of the mover. A stomper walking nearby means something different than a tiptoer walking nearby. So energy or “stompiness” is a potential candidate for what loudness might mean in music.
Loudness in the real world can, then, come both from the energy of a mover and from the spatial proximity of the mover. And each seems to be the right sort of thing to potentially explain why the loudest parts of music are often so thrilling and evocative: stompiness, because the mover is energized (maybe angry); proximity, because the mover is very close by. Which of these ecological meanings is more likely to drive musical loudness, supposing that music mimics movement? Although I suspect music uses high loudness for both purposes—sometimes to describe a stompy mover, and sometimes to describe a nearby mover—I’m putting my theoretical money on spatial proximity.