The Story of Psychology
Authors: Morton Hunt
It is not an image but a coded representation of the image, somewhat as the patterns of magnetism on a tape recording are not sounds but a coded representation of sounds. The representation, however, is not yet a perception; the primary visual cortex is in no sense the end of the visual path. It is just one stage in the processing of the information it handles.
From the striate region the partly assembled and integrated information is sent to other areas of the visual cortex and to higher areas of brain cortex beyond it. There, the information is finally seen by the mind and recognized as something familiar or something not seen before. How that takes place is still moot, according to most neuroscientists. A few, however, boldly guess that somewhere at the higher brain levels are cells that contain “traces” of previously seen objects in the form of synaptic connections or molecular deposits, and these cells respond when an incoming message matches the trace. The response to a match is an awareness (“I know that face”); a nonmatch produces no response, which is also an awareness (“I don’t know that face”).
75
The neural approach tells us much about the workings of visual perception at the micro level but little at the macro level, much about the machinery of vision but little about its owner and operator, much about neuronal responses but little about the experience of perception. As one cognitive theorist put it, “Trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers.”
76
The cognitive approach deals with the mental processes at work in such perceptual phenomena as shape constancy, feature identification, form recognition, cue-derived depth perception, recognition of figures when much of the information is missing, and so on.
The mental processes that yield these results are made up of billions of neuronal events, but cognitive theorists say that it takes macrotheories, not microtheories, to explain these processes. A physicist studying how and when a wave changes form and breaks as it nears the shore cannot derive the laws of wave mechanics from the interactions of trillions of water molecules, not even with a number-crunching mainframe computer. Those laws express mass effects that exist at a wholly different level of organization. The sounds made by a person talking to us are made up of vibrations of the molecules of atmospheric gases, but the meaning of the words cannot be explained in those terms.
So too with the mental processes of visual perception; they are organized mass effects of neural phenomena expressed by mental, not neurophysiological, laws. We have already seen evidence of this, but there is one particularly intriguing and historic example worth discussing. What happens, and at what level, when we call up an image from memory and see it in the mind’s eye? Experiments by cognitive theorists show that this can be explained only in high-level cognitive terms. The most elegant and impressive of such experiments are those of Roger Shepard (now emeritus) of Stanford University on “mental rotation.” Shepard asked subjects to say whether the objects in each of these three pairs are identical:
FIGURE 36
Mental rotation: Which pairs are identical?
Most people recognize, after studying them for a little while, that the objects in A are identical, as are those in B. Those in C are not. When asked how they reached their conclusions, they say that they rotated the objects in their minds much as if they were rotating real objects in the real world. Shepard demonstrated how closely this procedure mirrors real rotation by another experiment, in which viewers saw a given shape rotated through varying degrees of angular difference. This set, for example, shows a single shape in a series of positions:
FIGURE 37
Mental rotation: The greater the distance, the longer it takes.
When subjects were shown pairs of these figures, the time it took them to identify them as the same was proportional to the angular difference in the positions of the figures; that is, the more one figure had to be rotated to match the other, the longer it took for identification.
77
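The proportionality described above is what one would expect if a mental image were turned at a roughly constant rate. The following is a minimal sketch of that idea, not a model taken from Shepard's work: the function names, the 10-degree step, and the time per step are all hypothetical values chosen for illustration, not experimental data.

```python
def rotation_steps(angle_deg, step_deg=10.0):
    """Count how many fixed-size turns are needed to cover angle_deg.

    step_deg is a hypothetical step size, assumed only for illustration.
    """
    steps = 0
    remaining = abs(angle_deg)
    while remaining > 0:
        remaining -= step_deg  # turn the imagined figure one step further
        steps += 1
    return steps


def matching_time(angle_deg, seconds_per_step=0.2):
    """Time to bring two figures into alignment, assuming each step costs
    the same (hypothetical) amount of time."""
    return rotation_steps(angle_deg) * seconds_per_step


# Doubling the angular difference doubles the number of steps, and so the time.
for angle in (40, 80, 160):
    print(angle, rotation_steps(angle), matching_time(angle))
```

Under these assumptions the time grows in direct proportion to the angle, which is the pattern the subjects' response times showed.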
This is only one of many perceptual phenomena that involve higher mental processes operating on internalized symbols of the external world. For some years a number of perception researchers have been trying to formulate a comprehensive cognitive theory of what those processes are and how they produce those perceptions.
There are two schools of thought about how to do this. One uses concepts and procedures drawn from artificial intelligence (AI), a branch of computer science. The basic assumption of AI is that human mental activities can be simulated by step-by-step computer programs—and take place in that same step-by-step programmed way.
78
Partly in the effort to make computers recognize what they are looking at, and partly to gain a better understanding of human perception, AI experts have written a number of form-recognition programs. To achieve elementary form recognition—to recognize triangles, squares, and other regular polygons, for instance—a program might follow a series of if-then steps. If there is a straight line, then follow it and measure it to its end; if another line continues from there, then call that point a corner and measure the angle by which it changes direction; if that other line is straight, follow it until… and so on, until the number of sides and angles has been counted and matched against a list of polygons and their characteristics.
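The if-then procedure just described can be sketched in a few lines. This is a hedged illustration rather than any actual form-recognition program of the period: it assumes the outline has already been traced into an ordered list of (x, y) points, and the names `classify_outline`, `turn_angle`, `POLYGON_NAMES`, and the tolerance values are hypothetical.

```python
import math

# Hypothetical lookup table: number of sides -> name of the shape.
POLYGON_NAMES = {3: "triangle", 4: "quadrilateral", 5: "pentagon", 6: "hexagon"}
REGULAR_NAMES = {3: "equilateral triangle", 4: "square"}


def turn_angle(prev, here, nxt):
    """Degrees by which the outline changes direction at the point `here`."""
    heading_in = math.atan2(here[1] - prev[1], here[0] - prev[0])
    heading_out = math.atan2(nxt[1] - here[1], nxt[0] - here[0])
    diff = math.degrees(heading_out - heading_in)
    return (diff + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)


def classify_outline(points, angle_tolerance=1.0):
    """Follow a closed outline, count its corners and sides, and name it."""
    n = len(points)
    corners, turns = [], []
    for i in range(n):
        turn = turn_angle(points[i - 1], points[i], points[(i + 1) % n])
        # If the line changes direction here, then call this point a corner.
        if abs(turn) > angle_tolerance:
            corners.append(points[i])
            turns.append(turn)
    if len(corners) < 3:
        return "not a polygon"

    # Measure each side from one corner to the next.
    k = len(corners)
    sides = [math.dist(corners[i], corners[(i + 1) % k]) for i in range(k)]

    # Match the counts and measurements against the list of polygons.
    name = POLYGON_NAMES.get(k, "unknown polygon")
    equal_sides = all(math.isclose(s, sides[0], rel_tol=1e-6) for s in sides)
    equal_turns = all(abs(t - turns[0]) <= angle_tolerance for t in turns)
    if equal_sides and equal_turns:
        return REGULAR_NAMES.get(k, "regular " + name)
    return name


# The point (1, 0) lies on a straight side, so it is not counted as a corner.
print(classify_outline([(0, 0), (1, 0), (2, 0), (2, 2), (0, 2)]))
print(classify_outline([(0, 0), (3, 0), (3, 1), (0, 1)]))
```

Even this toy version shows the character of the approach: every judgment is an explicit test on measured lines and angles, with nothing resembling a picture being inspected.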
The chief argument in favor of the AI approach to visual perception is that there is no projector or screen in the brain and no homunculus looking at pictures; hence the mind must be dealing not with images but coded data that it processes step by step, as a computer program does.
Fifteen years ago the chief argument against the AI idea was that no existing program of machine vision had more than a minuscule capacity, compared with that of human beings, to recognize flat shapes, let alone three-dimensional ones, or to know where they are within the environment, or to recognize the probable physical qualities of the rocks, chairs, water, bread, or people it was seeing. But since then there have been extraordinary developments in machine vision. Formerly limited to two-dimensional representation, it is now capable of 3-D, and methods of identifying shapes and distances have greatly improved. Robots guided by machine vision now run operations in a great many factories; AI systems using machine vision have guided driverless automobiles across the desert, avoiding obstacles and ravines; security systems can now match a seen face to a photograph of that face, and so on.
Having said all that, it remains true that machine vision has only a very limited capacity, compared with that of human beings, to recognize all sorts of objects for what they are; it doesn’t understand, it doesn’t know, it doesn’t feel. Basically, that’s because it isn’t hooked up to the immense information base of the human mind: its vast store of mental and emotional responses built in by evolution, its immense accumulation of learned meanings of perceptions, its huge compilation of interconnected information about the world. As remarkable as the achievements of the designers of machine vision are, their work has led to a greater understanding of how to make machine vision work but not to a deeper understanding of how human vision works.
The other school of thought about how cognitive perceptual processes work has long relied and continues to rely on laboratory studies of human thinking rather than machine simulations of thinking. This view, going far beyond the Helmholtz tradition that perception is the result of unconscious inference from incomplete information, includes conscious thought processes of other kinds. Its leading exponent in recent years was Irvin Rock (1922–1995) of the University of California at Berkeley. His book, The Logic of Perception, was described in the Annual Review of Psychology as “the most inclusive and empirically plausible explanation of perceptual effects that seem to require intelligent activity on the part of the perceiver.”
79
Rock, though an outstanding perception psychologist, was far from outstanding in his early undergraduate years; in fact, in an intellectual family he was the black sheep. But during World War II his unit was dive-bombed by enemy planes, he felt sure he would be killed, and “I vowed to myself,” he said, “that if I survived I would try to do more with my life than I had until then.”
80
After the war he became a top-notch student. He began graduate school in physics but switched to psychology when he realized that there was greater opportunity in that young field for a significant contribution to knowledge.
At the New School for Social Research, Rock fell under the spell of the Gestaltists who were there and became an ardent one himself. Certain basic Gestalt laws of organization and relational thinking are still part of his theory. But those laws describe essentially automatic processes, and Rock came to believe that many perceptual phenomena could be accounted for only by mental processes of a thoughtlike character.
81
This idea first occurred to him when he conducted the 1957 experiment, described above, in which he tilted a square so that it looked like a diamond, then tilted the perceiver. Since the perceiver still saw the square as a diamond, Rock reasoned that the perceiver must have used visual and visceral cues to interpret what he saw. Rock spent many years devising and conducting other experiments to test the hypothesis that, more often than not, perception requires higher-level processes than those taking place in the visual cortex. These studies led him, finally, to the thesis that “perception is intelligent in that it is based on operations similar to those that characterize thought.”
82
And indeed, Rock has said, perception may have led to thought; it may be the evolutionary link between low-level sensory processes in primitive organisms and high-level cognitive processes in more complex forms of life. If what the eye sees, he argues, is an ambiguous and distortion-prone representation of reality, some mechanism had to evolve to yield reliable and faithful knowledge of that reality. In his words, “Intelligent operations may have evolved in the service of perception.”
83
This is not to say that all perception is thoughtlike; Rock specifically cited the waterfall illusion as explicable in low-level neural terms. But most facts about motion perception and other kinds of perception seemed to him to require high-level processes. Unconscious inference, as in our use of texture gradient cues to sense distance, is only one of them.
84
Description that results in interpretation is another. In the ambiguous old hag–young woman figure by Boring, what one sees is not the result of simply recognizing an image but of describing to oneself what a particular curve is like: like a nose or like a cheek. Many perceived forms or objects are not instantly recognizable; recognizing what they are comes about through such a process.
85
Perception also often calls for problem solving of one sort or another. One hardly thinks of perception as the solving of problems, but Rock marshaled a considerable amount of evidence—much from earlier studies by others, some from his own original experiments—to show that in many cases we seek a hypothesis to account for what we see, weigh that hypothesis against other possibilities, and choose the one that seems to solve the problem of making sense of what we see. All of this usually takes place in a fraction of a second.
One example: In a laboratory phenomenon known since the time of Helmholtz, if a wavelike curved line is passed horizontally behind a slit, as in the above figure, most observers first see it as a small element moving up and down, but after a while some of them will suddenly see the sinuous line moving at right angles to and behind the aperture. What produces their altered and correct perception? Rock found that one clue they use is the changing slope of the line as it passes the slit; another is the end of the curved line, if it comes into view. Such clues suggest to the mind an alternative hypothesis: that a curve is moving past the slit horizontally, rather than that a small element is moving up and down. This hypothesis is so much better that the mind accepts it and sees the line as it really is.
86