Jensen’s most recent work on Spearman’s hypothesis uses reaction time tests instead of traditional mental tests, bypassing many of the usual objections to intelligence test questions. Once again, the more g-loaded the activity is, the larger the B/W difference is, on average.
Critics can argue that the entire enterprise is meaningless because g is meaningless, but the hypothesis of a correlation between the magnitude of the g loading of a test and the magnitude of the black-white difference on that test has been confirmed.

How does the confirmation of Spearman’s hypothesis bear on the genetic explanation of ethnic differences? In plain though somewhat imprecise language: The broadest conception of intelligence is embodied in g. Anything other than g is either a narrower cognitive capacity or measurement error. Spearman’s hypothesis says in effect that as mental measurement focuses most specifically and reliably on g, the observed black-white mean difference in cognitive ability gets larger.
At the same time, g or other broad measures of intelligence typically have relatively high levels of heritability.
This does not in itself demand a genetic explanation of the ethnic difference, but by asserting that “the better the test, the greater the ethnic difference,” Spearman’s hypothesis undercuts many of the environmental explanations of the difference that rely on the proposition (again, simplifying) that the apparent black-white difference is the result of bad tests, not good ones.

Arguments Against a Genetic Explanation

The ubiquitous Arthur Jensen has also published the clearest evidence that the disadvantaged environment of some blacks has depressed their test scores. He found that in black families in rural Georgia, the elder sibling typically has a lower IQ than the younger.
The larger the age difference is between the siblings, the larger is the difference in IQ. The implication is that something in the rural Georgia environment was depressing the scores of black children as they grew older.
In neither the white families of Georgia, nor white or black families in Berkeley, California, are there comparable signs of a depressive effect of the environment.

But demonstrating that environment can depress cognitive development does not prove that the entire B/W difference is environmental, and in this lies an asymmetry between the contending parties in the debate. Those who argue that genes might be implicated in group differences do not try to argue that genes explain everything. Those who argue against them (Leon Kamin and Richard Lewontin are the most prominent) typically deny that genes have anything to do with group differences, a much more ambitious proposition.

If one is to make this case against a genetic factor on psychometric grounds, the data supporting Spearman’s hypothesis must be confronted. There are two ways to do so: dispute the fact itself or grant the fact but argue that it does not mean what Jensen says it does.

The most searching debate about Spearman’s hypothesis was conducted in a journal that publishes both original scholarly works and commentaries on them, Behavioral and Brain Sciences, where, in two separate issues in the latter 1980s, thirty-six experts in the relevant fields commented on Jensen’s evidence.
A number of comments were favorable and provided further support for Jensen’s conclusion. Others were critical, for reasons that varied from the philosophical (research into such hurtful issues is not useful) to the highly technical (did Jensen’s results stem from varying reliabilities among the tests?). We summarize them in the notes, but the striking feature was that no commentator was able to dispute the empirical claim that the racial gap in cognitive performance scores tends to be larger on tests or activities that draw most on g.

Several years after the exchange on Spearman’s hypothesis in Behavioral and Brain Sciences, Jan-Eric Gustafsson presented data showing a considerably smaller correlation than Jensen and others had found between g loading and B/W differences on a group of subtests.
It is not clear why Gustafsson obtained these atypical results, but, as of this writing, they are still atypical. We have found no others for representative groups of blacks and whites. Our own appraisal of the situation is that Jensen’s main contentions regarding Spearman’s hypothesis are intact and constitute a major challenge to purely environmental explanations of the B/W difference.

Another approach has been taken by Jane Mercer, a sociologist and the developer of the System of Multicultural Pluralistic Assessment (SOMPA). Tests are artifacts of a culture, she argues, and a culture may not diffuse equally into every household and community. In a heterogeneous society, subcultures vary in ways that inevitably affect scores on IQ tests. Fewer books in the home means less exposure to the material that a vocabulary subtest measures; the varying ways of socializing children may influence whether a child acquires the skills, or a desire for the skills, that tests test; the “common knowledge” that tests supposedly draw on may not be common in certain households and neighborhoods.

So far, this sounds like a standard argument about cultural bias, and yet Mercer accepts the generalizations that we discussed earlier about internal evidence of bias.
She is not claiming that less exposure to books means that blacks score lower on vocabulary questions but do as well as whites on culture-free items. Rather, she argues, the effects of culture are more diffuse. Her argument may be seen as a variant of the “uniform background radiation” hypothesis that we discussed earlier.

Furthermore, she points out, strong correlations between home or community life and IQ scores are readily found. In a study of 180 Latino and 180 non-Latino white elementary school children in Riverside, California, Mercer examined eight sociocultural variables: (1) mother’s participation in formal organizations, (2) living in a segregated neighborhood, (3) home language level, (4) socioeconomic status based on occupation and education of head of household, (5) urbanization, (6) mother’s achievement values, (7) home ownership, and (8) intact biological family. She then showed that once these sociocultural variables were taken into account, the remaining correlation between ethnic group and IQ among the children fell to near zero.

The problem with this procedure lies in determining what, in fact, these eight variables control for: cultural diffusion, or genetic sources of variation in intelligence as ordinarily understood? Recall that we pointed out earlier that controlling for socioeconomic status typically reduces the B/W difference by about a third. To the extent that parental socioeconomic status is produced by parental IQ, controlling for socioeconomic status controls for parental IQ. One obvious criticism of SOMPA is that it broadens the scope of the control variables to such an extent that the procedure becomes meaningless. After the correlations between the eight sociocultural variables and IQ are, in effect, set to zero, little difference in IQ remains among her ethnic samples. But what does this mean? The obvious possibility is that Mercer has demonstrated only that parents matched on IQ will produce children with similar IQs—not a startling finding.

Mercer points out that the samples differ on the sociocultural variables even after controlling for IQ. The substantial remaining correlations indicate that “important amounts of the variance in sociocultural characteristics [are] unexplained by IQ,” evidence, she says, that they may be treated as substantially independent of IQ.
But they are, in fact, not independent of IQ. They remain correlated. Her basic conclusion that “there is no justification for ignoring sociocultural factors when interpreting between-group differences in IQ” seems to us unchallengeable.
In the next chapter, we will present other examples of ethnic differences in social behavior that persist after controlling for IQ. But to conclude that genetic differences are ruled out by her analysis is unwarranted, because she cannot demonstrate that a family’s sociocultural characteristics are independent of their IQ.

Scholars of Jensen’s school point to a number of other difficulties with Mercer’s interpretation. When she concludes that cultural diffusion explains the black-white difference, the data she uses show the familiar pattern of Spearman’s hypothesis: The more a test loads on g, the greater is the B/W difference.
Why should cultural diffusion manifest itself in such a patterned way? Her appeal to sociocultural factors does not explain why blacks score lower on backward digit span than forward; why in chronometric tests, black movement time is faster, but reaction time slower, than among whites; or why the B/W difference persists on nonverbal tests such as Raven’s Standard Progressive Matrices. Nor does it explain why, if the role of European white cultural diffusion (or the lack of it) is so important in depressing black test performance, it has been so unimportant for Asians.

A number of authors besides Mercer have advanced theories of cultural difference, often treated as part of the “cultural bias” argument but asserting in more sweeping fashion that cultures differ in ways that will be reflected in test scores. In the American context, Wade Boykin is one of the most prominent academic advocates of a distinctive black culture, arguing that nine interrelated dimensions put blacks at odds with the prevailing Eurocentric model. Among them are spirituality (blacks approach life as “essentially vitalistic rather than mechanistic, with the conviction that non-material forces influence people’s everyday lives”); a belief in the harmony between humankind and nature; an emphasis on the importance of movement, rhythm, music, and dance “which are taken as central to psychological health”; personal styles that he characterizes as “verve” (high levels of stimulation and energy) and “affect” (emphasis on emotions and expressiveness); and “social time perspective,” which he defines as “an orientation in which time is treated as passing through a social space rather than a material one.”
The notes reference a variety of other authors who have made similar arguments.
All, in different ways, purport to explain how large B/W differences in test scores could coexist with equal predictive validity of the test for such things as academic and job performance and yet still not be based on differences in “intelligence,” broadly defined, let alone genetic differences.

John Ogbu, a Berkeley anthropologist, has proposed a more specific version of this argument. He suggests that we look at the history of various minority groups to understand the sources of differing levels of intellectual attainment in America. He distinguishes three types of minorities: “autonomous minorities” such as the Amish, Jews, and Mormons, who, while they may be victims of discrimination, are still within the cultural mainstream; “immigrant minorities,” such as the Chinese, Filipinos, Japanese, and Koreans within the United States, who moved voluntarily to their new societies and, while they may begin in menial jobs, compare themselves favorably with their peers back in the home country; and, finally, “castelike minorities,” such as black Americans, who were involuntary immigrants or otherwise are consigned from birth to a distinctively lower place on the social ladder.
Ogbu argues that the differences in test scores are an outcome of this historical distinction, pointing to a number of castes around the world—the untouchables in India, the Buraku in Japan, and Oriental Jews in Israel—that have exhibited comparable problems in educational achievement despite being of the same racial group as the majority.

Indirect support for the proposition that the observed B/W difference could be the result of environmental factors is provided by the worldwide phenomenon of rising test scores.
We call it “the Flynn effect” because of psychologist James Flynn’s pivotal role in focusing attention on it, but the phenomenon itself was identified in the 1930s when testers began to notice that IQ scores often rose with every successive year after a test was first standardized. For example, when the Stanford-Binet IQ was restandardized in the mid-1930s, it was observed that individuals earned lower IQs on the new tests than they got on the Stanford-Binet that had been standardized in the mid-1910s; in other words, getting a score of 100 (the population average) was harder to do on the later test.
This meant that the average person could answer more items on the old test than the new test. Most of the change has been concentrated in the nonverbal portions of the tests.
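The restandardization effect can be made concrete with a little arithmetic. The sketch below assumes the conventional deviation-IQ scale (mean 100, standard deviation 15) and scores the same raw performance against two sets of norms. The function name `deviation_iq` and every number in it are hypothetical, chosen only for illustration; none is taken from the Stanford-Binet.

```python
def deviation_iq(raw_score, norm_mean, norm_sd):
    """Convert a raw score to a deviation IQ (mean 100, SD 15) against given norms."""
    return 100 + 15 * (raw_score - norm_mean) / norm_sd


# Hypothetical norms: the average raw score rises between standardizations.
old_norm_mean = 40.0
new_norm_mean = 44.0
norm_sd = 8.0  # assumed equal in both standardization samples

# A person who is exactly average for the later standardization sample.
raw = 44.0

iq_on_old_norms = deviation_iq(raw, old_norm_mean, norm_sd)
iq_on_new_norms = deviation_iq(raw, new_norm_mean, norm_sd)

print(iq_on_old_norms, iq_on_new_norms)
```

Under these assumed numbers, the same performance earns a higher IQ against the older norms than against the newer ones, which is the pattern the testers observed.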

The tendency for IQ scores to drift upward as a function of years since standardization has now been substantiated, primarily by Flynn, in many countries and on many IQ tests besides the Stanford-Binet.
In some countries, the upward drift since World War II has been as much as a point a year for some spans of years. The national averages have in fact changed by amounts that are comparable to the fifteen or so IQ points separating whites and blacks in America. To put it another way, on the average, whites today may differ in IQ from whites, say, two generations ago as much as whites today differ from blacks today. Given their size and speed, the shifts over time necessarily have been due more to changes in the environment than to changes in the genes.

The question then arises: Couldn’t the mean of blacks move 15 points as well through environmental changes? There seems no reason why not—but also no reason to believe that white and Asian means can be made to stand still while the Flynn effect works its magic.
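The arithmetic behind this point can be sketched briefly. The drift of about one point per year and the gap of about fifteen points come from the text; the starting means, the function name `project_mean`, and the assumption of constant linear drift are illustrative simplifications, not estimates.

```python
def project_mean(start_mean, points_per_year, years):
    """Project a group mean forward under a constant yearly drift (a simplification)."""
    return start_mean + points_per_year * years


GAP = 15.0  # approximate gap in IQ points cited in the text

# Years needed for a mean to rise by the size of the gap at one point per year.
years_to_cover_gap = GAP / 1.0

# Illustrative starting means, scored against a fixed set of norms.
higher_start = 100.0
lower_start = 85.0

# If both means drift upward at the same rate, the gap between them is unchanged.
same_rate_gap = (project_mean(higher_start, 1.0, 15)
                 - project_mean(lower_start, 1.0, 15))

# The gap narrows only if the lower mean rises faster than the higher one.
faster_rate_gap = (project_mean(higher_start, 0.5, 15)
                   - project_mean(lower_start, 1.0, 15))

print(years_to_cover_gap, same_rate_gap, faster_rate_gap)
```

The design choice here is deliberately minimal: a single linear projection is enough to show that a gap depends on the difference between drift rates, not on the size of the drift itself.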

