The Half-Life of Facts
by Samuel Arbesman

.   .   .

EVERYONE
recognizes the periodic table. The gridlike organization of all the known chemical elements contains a wealth of information. Each square itself holds a great many facts. For each element, we get its chemical symbol, its full name, and its atomic number and weight.

What exactly are these last two? Atomic number is simple: It is the number of protons in the nucleus of an atom of that element and, equally, the number of electrons that surround the nucleus. Since the number of electrons largely dictates how an atom interacts with other atoms, knowing the atomic number gives a chemist a reasonably good sense of an element’s chemical properties quite quickly.

The atomic weight, however, is a bit more tricky. When I was younger, in grade school, I learned that the atomic weight is the sum of the number of protons in the nucleus and the number of neutrons in a “normal” nucleus. However, we were also taught that the number of neutrons in each atom can vary. If an element, as defined by the number of protons, can have different numbers of neutrons in its nucleus, these different versions are known as
isotopes
.

Hydrogen as we normally think of it, with its lone proton and no neutrons, is the common isotope. However, there’s another version, with one proton and one neutron, known as deuterium. If you make water using deuterium, it’s known as heavy water, because the hydrogen in it is heavier than normal. Ice cubes of heavy water will actually sink in regular water.

So, really, what the atomic weight describes is something a good deal more complex than what I was told when I was young. The atomic weight is a weighted average of the masses of all of an element’s isotopes, weighted by how common each isotope is in nature. So for element X, if there are only two isotopes, we take their relative frequency in the world and weight their masses accordingly. In doing so, the atomic weight yields the expected mass of an atom if you were to take a chunk of that element out of the Earth, with all its isotopes neatly mixed together.
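The arithmetic behind an atomic weight is just a weighted average. As a sketch, here is the calculation for chlorine, whose two stable isotopes have roughly these masses and natural abundances (the exact figures vary slightly by source, which is precisely the book’s point):

```python
# Atomic weight as an abundance-weighted average of isotope masses.
# Chlorine's two stable isotopes (approximate values):
isotopes = [
    (34.969, 0.7576),  # Cl-35: mass in atomic mass units, fraction found in nature
    (36.966, 0.2424),  # Cl-37
]

atomic_weight = sum(mass * abundance for mass, abundance in isotopes)
print(round(atomic_weight, 2))  # close to the tabulated value of about 35.45
```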

For a long time, these atomic weights were taken as constant. They were first calculated more than one hundred years ago and propagated in periodic tables around the world, with the occasional updates to account for what was assumed to be more precise measurement. But it turns out that atomic weights vary. Which country a sample is taken from, or even what type of water the element is found in, can give a different isotope mixture.

Now that more precise measurements of the frequency of isotopes are possible, atomic weights are no longer viewed as constant. The International Union of Pure and Applied Chemistry recently acknowledged this state of the world, alongside our increased ability to note small variations, and scrapped overly precise atomic weights; it now gives ranges rather than single numbers, although many periodic tables have yet to adopt them.

Through increases in measurement precision, what were once thought to be infinitely accurate constants are now far fuzzier facts, just like the height of Mount Everest. But measurement’s role is not only in determining amounts or heights. Measurement, along with its sibling, error, is an important factor throughout the scientific process, whenever we are trying to test whether a hypothesis is true. Scientific knowledge is dependent on measurement.

.   .   .

IF
you ever delve a bit below the surface when reading about a scientific result, you will often bump into the term
p-value
. P-values are an integral part of determining how new knowledge is created. More important, they give us a way of estimating the possibility of error.

Anytime a scientist tries to discover something new or validate an exciting and novel hypothesis, she tests it against something else. Specifically, our scientist tests it against a version of the world where the hypothesis is not true. This state of the world, where our intriguing hypothesis is false and everything we see is just as boring as we pessimistically expect, is known as the
null hypothesis
. Whether the world conforms to our exciting hypothesis or not can be determined by p-values.

Let’s use an example. Imagine we think that a certain form of a gene—let’s call it L—is more often found in left-handed people than in right-handed people, and is therefore associated with left-handedness. To test this, we gather up one hundred people—fifty left-handers and fifty right-handers—and test them for L.

What do we find? We find that thirty of the fifty left-handers have the genetic marker, while only twenty-two right-handers have it. In the face of this, it seems that we found exactly what we expected: left-handers are more likely to have L than right-handers. But is that really so?

The science of statistics is designed to answer this question by asking it in a more precise fashion: What is the chance that there actually is an equal frequency of left-handers with L and right-handers with L, but we simply happened to get an uneven batch? We know that when flipping a coin ten times, we don’t necessarily get exactly five heads and five tails. The same is true in the null hypothesis scenario for our L experiment.

Enter p-values. Using statistical analysis, we can reduce this complicated question to a single number: the p-value. It tells us how likely we would be to see a result at least as extreme as ours if chance alone were at work and there were no real difference to find.

For example, using certain assumptions, we can calculate the p-value for the above results: 0.16, or 16 percent. This means there is about a one in six chance that the result is simply due to sampling variation (a few more L-carrying left-handers and a few fewer L-carrying right-handers than we would expect if the two groups had equal frequency).
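The book doesn’t say which statistical test produced the 0.16, so as a sketch, here is one standard way to get a number like it: a permutation test that replays the null hypothesis many times and counts how often chance alone produces a split at least as uneven as 30 versus 22.

```python
import random

random.seed(0)

# Monte Carlo version of the null hypothesis: 52 carriers of L among
# 100 people, dealt at random into 50 "left-handers" and 50 "right-handers".
# How often is the split at least as uneven as the observed 30 vs. 22?
observed_gap = abs(30 - 22)
trials = 100_000
extreme = 0
people = [1] * 52 + [0] * 48  # 1 = carries the L marker

for _ in range(trials):
    random.shuffle(people)
    left = sum(people[:50])   # carriers dealt to the left-handed group
    right = 52 - left
    if abs(left - right) >= observed_gap:
        extreme += 1

p_value = extreme / trials
print(p_value)  # lands near the chapter's 0.16
```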

On the other hand, imagine we had gathered a much larger group with the same fractions: out of 500 left-handers, 300 carried L, while out of 500 right-handers, only 220 did. If we ran the exact same test, we would get a much lower p-value, now less than 0.0001. That means there is less than a one-hundredth-of-one-percent chance that the differences are due to chance alone. The larger the sample, the better we can test our questions; the smaller the p-value, the more robust our findings.
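For the larger sample, a textbook two-proportion z-test (one reasonable choice; again, the book doesn’t name its method) shows how the same fractions drive the p-value down as the sample grows:

```python
import math

def two_proportion_p(success_a, n_a, success_b, n_b):
    """Two-sided p-value for a difference in proportions (normal approximation).

    A standard pooled z-test. For small samples an exact test is more
    appropriate, which is why the n=50 figure below differs a bit from
    the chapter's 0.16.
    """
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs(p_a - p_b) / se
    return math.erfc(z / math.sqrt(2))  # two-sided normal tail probability

print(two_proportion_p(30, 50, 22, 50))      # roughly 0.11 under this approximation
print(two_proportion_p(300, 500, 220, 500))  # far below 0.0001
```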

But to publish a result in a scientific journal, you don’t need a minuscule p-value. In general, you need a p-value below 0.05 or, sometimes, 0.01. At the 0.05 threshold, that means a result as extreme as the one being reported would arise one time in twenty even when there is no real effect!

Cartoonist Randall Munroe, in his webcomic xkcd, illustrated some of the failings of this threshold for scientific publication: The comic shows scientists testing whether jelly beans cause acne. After they find no link, someone recommends testing each color individually. Color after color, from salmon to orange, turns up no relation to acne, except one: green jelly beans are found to be linked to acne, with a p-value below 0.05. But how many colors were examined? Twenty. And yet explaining that this might be due to chance does little to stop the headline declaring jelly beans linked to acne. John Maynard Smith, a renowned evolutionary biologist, once pithily summarized this approach: “Statistics is the science that lets you do twenty experiments a year and publish one false result in
Nature
.”
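The jelly-bean trap is easy to quantify. If each color is an independent test of a hypothesis that is actually false, run at the conventional 0.05 threshold, the chance of at least one spurious “discovery” is:

```python
# Twenty independent tests of hypotheses that are all actually false,
# each at the conventional 0.05 significance threshold: how likely is
# at least one spurious "significant" result?
alpha = 0.05
n_tests = 20

p_at_least_one = 1 - (1 - alpha) ** n_tests
print(round(p_at_least_one, 2))  # about 0.64: better-than-even odds
```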

Our ability to measure the world and extrapolate facts from it is intimately tied to chances of error, and the scientific process is full of different ways that measurement errors can creep in. One of these, where p-values play a role, is when a fact “declines” (and sometimes even vanishes) as several different scientists try to examine the same question.

.   .   .

IN
the late nineteenth and early twentieth centuries, astronomers obsessed over a question that was of great importance to the solar system: the existence of Planet X.

Within a few decades of the discovery of the planet Uranus in 1781, the first planet to be discovered in modern times, a number of oddities were noticed in its orbit. Specifically, it deviated from its predicted orbital path a good deal more than measurement error could explain. Scientists realized that something was affecting its orbit, and this led to the prediction and discovery of the planet Neptune in 1846.

This predictable discovery of a new planet was a great cause for celebration: The power of science and mathematics, and ultimately the human mind, was vindicated in a spectacular way. A component of the cosmos—complete with predicted location and magnitude—was inferred through sheer intellect.

Naturally, this process cried out for repeating. If Neptune too might have orbital irregularities, perhaps this would be indicative of a planet beyond the orbit of Neptune. And so, beginning in the mid-nineteenth century, careful measurements were made in order to predict
the location and properties of what would eventually become known as Planet X.

But Planet X was a slippery thing. By the time Pluto was discovered (the presumed heir to the Planet X title, though the two turned out to be distinct), the mass of Planet X had been calculated no fewer than four times. As the estimating continued, based on aberrations in Neptune’s orbit, the predicted planet kept getting smaller.

The first estimate, in 1901, put Planet X at nine times the mass of the Earth. By 1909, it was down to only five times. By 1919, Planet X was expected to be just twice the Earth’s mass.

Of course, we now know that Pluto, as mentioned earlier, is small compared to these estimates. While it’s not likely to evaporate anytime soon, whatever Dessler and Russell might say, Pluto is far smaller than the expected Planet X.

So, even after Pluto’s discovery, astronomers continued to examine the unexplained properties of Neptune’s and Uranus’s orbits, each time concluding that Planet X didn’t have to be as large as previously thought. Now the consensus seems to be that Planet X does not exist at all. Thanks to the Voyager missions we have a much better handle on Neptune’s mass, and between that and other increasingly precise measurements, Planet X is no longer needed to account for what was previously measured.

Unlike Pluto, which won’t actually vanish, it seems that Planet X already has.

Such a story in the physical sciences, where certain effects and unexplained phenomena shrink over time, is rare. But in biology, the social sciences, or medicine, where measurements are not always as clean and the results are often much noisier (due to messy factors such as human behavior), this problem is far more common. It’s known as the
decline effect.
In some situations, repeated examination of an effect or a phenomenon yields results that decrease in magnitude over time. In addition to facts themselves having a half-life, the decline effect states that facts can sometimes decay in their impact or their magnitude.

While some have made this out to be somewhat mysterious, it needn’t be, as the example of Planet X shows. Increasingly precise measurement often allows us to be more accurate in what we are looking for, and these improvements frequently dial the effects downward.

But the decline effect is not only due to measurement. One other factor involves the dissemination of measurements, and it is known as
publication bias
. Publication bias is the idea that the collective scientific community and the community at large only know what has been published. If there is any sort of systematic bias in what is being published (and therefore publicly measured), then we might only be seeing some of the picture.

The clearest example of this is in the world of negative results. If you recall, John Maynard Smith noted that “statistics is the science that lets you do twenty experiments a year and publish one false result in
Nature
.” However, if it were one experiment being replicated by twenty separate scientists, nineteen of these would be a bust, with nineteen careers unable to move forward. Annoying, certainly, and it might feel as if it were a waste of these scientists’ time, but that’s how science operates. Most ideas and experiments are unsuccessful. Scientists recognize that the ups and downs are an inherent part of the game. But crucially, unsuccessful results are rarely published.

However, for that one scientist who received an erroneous result (and its associated wonderfully low p-value), there is a great deal of excitement. Through no fault of his own—it’s due to statistics—he has found an intriguing phenomenon, and quite happily publishes it.

However, if the experiment should never have been counted a success in the first place, someone who tries to replicate his work is likely to find a smaller effect, or no effect at all.
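A toy simulation makes this shrinkage concrete. Assume (all numbers here are hypothetical) a small real effect, many labs measuring it noisily, and a journal that publishes only “significant” estimates; the published record then overstates the truth, and replications regress back down toward it:

```python
import random
import statistics

random.seed(1)

# Toy model of publication bias: every lab measures the same true effect,
# but only "significant" estimates (z > 1.96) get published.
true_effect = 0.2   # hypothetical small real effect, in standard-deviation units
n_per_study = 20
n_labs = 10_000

published = []
for _ in range(n_labs):
    # each lab's estimate: the mean of n noisy measurements
    estimate = statistics.fmean(random.gauss(true_effect, 1.0)
                                for _ in range(n_per_study))
    se = 1.0 / n_per_study ** 0.5
    if estimate / se > 1.96:          # "significant", so it gets published
        published.append(estimate)

print(statistics.fmean(published))    # well above the true 0.2: inflated estimates
print(len(published) / n_labs)        # most labs' results go in the file drawer
```

Because only estimates that cleared the significance bar appear in print, the average published effect is guaranteed to exceed the true one, even though every lab did honest work.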

Such is the way of science. A man named John Ioannidis is one of the people who has delved even deeper into the soft underbelly of science, in order to learn more about the relationship between measurement and error.

.   .   .

JOHN
Ioannidis is a Greek physician and professor at the University of Ioannina School of Medicine, and he is obsessed with understanding the failings and the more human properties of the scientific process. Rather than looking at anecdotal examples, such as the case of Pluto, he aggregates many cases together in order to paint a clearer picture of how we learn new things in science. He has studied the decline effect himself, finding its consistent presence within the medical literature. He has found that for highly cited clinical trials, a nontrivial number of initially large and significant effects are later shown to be smaller, or to vanish entirely.

Looking within the medical literature over a period of nearly fifteen years, Ioannidis examined the most highly cited studies. Of the forty-five papers he examined, seven (over 15 percent) had initially reported effects that later proved too high, and another seven were contradicted outright by later research. In addition, nearly a quarter were never retested at all, meaning there could be many more false results in the literature; since no one has tested them, we don’t know.
