The Half-Life of Facts

Authors: Samuel Arbesman

Simkin and Roychowdhury conclude, using some elegant math, that only about 20 percent of scientists who cite an article have actually read that paper. This means that four out of five scientists never take the time to track down a publication they intend to use to buttress their arguments. By examining these mutations we can trace these errors backward in time, and understand how knowledge truly spread from scientist to scientist, instead of how it appeared to spread.

We can even see the spread of such misinformation in a somewhat more lighthearted context. If you had an e-mail address in the late 1990s, you were likely the recipient of a letter that looked something like this:

This is for anyone who thinks NPR/PBS is a worthwhile expenditure of $1.12/year of their taxes…. A petition follows. If you sign, please forward it on to others. If not, please don’t kill it—send it to the e-mail address listed here:
[email protected]. PBS, NPR (National Public Radio), and the arts are facing major cutbacks in funding.

In case it isn’t immediately clear, this is a chain letter. It is one of the less insidious types, as some are much more overtly hoaxes that promise good luck if you spread it, bad luck (or death!) if you ignore it, and the like. While certainly not an accurate piece of knowledge, these types of letters circulate for a very long time.

Once again, we can use letters to understand how errors spread by examining how they circulate before dying out. This was the question that David Liben-Nowell and Jon Kleinberg, both computer scientists, set out to answer.

Liben-Nowell and Kleinberg compiled a massive collection of different versions of the chain letter shown above, as well as a petition purporting to organize opposition to the Iraq war (neither of these letters was entirely factual, and both had their roots in hoaxes). Through a Web site, they asked for volunteers to search their e-mail archives for variations of them. In their database, they found all the hallmarks of biological-style textual mutation. From their paper:

Some recipients reordered the list of names on their copy of the letter in ways closely analogous to the kinds of chromosomal rearrangements one finds due to sequence mutation events in biological settings. We observed examples of point mutations (in some petition copies, names were replaced by the names of political figures), insertion/deletion events (there were a number of small blocks of 1–5 names that were present in the middle of the list in some petition copies and absent in other copies), duplication events (blocks of 2–20 names that were duplicated in some petition copies, sometimes immediately adjacent within the list and sometimes hundreds of names later), block rearrangements (in one petition, two pairs of blocks of 2–3 names were swapped relative to their position in all other copies that contained the same names), and one hybridization event (the names at the ends of two copies of the petition were intermingled after their common prefix in a third copy).

But just as these mutations and errors can be used to understand how knowledge spreads, another element of these letters can be used to explore the branching and spreading of information. Using the signatories (the people who spread it), which are included in the data, we can see how false knowledge makes its way through a population. By looking at how the letters accumulated signatures, the researchers were able to trace their spread and see who sent the letters to whom.

And they found something that contradicts our intuition about social networks. While we are embedded in highly clustered social circles, ones that also have the property of connecting everyone within a handful of hops (that six degrees of separation again), the spread of these chain letters does not have the feel of an epidemic. Rather than the letters spreading to hundreds of individuals, who in turn each spread it to hundreds of additional recipients themselves, and so forth, it was much more tame. They only spread successfully to one or two people at each step. So while they spread for a long time, slowly burning through a tiny sample of the population, they didn’t create any sort of rapid, massive conflagration.

This can be a good thing. While a fact or, more important, an incorrect fact—whether the iron content of spinach or the proper scientific name of a long-necked dinosaur—might be able to percolate through a population, and slowly weave its way through a group, it won’t necessarily spread widely. The downside, though, is that it might linger. It can hop from person to person, lasting far longer than we might expect, even if it only affects a tiny subset of the group.

Happily, it is often the case that credible information or news spreads faster and wider than what is false. But no matter the speed with which an error takes hold, rooting it out can be a very difficult process, as it’s hard in everyday life to trace the error back to its source and disabuse each person at every step of their wrong information.

.   .   .

Facts do not spread instantaneously, even with modern technology. They weave their way through social networks in mathematically predictable ways. Along the way they can also mutate and become filled with errors, again in a reliable manner. Errors can continue to spread, lasting much longer than we might realize. Soon enough, the knowledge in a single area is filled with facts but also with the ejecta from a single burst of errata, making it difficult to know what is true.

Luckily, there is a simple remedy: Be critical before spreading information and examine it to see what is true. Too often the source of an error is not knowing where one’s facts came from and whether they are well-founded at all. We often just take things on faith.

The modern origins of empirical scientific knowledge lie in the sixteenth and seventeenth centuries. This time period, known as the Scientific Revolution, saw advances such as Newton’s theory of gravitation, Boyle’s gas laws, Hooke’s recognition that all living things are made of cells, and the beginnings of the Royal Society—a scientific group that exists to this day. The spirit that infused this time period brought forth a whole host of new knowledge, and the disproving of facts that had existed for centuries, if not millennia. The Scientific Revolution has made the swift changes in modern-day knowledge possible.

But some of the most important components of this endeavor were to try to eliminate errors and create a means of spreading correct facts. Many of the papers presented in the early years at the Royal Society were devoted to trying to understand errors, to root out misunderstandings, or to test the veracity of tales told to them that often seemed too good to be true. For example, here is a characteristically wordy title of a paper published in 1753 in the Philosophical Transactions of the Royal Society: “Experimental Examination of a White Metallic Substance Said to Be Found in the Gold Mines of the Spanish West-Indies, and There Known by the Appellations of Platina, Platina di Pinto, Juan Blanca.” No doubt some man of science had heard of this mysterious white metallic substance from these gold mines and its properties (it appears to be platinum) and felt it important to examine it.

They tried to test anything they heard and to eliminate the errors in it, however long those errors had persisted. Most important, they didn’t keep this new knowledge secret. They spread it far and wide, publishing it and disseminating it through the loose network of natural philosophers of Europe.

What you know depends on its being knowable to you specifically, on its having been spread to you. As we’ve seen, this spread relies on social networks, and sometimes on the all-too-human tendency to corrupt information as it spreads. But as long as we remain true to the spirit of the Scientific Revolution, by not taking things on faith and by spreading true facts, we are far from being overwhelmed with error.

But even with the massive advances in technology and our ability to disseminate knowledge (whose modern origins are found in Gutenberg’s Mainz), facts sometimes don’t spread as far as they should. Therein lies the curious situation of hidden knowledge.

CHAPTER 6
Hidden Knowledge

My father, Harvey Arbesman, is a dermatologist and an epidemiologist. He spends about half of his time seeing patients, diagnosing and treating skin cancer, and the other half doing research. As a researcher, he is fond of the unexpected hypothesis and the counterintuitive concept. This has led him to publish on such topics as whether malignant melanomas are associated with the increased use of antibacterial soaps and whether dairy consumption is related to acne. Essentially, he is drawn to the tough challenge. This research style led him to InnoCentive.

Alpheus Bingham was the vice president of research and development strategy at the pharmaceutical company Eli Lilly when he began thinking about experts and how they solve problems. He realized that while an expert might solve a hard problem 20 percent of the time, simply giving it to five experts won’t always yield results. There’s a good chance that all the experts will fail.

But what if this pool of people was made much wider? Perhaps, Bingham argued, there was a “long tail of expertise” (his term, not mine) of lots of people who are all interested in solving a technical problem but each of whom has a very small chance of success. Using this logic, as long as you get a really large group there’s a decent chance that the problem will be solved. The math sounds like it should work out, but would it really work in practice?
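The arithmetic behind this argument can be sketched in a few lines. This is only an illustration, not Bingham's own calculation: it assumes every solver works independently, and the 1 percent success rate and the crowd of 500 are made-up numbers chosen to stand in for a "long tail." Only the 20 percent figure and the five experts come from the text above.

```python
def chance_at_least_one_succeeds(p_single: float, n_solvers: int) -> float:
    """Probability that at least one of n solvers cracks the problem.

    Assumes each solver succeeds independently with the same probability,
    which is a simplification made for this sketch.
    """
    chance_all_fail = (1.0 - p_single) ** n_solvers
    return 1.0 - chance_all_fail


# Five experts, each with a 20 percent chance (figures from the text).
five_experts = chance_at_least_one_succeeds(0.20, 5)

# A hypothetical long tail: 500 solvers, each with only a 1 percent chance.
long_tail = chance_at_least_one_succeeds(0.01, 500)

print(f"five experts: {five_experts:.3f}")  # 0.672, so all five fail about a third of the time
print(f"long tail:    {long_tail:.3f}")
```

Under these assumptions the five experts all fail roughly one time in three, while the much larger pool of individually unlikely solvers almost always produces at least one success. Real solvers are not independent, of course, so the numbers are best read as the shape of the argument rather than a forecast.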

Bingham, with the support of Eli Lilly, created a separate company called InnoCentive, which is designed to test this hypothesis. InnoCentive acts as a clearinghouse between organizations or companies that have problems and solvers—those people from all areas of life who are interested in solving problems and can work better in the aggregate than the experts.

Bingham’s intuition was right: InnoCentive works. It works because it draws on solutions and insights from different fields. Often the solver is involved in a technical discipline that is near the area of the problem but just different enough to be distinct. With this distinctiveness comes the potential for informational import and export. A fact or solution might be well-known in one area, but it is still an entirely open question in the other. This allows people who might not be experts to bring what they know in their field and apply it to other areas. A sort of fact recombination—where ideas are brought together in new ways—is often the way that problems are solved at InnoCentive.

For example, when Roche brought a problem that it had been working on for fifteen years, the crowd recapitulated all the possible solutions that the company had already tried, and in only sixty days. But even better, there was an actual working solution among the proposals, something the company had failed to find. When NASA used InnoCentive, they quickly got the answer to a problem that had been bothering them for thirty years! Instead of working in the same area for a quarter of a century, you can open up the question to a larger group and get an answer from an unexpected source. And more important, an unexpected field.

So my father took a look at InnoCentive to see if it had any interesting problems that he could try to solve. While it was originally populated by engineers, chemists, and other technically minded individuals, it is rapidly expanding and broadening its focus, including into the life sciences, so my father felt at home on the Web site.

While sifting through InnoCentive’s e-mail challenges, my father came across one that intrigued him. The Prize4Life Foundation, a nonprofit organization devoted to curing and treating amyotrophic lateral sclerosis (ALS), also known as Lou Gehrig’s disease, was offering a prize for a stepping-stone toward an eventual cure.

Curing ALS is hard. Rather than tackling the problem wholesale, since the final solution is essentially unknowable, Prize4Life broke down its goals into smaller tasks. One of the first major problems they wanted to solve was the creation of something known as a biomarker, a way of measuring the progression of the disease. For many diseases, even those for which there is no known cure, there are clear ways of measuring how far along the disease is within an individual. For example, in HIV research, one biomarker used is the level of cytokines, small molecules found in the immune system. However, there weren’t any known biomarkers for ALS based on chemicals or anything internal to the patient. The only way to determine the progression of this disease used some established ways of measuring what the patient was still able to do, and these correlated with how long the patient had left to live.

Prize4Life’s InnoCentive challenge sought promising hypotheses for potential, noninvasive biomarkers, with a prize of $15,000. The organization also created another challenge, which offered a prize of $1 million if the effectiveness of a biomarker could actually be demonstrated through testing in patients.

My father is not a neurologist, nor does he have any specialized knowledge in the area of neurodegenerative diseases. However, he is an expert in something else: undiscovered public knowledge.

In the mid-eighties, a professor of library science named Don Swanson realized that, for all of our presumed prowess at organizing knowledge, we were falling profoundly short. While we had made great strides from the days of Linnaeus and his taxonomy of living things, as massive amounts of information became digitized, we were being confronted with a filtering problem.

Remember, this was the middle of the 1980s. The first graphical Web browser was nearly a decade away. But Swanson presciently realized that sifting through information successfully is a far-from-trivial task, and even if we navigate all that we find, there was still much knowledge that was being ignored.

While technological knowledge constantly increases, as shown in chapter 4, in other areas knowledge sometimes churns, as when scientific facts are overturned. Sometimes what we know can suffer a regress, such as when the library of Alexandria was destroyed. Similarly, there is often knowledge that is in fact hidden in plain sight, yet to be discovered and used. These findings, which Swanson termed undiscovered public knowledge, are extreme special cases of those examples from chapter 5, in which knowledge is not spread far and wide.
