Figure 1. A linear (black) versus exponential (gray) curve.

As you might have realized, exponential growth is very rapid. Even if we are initially adding only a small amount to some quantity each hour or day, that quantity can become very big very quickly. Imagine we are given a penny and begin doubling it each day. After a week we would be receiving less than $1.50 a day. But give it one more week. Now we’re getting more than $80 a day. Within a month our allowance is more than $5 million a day!
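
To make the arithmetic concrete, here is a small Python sketch (mine, not the book's) that tracks the doubled penny; the milestone days are chosen to match the figures above:

```python
# A doubled penny: one cent on day 1, and the daily amount doubles each day.
amount = 0.01  # dollars received on day 1

for day in range(1, 31):
    if day in (7, 14, 30):  # roughly one week, two weeks, one month
        print(f"day {day}: ${amount:,.2f}")
    amount *= 2  # tomorrow's payment is twice today's
```

Day 7 comes to $0.64, day 14 to $81.92, and day 30 to about $5.4 million.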

Exponential growth gets its name from the use of an exponent: an exponent signifies how many times to multiply another number, the base, by itself. Many times a special constant is used for the base; in the case of exponential growth it is often e. Also known as Napier’s constant, it is about 2.72. It’s one of those numbers, like π, that crops up in the weirdest situations, from bacteria doubling to infinitely long sums of numbers. The exponent part of the equation includes what is known as its rate of growth. The larger this value, the faster the quantity grows, and the faster it doubles.
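
In symbols (standard notation, not taken from the book), a quantity N growing exponentially at rate r from a starting value N0 looks like this, and the doubling time falls out directly:

```latex
% Exponential growth: N_0 is the starting amount, r the growth rate, t the time.
%   N(t) = N_0 e^{r t}
% Setting N(t_d) = 2 N_0 gives the doubling time t_d:
%   e^{r t_d} = 2   =>   t_d = ln(2) / r  (about 0.693 / r)
\[
  N(t) = N_0 \, e^{r t}, \qquad t_d = \frac{\ln 2}{r}
\]
```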

.   .   .

THE exponential growth curve was well known to Price, so when he began to measure the heights of his stacks of journals, he knew immediately what was going on. But maybe he just happened to have gotten the only stack of journals that obeyed this curious pattern. So he began collecting lots of data, a research style that he followed throughout his life.

He measured the number of journal articles in the physics literature in general, as well as for more specialized fields, such as the subfield that deals with linear algebra. And they all seemed to have elements of the exponential curve. Price began to recognize that this could be a new way to think about how science grows and develops. Price published his findings, under the title “Quantitative Measures of the Development of Science,” in a small French journal in 1951, after presenting this work at a conference the previous year in Amsterdam.

No one was interested.

But Price wasn’t deterred. He returned to Cambridge and continued to pursue his research in this new field, the quantitative study of science, or scientometrics, as it soon became known. This science of science was still quite young, but Price set himself to collecting vast quantities of data to help him understand how science changes.

By the 1960s, he was the foremost authority in this field. He gathered data from all aspects of science and marshaled evidence that enabled him to look at scientific growth as something far from haphazard: it was subject to regular laws.

Expanding on his initial research on scientific journals, he gathered data for a wide variety of areas that displayed this growth, from chemistry to astronomy. Price calculated the doubling times—how long it takes for something to double, a proportional increase that implies exponential growth—for these components of science and technology, which then can be used as a rough metric for seeing how different types of facts change over time. Here is a selection of these doubling times from his 1963 book Little Science, Big Science:

Domain and doubling time (in years):

Number of entries in a dictionary of national biography: 100
Number of universities: 50
Number of important discoveries; number of chemical elements known; accuracy of instruments: 20
Number of scientific journals; number of chemical compounds known; memberships of scientific institutes: 15
Number of asteroids known; number of engineers in the United States: 10
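
To connect these numbers to the exponential curve described earlier, a doubling time can be converted into an equivalent yearly growth rate using r = ln(2) / t_d. Here is a small Python sketch (an illustration, not Price's own method) using a few of the table's entries:

```python
import math

# Convert a doubling time (in years) into an equivalent yearly growth rate,
# using t_d = ln(2) / r, so r = ln(2) / t_d.
doubling_times = {
    "scientific journals": 15,
    "universities": 50,
    "known asteroids": 10,
}

for name, t_d in doubling_times.items():
    r = math.log(2) / t_d                  # continuous growth rate per year
    yearly_pct = (math.exp(r) - 1) * 100   # equivalent percent increase per year
    print(f"{name}: doubles every {t_d} years, ~{yearly_pct:.1f}% growth per year")
```

A fifteen-year doubling time, for instance, works out to growth of a bit under 5 percent a year.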

The growth of facts was finally beginning to be subjected to the rigors of mathematics.

.   .   .

PARALLEL to Price’s work in the hard sciences, a similar line of research was proceeding in the social sciences. In 1947, a psychologist named Harvey Lehman published a curious little paper in the journal Social Forces. Combing through a wide variety of dictionaries, encyclopedias, and chronologies, Lehman set out to count the number of major contributions made in a wide variety of areas of study over the years. He looked at everything from genetics and math to the arts, whether new scientific findings, new theorems, or even new operas produced. What he found in all of these were exponential increases in output over time. But this wasn’t only over the previous few decades. Lehman looked at each of these areas over hundreds of years. He examined philosophy over the six hundred years from 1275 to 1875, botany over the three hundred years from 1600 to 1900, and geology over the four hundred years from 1500 to 1900.

Each area was found to have a characteristic rate of increase. Here are doubling times (the number of years it takes for the yearly contributions in these fields to double) from Lehman’s findings, along with a few more recent areas examined:

Field and doubling time (in years):

Medicine and hygiene: 87
Philosophy: 77
Mathematics: 63
Geology: 46
Entomology: 39
Chemistry: 35
Genetics: 32
Grand opera: 20

Independently, a number of thinkers were coming to the realization that the growth of knowledge was subject to patterns, and was far from random. Similarly, different types of growth fit different types of knowledge creation. For example, opera is a far faster-changing domain than the sciences. Even though science and opera composition are inherently creative, science is limited by what we can determine about nature. Science can develop only as quickly as we can figure out things about the world. Grand opera, however, is not limited by what is true, only by what is beautiful, and should therefore be able to grow more rapidly, since it doesn’t have to be rigorously subjected to experimentation.

In addition, we can see a hint of how more fundamental discoveries grow by comparing them to more derivative ones, which build on work done in other fields. For example, genetics and chemistry, two areas of the basic sciences, proceed at similar rates. On the other hand, medicine and hygiene are much slower, and are also areas that rely on these more basic fields for new discoveries. Perhaps this is a hint that more derivative fields move more slowly compared to the more basic areas of knowledge on which they depend.

Price’s and Lehman’s efforts showed that looking at how knowledge grows in a systematic way was finally possible, and they unleashed a wave of discoveries.

.   .   .

PRICE’S approach, looking at how science progresses by examining scientific articles and their properties, has proven to be the most successful and fastest-growing area of scientometrics. While scientific progress isn’t necessarily correlated with a single publication—some papers might have multiple discoveries, and others might simply be confirming something we already know—it is often a good unit of study.

Focusing on the scientific paper gives us many pieces of data to measure and study. We can look at the title and text and, using sophisticated algorithms from computational linguistics or text mining, determine the subject area. We can look at the authors themselves and create a web illustrating the interactions between scientists who write papers together. We can examine the affiliations of each of the authors and try to see which collaborations between individuals at different institutions are more effective. And we can comb through the papers’ citations, in order to get a sense of the research a paper is building upon.
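
As a small illustration of the co-authorship web mentioned above, here is a sketch (with invented author names and papers) that turns author lists into weighted links between collaborators:

```python
from collections import Counter
from itertools import combinations

# Hypothetical papers, each given as its list of authors.
papers = [
    ["Alvarez", "Chen"],
    ["Alvarez", "Okafor", "Chen"],
    ["Okafor", "Singh"],
]

# Each unordered pair of co-authors on a paper becomes an edge; the count
# is how many papers that pair wrote together.
edges = Counter()
for authors in papers:
    for pair in combinations(sorted(authors), 2):
        edges[pair] += 1

for (a, b), n in sorted(edges.items()):
    print(f"{a} -- {b}: {n} joint paper(s)")
```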

Examining science at the level of the publication can give us all manner of exciting results. A group of researchers at Harvard Medical School looked at tens of thousands of articles published by its scientists and mapped out the buildings on campus where they worked. Through this, they were able to look at the effect that distance has on collaboration. They found exactly what they had assumed but no one had actually measured: The closer two people are, the higher the impact of the research that results from that collaboration. They found that just being in the same building as your collaborators makes your work better.

We can also understand the impact of papers and the results within them by measuring how many other publications cite them. The more important a work is, the more likely it is to be referenced in many other papers, implying that it has had a certain foundational impact on the work that comes after it. While this is certainly an imperfect measure—you can cite a paper even if you disagree with it—much of the field of scientometrics is devoted to understanding the relationship between citations, scientific impact, and the importance of different scientists.

Using this sort of approach, scientometrics can even determine what types of teams yield research that has the highest impact. For example, a group of researchers at Northwestern University found that high-impact results are more likely to come from collaborative teams rather than from a single scientist. In other words, the days of the lone hero scientist, along the lines of an Einstein, are vanishing, and you can measure it.

Citations can also be used as building blocks for other metrics. By examining the average number of times articles in a given journal are cited, we can get what is known as the impact factor. This is widely used and carefully considered: Scientists want their papers to be published in journals with high impact factors, as a high impact factor both benefits their research and influences decisions such as funding and tenure. The journals with the highest impact factors have even penetrated the public consciousness—no doubt due to the highly cited individual papers within them—and include the general science publications such as Nature and Science, as well as high-profile medical journals such as the New England Journal of Medicine.
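
As a rough sketch of the idea (simplified: the official impact factor counts only citations in a given year to articles from the preceding two years, and the numbers here are invented), the average-citations calculation looks like this:

```python
# Citation counts for the articles of a hypothetical journal over the
# relevant window; the impact factor in the text's sense is just the mean.
citations_per_article = [0, 3, 12, 1, 7, 0, 25, 4]

impact_factor = sum(citations_per_article) / len(citations_per_article)
print(f"impact factor: {impact_factor:.2f}")  # 6.50 for these made-up counts
```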

Scientometrics has even given bragging tools to scientists, such as the h-index, which measures the impact of a scientist’s body of published work. It was created by Jorge Hirsch (and named after himself; notice the h) and essentially counts the number of articles a scientist has published that have been cited at least that many times. If you have an h-index value of 45, it means that you have forty-five articles that have each been cited at least forty-five times (though you have likely published many more articles that have been cited fewer times). It also has the side benefit of meaning that you are statistically more likely to be a member of the National Academy of Sciences, a prestigious U.S. scientific organization.
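
The calculation itself is simple enough to sketch in a few lines of Python (the citation counts below are invented for illustration):

```python
def h_index(citation_counts):
    """Largest h such that h of the papers have been cited at least h times each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# A hypothetical author's citation counts, one number per paper:
print(h_index([50, 42, 18, 10, 9, 3, 3, 1]))  # 5: five papers cited at least 5 times
```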

It shouldn’t be surprising that the field of scientometrics has simply exploded in the past half century. While Price and his colleagues labored by hand, tabulating citations manually and depending on teams of graduate students to do much of the thankless grunt work, we now have massive databases and computers that can take a difficult analysis project and do it much more easily. For example, the h-index is now calculated automatically by many scientific databases (including Google Scholar), something inconceivable in previous decades. Due to this capability, we now have scientometric results about nearly every aspect of how science is done. As we spend billions of dollars annually on research, and count on science to do such things as cure cancer and master space travel, we have the tools to begin to see what sorts of research actually work.

Scientometrics can demonstrate the relationship between money and research output. The National Science Foundation has examined how much money a university spends relative to how many articles its scientists publish. Other studies have looked at how age is related to science. For example, over the past decades, the age at which scientists receive grants from the National Institutes of Health has increased, causing a certain amount of concern among younger scientists.

There’s even research that examines how being a mensch is related to scientific productivity. For example, in the 1960s, Harriet Zuckerman, a sociologist of science—someone who studies the interactions and people underlying the entire scientific venture—decided to study the scientific output of Nobel laureates to see if any patterns could be found in how they work that might distinguish them from their less successful peers. One striking finding was the beneficence of Nobel laureates, or as Zuckerman termed it, noblesse oblige. In general, when a scientific paper is published, the author who did the most is listed first. There are exceptions to this, and this can vary from field to field, but Zuckerman took it as a useful rule of thumb. What she found was that Nobel laureates are first authors of numerous publications early in their careers, but quickly begin to give their junior colleagues first authorship. And this happens well before they receive the Nobel Prize.
