Statistics can go wrong for many different reasons. First, a simple, honest error could have taken place. This can happen to anyone, right?
Other times, the error is a little more than a simple, honest mistake. In the heat of the moment, because someone feels strongly about a cause and because the numbers don't quite bear out the point that the researcher wants to make, statistics get tweaked, or, more commonly, they get exaggerated, either in terms of their values or in the way they're represented and discussed.
Finally, you may encounter situations in which the numbers have been completely fabricated and could not be repeatable by anyone because the results never happened. This is the worst-case scenario, and it does happen in the real world.
This section gives you tips to help you spot errors, exaggerations, and lies, along with some examples of each type of error that you, as an information consumer, may encounter.
The first thing you want to do when you come upon a statistic or the result of a statistical study is to ask the question, "Is this number correct?" Don't assume that it is! You may be surprised at the number of simple arithmetic errors that occur when statistics are collected, summarized, reported, or interpreted. Keep in mind that another type of error is an error of omission: information that is missing that would have made a big difference in getting a handle on the real story behind the numbers. That makes the issue of correctness difficult to address, because you're lacking information to go on.
Tip: To spot arithmetic errors or omissions in statistics, start with the quick checks that follow.
Many statistics break results down into groups, showing the percentage of people in each group who responded in a certain way regarding a particular question or demographic factor (such as age, gender, and so on). That's an effective way to report statistics, as long as all the percentages add up to 100%.
For example, USA Today reported the results of an opinion research study done for Tupperware regarding microwaving leftovers. The story reported that 28% of the people surveyed said they microwave leftovers almost daily, 43% said they microwave leftovers two to four times a week, and 15% said they do it once a week. Assuming that everyone should fit into these results somewhere, the percentages should add up to 100, or close to it. Quickly checking, the total is 28% + 43% + 15% = 86%. What happened to the other 14%? Who was left out? Where do they fall in the mix? No one knows. The statistics just aren't adding up.
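The add-it-up check is easy to automate. Here is a minimal sketch in Python using the three percentages quoted in the story; the names `reported` and `unaccounted_share` are made up for this illustration and don't come from the study.

```python
# Shares reported in the story, in percent (taken from the text above).
reported = {
    "almost daily": 28,
    "two to four times a week": 43,
    "once a week": 15,
}


def unaccounted_share(shares, expected_total=100):
    """Return how many percentage points are missing from the expected total."""
    return expected_total - sum(shares.values())


total = sum(reported.values())
missing = unaccounted_share(reported)
print(f"Reported total: {total}%, unaccounted for: {missing}%")
```

In practice you may want to allow a point or two of slack, because rounded percentages don't always total exactly 100.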
Another item you can check quickly is whether the total number of respondents is given. As a quick example, you may remember the Trident gum commercials that said, "Four out of five dentists surveyed recommend Trident gum for their patients who chew gum." This commercial is quite a few years old, but recently it has been revived in a funny series of new commercials asking, "What happened to that fifth dentist?" and then showing some incidents that might have happened to the fifth dentist that stopped him or her from pushing the "yes" button. But here is the real question: How many dentists were really surveyed? You don't know, because the commercial doesn't tell you. You can't even check the fine print, because in the case of this type of advertising, none is required.
Why would knowing the total number of respondents make a difference? Because the reliability of a statistic depends, in part, on the amount of information that went into it (as long as that information was good and correct). When the advertisers say "four out of five dentists," there may have actually been only five dentists surveyed. Or maybe 5,000 were surveyed, and in that case, 4,000 of them recommended the gum. The point is, you don't know how many dentists actually recommended the gum unless you do more detective work to find out. In most cases, the burden is on you, the consumer, to find that information. Unless you know the total number of people who took part in the study, you can't get any perspective on how reliable this information could be.
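One rough way to see why the number of respondents matters is the standard error of a proportion, a measure this section doesn't introduce, so treat the sketch below as an assumption brought in for illustration. It uses the textbook approximation sqrt(p(1 − p) / n), which is crude for a sample as small as five but still shows the direction of the effect. The function name `standard_error` is hypothetical.

```python
import math


def standard_error(p, n):
    """Approximate standard error of a sample proportion p based on n respondents."""
    return math.sqrt(p * (1 - p) / n)


p = 4 / 5  # "four out of five" dentists
for n in (5, 5000):
    se = standard_error(p, n)
    print(f"n = {n:>4}: proportion = {p:.0%}, standard error is about {se:.3f}")
```

The same 80% is far shakier when it rests on 5 dentists than when it rests on 5,000, which is why the missing total matters.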
Even when you uncover an error in a statistic, you may not be able to determine whether the error was an honest mistake, or if someone with an agenda was conveniently stretching the truth. By far the most common abuse of statistics is a subtle, yet effective, exaggeration of the truth. Even when the math checks out, the underlying statistics themselves can be misleading; they could be unfair, stretch the truth, or exaggerate the facts. Misleading statistics are harder to pinpoint than simple math errors, but they can have a huge impact on society, and, unfortunately, they occur all the time.
When spotting misleading statistics, you want to question the type of statistic used. Is it fair? Is it appropriate? Does it even make practical sense? If you're worried only about whether the numbers add up or that the calculations were correct, you could be missing a bigger error in that the statistic itself is measuring the wrong characteristic.
Crime statistics are a great example of how statistics are used to show two sides of a story, only one of which is really correct. Crime is often discussed in political debates, with one candidate (usually the incumbent) arguing that crime has gone down during his or her tenure, and the challenger often arguing that crime has gone up (giving the challenger something to criticize the incumbent for). How can two political candidates talk about crime going in two different directions? Assuming that the math is correct, how can this happen? Well, depending on the way that you measure crime, it would be possible to get either result.
Table 2-1 shows the number of crimes in the United States reported by the FBI from 1987 to 1997.
Year | Number of Crimes |
---|---|
1987 | 13,508,700 |
1988 | 13,923,100 |
1989 | 14,251,400 |
1990 | 14,475,600 |
1991 | 14,872,900 |
1992 | 14,438,200 |
1993 | 14,144,800 |
1994 | 13,989,500 |
1995 | 13,862,700 |
1996 | 13,493,900 |
1997 | 13,175,100 |
Is crime going up or down? It appears to be moving down in general, but you could look at these data in different ways and present these numbers in ways that make the trend look different. The big question is, do these data tell the whole story?
For example, compare 1987 to 1993. In 1987 an estimated 13,508,700 crimes took place in the United States, and in 1993, the total number of crimes was 14,144,800. It looks like crime went up during those six years. Imagine if you were a candidate making a challenge for the presidency; you could build a platform around this apparent increase in crime. And if you fast-forward to 1996, the total number of crimes in that year was estimated to be 13,493,900, which is only slightly less than the total number of crimes in 1987. So, was very much done to help curb crime during the nine-year period from 1987 to 1996? In addition, these numbers don't tell the whole story. Is the total number of crimes for a given year the most appropriate statistic to measure the extent of crime in the United States?
Another piece of important information has been left out of the story (and believe me, this happens more often than you may think). Something else besides the number of crimes went up between 1987 and 1993: the population of the United States. The total population of the country should also play a role in the crime statistics, because when the number of people living in the country increases, you'd also expect the number of potential criminals and potential crime victims to increase. So, to put crime into perspective, you must account for the total number of people as well as the number of crimes. How is this done? The FBI reports a crime index, which is simply a crime rate. A rate is a ratio; it's the number of people or events that you're interested in, divided by the total number in the entire group.
Statistics have a variety of different units in which they are expressed, and this variety can be confusing.
A ratio is a fraction that divides two quantities. For example, "The ratio of girls to boys is 3 to 2" means that for every 3 girls, you find 2 boys. It doesn't mean that only 3 girls and 2 boys are in the group; ratios are expressed in lowest terms (simplified as far as possible). So you could have 300 girls and 200 boys; the ratio would still be 3 to 2.
A rate is a ratio that reflects some quantity per a certain unit. For example, your car goes 60 miles per hour, or a neighborhood burglary rate is 3 burglaries per 1,000 homes.
A percentage is a number between 0 and 100 that reflects a proportion of the whole. For example, a shirt is 10% off, or 35% of the population is in favor of a four-day work week. To convert from a percent to a decimal, divide by 100 or move the decimal over two places to the left. To remember this more easily, just remember that 100% is equal to 1, or 1.00, and to get from 100 to 1 you divide by 100 or move the decimal over 2 places to the left. (And just do the opposite to change from a decimal to a percent.)
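If it helps to see these units side by side, here is a small Python sketch of reducing a ratio to lowest terms and converting between percents and decimals. The helper names (`simplify_ratio`, `percent_to_decimal`, `decimal_to_percent`) are invented for this example.

```python
from math import gcd


def simplify_ratio(a, b):
    """Reduce the ratio a : b to lowest terms."""
    d = gcd(a, b)
    return a // d, b // d


def percent_to_decimal(pct):
    """Divide by 100 (move the decimal two places to the left)."""
    return pct / 100


def decimal_to_percent(x):
    """Multiply by 100 (move the decimal two places to the right)."""
    return x * 100


print(simplify_ratio(300, 200))   # 300 girls to 200 boys reduces to (3, 2)
print(percent_to_decimal(35))     # 35% as a decimal
print(decimal_to_percent(1.0))    # 1.00 as a percent
```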
Percentages can be used to determine how much a value increases or decreases, relatively speaking. Suppose the crimes in one city went up from 50 to 60, while the number of crimes in another city went up from 500 to 510. Both cities had an increase of 10 crimes, but for the first city, this difference is much larger, as a percentage of the total number of crimes. To find the percentage increase, take the "after" amount, minus the "before" amount and divide that result by the "before" amount. For the first city, this means crime went up by (60 − 50) ÷ 50 = 10 ÷ 50 = 0.20 or 20%. For the second city, this change reflects only a 2% increase, because (510 − 500) ÷ 500 = 10 ÷ 500 = 0.02 or 2%. To find percentage decrease, do the same steps. You'll just get a negative number, indicating a decrease.
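The "after minus before, divided by before" recipe fits in a few lines of Python. This is only a sketch; the function name `percent_change` is made up here, and the two-city figures are the ones from the example above.

```python
def percent_change(before, after):
    """Percentage change from before to after; a negative result means a decrease."""
    return (after - before) * 100 / before


first_city = percent_change(50, 60)
second_city = percent_change(500, 510)
print(f"First city:  {first_city:.0f}% increase")
print(f"Second city: {second_city:.0f}% increase")

# Reversing the order gives a negative number, which signals a decrease.
print(f"From 60 down to 50: {percent_change(60, 50):.1f}%")
```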
Table 2-2 shows the estimated population of the U.S. for 1987–1997, along with the estimated number of crimes and the estimated crime rates (crimes per 100,000 people).
Year | Number of Crimes | Estimated Population Size | Crime Rate (Per 100,000 People) |
---|---|---|---|
1987 | 13,508,700 | 243,400,000 | 5,550.0 |
1988 | 13,923,100 | 245,807,000 | 5,664.2 |
1989 | 14,251,400 | 248,239,000 | 5,741.0 |
1990 | 14,475,600 | 248,710,000 | 5,820.3 |
1991 | 14,872,900 | 252,177,000 | 5,897.8 |
1992 | 14,438,200 | 255,082,000 | 5,660.2 |
1993 | 14,144,800 | 257,908,000 | 5,484.4 |
1994 | 13,989,500 | 260,341,000 | 5,373.5 |
1995 | 13,862,700 | 262,755,000 | 5,275.9 |
1996 | 13,493,900 | 265,284,000 | 5,086.6 |
1997 | 13,175,100 | 267,637,000 | 4,922.7 |
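You can reproduce the last column of the table yourself: divide the number of crimes by the population and multiply by 100,000. The Python sketch below does this for three of the years; the function name `rate_per_100k` is an assumption of this example, and the inputs are copied from the table.

```python
def rate_per_100k(events, population):
    """Number of events per 100,000 people."""
    return events / population * 100_000


# (number of crimes, estimated population) for selected years from the table
table = {
    1987: (13_508_700, 243_400_000),
    1993: (14_144_800, 257_908_000),
    1997: (13_175_100, 267_637_000),
}

for year, (crimes, population) in table.items():
    print(f"{year}: {rate_per_100k(crimes, population):,.1f} crimes per 100,000 people")
```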
Looking again at 1987 compared to 1993, you can see that the number of crimes increased from 13,508,700 in 1987 to 14,144,800 in 1993. (Note that this represents a 4.7% increase, because 14,144,800 − 13,508,700 equals 636,100, and if you divide this number by the original value, 13,508,700, you get 0.047, which is 4.7%.) So, looking at it this way, someone may report that crime went up 4.7% from 1987 to 1993. But this 4.7% represents an increase in the total number of crimes, not the number of crimes per person, or the number of crimes per 100,000 people. To find out how the number of crimes per 100,000 people changed over time, you need to calculate and compare the crime rates for 1987 and 1993. Here's how: (5,484.4 − 5,550.0) ÷ 5,550.0 = −65.6 ÷ 5,550.0 = −0.012 = −1.2%. The crimes per 100,000 people (crime rate) actually decreased by 1.2%.
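To see both sides of the argument at once, the sketch below runs the same percentage-change calculation on the total counts and on the crime rates for 1987 and 1993. The function name `percent_change` is hypothetical; the four input figures come from Table 2-2.

```python
def percent_change(before, after):
    """Percentage change from before to after; a negative result means a decrease."""
    return (after - before) * 100 / before


# Total number of crimes, 1987 versus 1993
count_change = percent_change(13_508_700, 14_144_800)

# Crimes per 100,000 people, 1987 versus 1993
rate_change = percent_change(5_550.0, 5_484.4)

print(f"Change in total crimes: {count_change:+.1f}%")
print(f"Change in crime rate:   {rate_change:+.1f}%")
```

Same years, same data, opposite signs: which number a candidate quotes depends on whether population growth is taken into account.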