Carrying out a survey

The survey has been designed, and the participants have been selected. Now you have to go about the process of carrying out the survey, which is another important step, one where lots of mistakes and bias can occur.

Collecting the data

During the survey itself, the participants can have problems understanding the questions, they may give answers that aren't among the choices (in the case of a multiple-choice question), or they may decide to give answers that are inaccurate or blatantly false. (As an example of this third type of error, where respondents provide false information, think about the difficulties involved in getting people to tell the truth about whether they've cheated on their income-tax forms.) This third type of error is called response bias: the respondent gives a biased answer.

Some of the potential problems with the data-collection process can be minimized or avoided with careful training of the personnel who carry out the survey. With proper training, any issues that arise during the survey are resolved in a consistent and clear way, and no errors are made in recording the data. Problems with confusing questions or incomplete choices for answers can be resolved by conducting a pilot study on a few participants prior to the actual survey, and then, based on their feedback, fixing any problems with the questions. Personnel can also be trained to create an environment in which each respondent feels safe enough to tell the truth; ensuring that privacy will be protected also helps encourage more people to respond.

Following up, following up, and following up

Anyone who has ever thrown away a survey or refused to "answer a few questions" over the phone knows that getting people to participate in a survey isn't easy. If the researcher wants to minimize bias, the best way to handle this is to get as many folks to respond as possible by following up, one, two, or even three times. Offer dollar bills, coupons, self-addressed stamped return envelopes, chances to win prizes, and so on. Every little bit helps.

What has ever motivated you to fill out a survey? If the incentive provided by the researcher didn't persuade you (or that feeling of guilt for just taking those two shiny quarters and dumping the survey in the trash didn't get to you), maybe the subject matter piqued your interest. This is where bias comes in. If only those folks who feel very strongly respond to a survey, that means that only their opinions will count, because the other people who didn't really care about the issue didn't respond, and their "I don't care" vote didn't get counted. Or maybe they did care, but they just didn't take the time to tell anyone. Either way, their vote doesn't count.

For example, suppose 1,000 people are given a survey about whether the park rules should be changed to allow dogs on leashes. Who would respond? Most likely, the respondents would be those who strongly agree or disagree with the proposed rules. Suppose 100 people from each of the two sides of the issue were the only respondents. That would mean that 800 opinions were not counted. Suppose none of those 800 people really cared about the issue either way. If you could count their opinions, the results would be 800 ÷ 1,000 = 80% "no opinion", 100 ÷ 1,000 = 10% in favor of the new rules and 100 ÷ 1,000 = 10% against the new rules. But without the votes of the 800 non-respondents, the researchers would report, "Of the people who responded, 50% were in favor of the new rules and 50% were against them." This gives the impression of a very different (and a very biased) result from the one you would've gotten if all 1,000 people had responded.
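
To make that arithmetic concrete, here is a minimal Python sketch of the park example; the variable names are mine, and the numbers are the ones from the paragraph above:

```python
# Park-survey example: how non-response distorts reported percentages.
surveyed = 1000
in_favor = 100       # strongly support allowing leashed dogs
against = 100        # strongly oppose
no_opinion = surveyed - in_favor - against   # the 800 who never responded

# What you would see if everyone had answered:
print(f"Full picture: {in_favor/surveyed:.0%} in favor, "
      f"{against/surveyed:.0%} against, {no_opinion/surveyed:.0%} no opinion")

# What actually gets reported, based only on the 200 respondents:
respondents = in_favor + against
print(f"Reported:     {in_favor/respondents:.0%} in favor, "
      f"{against/respondents:.0%} against")
```

The first line prints 10% / 10% / 80%; the second prints the 50/50 split that the researchers would report, which is exactly the biased impression described above.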

Lying: What do they know?

A study published in the Journal of Applied Social Psychology concluded that when lying to someone is in the best interest of the person hearing the lie, lying becomes more socially acceptable, and when lying to someone is in the best interest of the liar himself/herself, the lying becomes less socially acceptable. This sounds interesting and seems to make sense, but can this be true of everyone? The way the results are stated, this appears to be the case. However, in looking at the people who actually participated in the survey leading to these results, you begin to get the feeling that the conclusions may be a bit ambitious, to say the least.

The authors started out with 1,105 women who were selected to participate in the survey. Of these, 659 refused to cooperate, most of them saying they didn't have time. Another 233 were determined by the researchers to be either "too young" or "too old", and 33 were deemed unsuitable because the researchers cited a language barrier. In the end, 180 women were questioned. The average age of the final group of participants was 34.8 years.

Wow, where do I start with this one? The original sample size of 1,105 seems large enough, but were they selected randomly? Notice that the entire sample consisted of women, which is interesting, because the conclusions don't say that lies are more or less acceptable according to women in these situations. Next, 659 of those selected to participate refused to cooperate (causing bias). This is a large percentage (60%), but given the subject matter, you shouldn't be surprised. The researchers could have minimized the problem by guaranteeing that the responses would be anonymous, for example. No information is given regarding whether any follow-up was done (which probably means that it wasn't).

Throwing out 233 people because they were "too young" or "too old" is just plain wrong, unless your target population is limited to a certain age group. If that had been the case, the conclusions should have been made about that age group only. Finally, the last straw: throwing out 33 people who were "unsuitable" (in the researchers' own terms) because of a language barrier. I'd say, "Bring in an interpreter", because the conclusions were not limited to only those who speak English. You can't have the survey participants represent only a tiny microcosm of society (young women who speak English), and then turn around and make conclusions about all of society, based only on the data from this tiny microcosm. Starting out with a sample size of 1,105 and ending up with only 180 women is just plain bad statistics.

HEADS UP 

The response rate of a survey is found by dividing the number of respondents by the number of people who were originally asked to participate. Statisticians feel that a good response rate is anything over 70%. However, many response rates fall far short of that, unless the survey is done by a reputable organization, such as The Gallup Organization. Look for the response rate when examining survey results. If the response rate is too low (much less than 70%), the results may be biased and should be ignored. Don't be fooled by a survey that claims to have a large number of respondents but actually has a low response rate; in this case, many people may have responded, but many more were asked and didn't respond.
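
As a quick sketch of that calculation (the function name is mine; the numbers are from the lying study described earlier, and the 70% cutoff is the rule of thumb quoted above):

```python
def response_rate(respondents: int, invited: int) -> float:
    """Response rate = number of respondents / number originally asked."""
    return respondents / invited

# The lying study: 180 respondents out of 1,105 women originally selected.
rate = response_rate(180, 1105)
print(f"Response rate: {rate:.1%}")   # about 16.3%, far below 70%
print("Acceptable" if rate > 0.70 else "Too low; results may be biased")
```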

Note that many statistical formulas (including the formulas in this book) assume that your sample size is equal to the number of respondents, because statisticians want you to know how important it is to follow up with people and not end up with biased data due to non-response. However, in reality, statisticians know that you can't always get everyone to respond, no matter how hard you try. So, which number do you put in for n in all the formulas: the intended sample size (the number of people contacted) or the actual sample size (the number of people who responded)? Use the number of people who responded. Note, however, that for any survey with a low response rate, the results shouldn't be reported, because they very well could be biased. That's how important following up really is. (Do other people heed this warning when they report their results to you? Not often enough.)
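
To see why the choice of n matters, here is a hedged sketch using one standard conservative formula for a proportion's 95% margin of error, z times the square root of 0.25/n with z = 1.96 (margins of error are covered in Chapter 10); the two sample sizes are from the lying study above:

```python
import math

def margin_of_error(n: int, z: float = 1.96) -> float:
    """Conservative 95% margin of error for a proportion (worst case, p = 0.5)."""
    return z * math.sqrt(0.25 / n)

# Plugging in the intended sample size would overstate the survey's precision:
print(f"n = 1105 (people contacted):     +/- {margin_of_error(1105):.1%}")  # ~2.9%
print(f"n = 180  (people who responded): +/- {margin_of_error(180):.1%}")   # ~7.3%
```

Using the number of people contacted would make the survey look more than twice as precise as it actually is; the honest margin of error is based on the 180 who answered.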

REMEMBER 

Regarding the quality of results, selecting a smaller initial sample and following up with those people more aggressively is a much better approach than selecting a larger group of potential respondents and ending up with a low response rate.
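
Here is a rough simulation of that claim. All of the numbers are made-up assumptions (a population where 30% truly favor some proposal, and response probabilities that are tied to opinion), chosen only to illustrate non-response that is related to the issue being surveyed:

```python
import random
random.seed(1)

TRUE_SUPPORT = 0.30  # assumed: 30% of the population actually favors the proposal

def simulate(invited: int, respond_if_favor: float, respond_otherwise: float):
    """Invite people; each one responds with a probability tied to their opinion."""
    answers = []
    for _ in range(invited):
        favors = random.random() < TRUE_SUPPORT
        p_respond = respond_if_favor if favors else respond_otherwise
        if random.random() < p_respond:
            answers.append(favors)
    return sum(answers) / len(answers), len(answers)

# Large group, low response rate, supporters far more likely to reply:
est, n = simulate(5000, respond_if_favor=0.30, respond_otherwise=0.10)
print(f"Big sample, low response:  {est:.0%} support from {n} replies")

# Smaller group, aggressive follow-up, nearly everyone replies:
est, n = simulate(500, respond_if_favor=0.95, respond_otherwise=0.95)
print(f"Small sample, followed up: {est:.0%} support from {n} replies")
```

Under these assumptions, the big low-response survey reports support in the mid-50% range even though true support is 30%, while the smaller, well-followed-up survey lands close to the truth.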

Interpreting results; detecting problems

The purpose of a survey is to gain information about your target population; this information can include opinions, demographic information, or lifestyles and behaviors. If the survey has been designed and conducted in a fair and accurate manner with the goals of the survey in mind, the data should provide good information as to what's happening with the target population (within the stated margin of error). The next steps are to organize the data to get a clear picture of what's happening; analyze the data to look for links, differences, or other relationships of interest; and then to draw conclusions based on the results.

Organizing and analyzing

After a survey has been completed, the next step is to organize and analyze the data (in other words, crunch some numbers and make some graphs). Many different types of data displays and summary statistics can be created and calculated from survey data, depending on the type of information that was collected. (Numerical data, such as income, have different characteristics and are usually presented differently than categorical data, such as gender.) For more information on how data can be organized and summarized, see Chapters 4 and 5 (respectively). Depending on the research question, different types of analyses can be performed on the data, including coming up with population estimates, testing a hypothesis about the population, or looking for relationships, to name a few. See Chapters 13, 15, and 18 for more on each of these analyses, respectively.

HEADS UP 

Watch for misleading graphs and statistics. Not all survey data are organized and analyzed fairly and correctly. See Chapter 2 for more about how statistics can go wrong.

HEADS UP 
Anonymity versus confidentiality

If you were to conduct a survey to determine the extent of personal e-mail usage at work, the response rate would probably be an issue, because many people are reluctant to discuss their use of personal e-mail in the workplace, or at least to do so truthfully. You could try to encourage people to respond by letting them know that their privacy would be protected during and after the survey.

When you report the results of a survey, you generally don't tie the information collected to the names of the respondents, because doing so would violate the privacy of the respondents. You've probably heard the terms "anonymous" and "confidential" before, but what you may not realize is that these two words are completely different in terms of privacy issues. Keeping results confidential means that I could tie your information to your name in my report, but I promise that I won't do that. Keeping results anonymous means that I have no way of tying your information to your name in my report, even if I wanted to.

If you're asked to participate in a survey, be sure you're clear about what the researchers plan to do with your responses and whether or not your name can be tied to the survey. (Good surveys always make this issue very clear for you.) Then make a decision as to whether you still wish to participate.

Drawing conclusions

The conclusions are the best part of any survey; this is why the researchers do all of the work in the first place. If the survey was designed and carried out properly, the sample was selected carefully, and the data were organized and summarized correctly, the results will fairly and accurately represent the reality of the target population. Of course, not all surveys are done right. And even if a survey is done correctly, researchers can misinterpret or over-interpret results so that they say more than they really should. You know the saying, "Seeing is believing"? Some researchers are guilty of the converse, which is, "Believing is seeing." In other words, they claim to see what they want to believe about the results. All the more reason for you to know where the line is drawn between reasonable conclusions and misleading results, and to realize when others have crossed that line.

Here are some of the most common errors made in drawing conclusions from surveys:

  • Making projections to a larger population than the study actually represents

  • Claiming a difference exists between two groups when a difference isn't really there

  • Saying that "these results aren't scientific, but...", and then going on to present the results as if they are scientific

Getting too excited?

In 1998, a press release put out by the search engine Excite stated that it had been named the best-liked Web site in a USA Today study conducted by Intelliquest. The survey was based on 300 Web users selected from a group of 30,000 technology panelists who worked for Intelliquest. (Note that this is not a random sample of Web users!) The conclusions stated that Excite won the overall consumer experience category, making it the best-liked site on the Web, beating out Yahoo! and the other competitors.

Excite claimed that it was better than Yahoo! based on this survey. The actual results, however, tell a different story. The average overall quality score, on a scale of 0 to 100%, was 89% for Excite and 87% for Yahoo!. Excite's score is admittedly good, and it is slightly higher than the score obtained for Yahoo!; however, the difference between the results for the two companies is actually well within the margin of error for this survey, which is plus or minus 3.5%. In other words, Excite and Yahoo! were in a statistical tie for first place. So saying which company actually came in first isn't possible in this case. (See Chapters 9 and 10 for more on sample variation and the margin of error.)
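
A minimal sketch of that comparison, using the figures quoted above:

```python
# Scores and margin of error from the Intelliquest/USA Today survey above.
excite, yahoo, moe = 0.89, 0.87, 0.035

difference = excite - yahoo   # 2 percentage points
if difference <= moe:
    print("Within the margin of error: a statistical tie, no clear winner.")
else:
    print("Difference exceeds the margin of error: a real gap.")
```

Since 2 percentage points is less than the 3.5-point margin of error, the check prints the statistical-tie message, which is exactly the point of the example.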

To avoid common errors made when drawing conclusions, do the following:

  1. Check whether the sample was selected properly and that the conclusions don't go beyond the population represented by that sample.

  2. Look for disclaimers about surveys before reading the results, if you can.

    That way, you'll be less likely to be influenced by the results you're reading if, in fact, the results aren't based on a scientific survey. Now that you know what a scientific survey (the media's term for an accurate and unbiased survey) actually involves, you can use those criteria to judge for yourself whether the survey results are credible.

  3. Be on the lookout for statistically incorrect conclusions.

    If someone reports a difference between two groups in terms of survey results, be sure that the difference is larger than the reported margin of error. If the difference is within the margin of error, you should expect the sample results to vary by that much just by chance, and the so-called "difference" can't really be generalized to the entire population. (See Chapter 14 for more on this.)

  4. Tune out anyone who says, "These results aren't scientific, but...."

HEADS UP 

Here's the bottom line about surveys. Know the limitations of any survey and be wary of any information coming from surveys in which those limitations aren't respected. A bad survey is cheap and easy to do, but you get what you pay for. Before looking at the results of any survey, investigate how it was designed and conducted, so that you can judge the quality of the results.
