Statistics for Dummies

Author: Deborah Jean Rumsey

Tags: #Non-Fiction, #Reference

Walking through a Hypothesis Test: The Big Picture

Every hypothesis test contains a series of steps and procedures. This section gives you a general breakdown of what's involved. See Chapter 15 for details on the most commonly used hypothesis tests, including tests that examine a claim about a single population parameter, as well as those that compare two populations.

Reviewing the general steps for a hypothesis test (one mean/proportion, large samples)

Here's a boiled-down summary of the calculations involved in doing a hypothesis test. (Particular formulas needed to find test statistics for any of the most common hypothesis tests are provided in Chapter 15.)

  1. Set up the null and alternative hypotheses:

    1. The null hypothesis, Ho, says that the population parameter is equal to some claimed number.

    2. Three possible alternative hypotheses exist; choose the one that's most relevant in the case where the data don't support Ho:

      1. Ha: The population parameter is not equal (≠) to the claimed number.

      2. Ha: The population parameter is less than (<) the claimed number.

      3. Ha: The population parameter is greater than (>) the claimed number.

  2. Take a random sample of individuals from the population and calculate the sample statistic.

    This gives your best estimate of the population parameter (see Chapter 4).

  3. Convert the sample statistic to a test statistic by changing it to a standard score (all formulas for test statistics are provided in Chapter 15):

    1. Take your sample statistic minus the number in the null hypothesis. This is the distance between the claim and your results.

    2. Divide that distance by the standard error of your statistic (see Chapter 10 for more on standard error). This changes the distance to standard units.

  4. Find the p-value for your test statistic.

    1. Find the percentage chance of being at or beyond that value in the same direction:

      1. If Ha contains a less-than alternative, find the percentile from Table 8-1 in Chapter 8 that corresponds to your test statistic.

      2. If Ha contains a greater-than alternative, find the percentile from Table 8-1 (see Chapter 8) that corresponds to your test statistic, and then take 100% minus that percentile. (This gives you the percentage to the right of your test statistic.)

    2. Double this percentage if (and only if) Ha is the not-equal-to alternative.

    3. Change the percentage to a probability by dividing by 100 or by moving the decimal point two places to the left. This is your p-value.

  5. Examine your p-value and make your decision.

    1. Smaller p-values show more evidence against Ho. Conclude that Ho is false (in other words, reject the claim).

    2. Larger p-values show more evidence for Ho. Conclude that you can't reject Ho. Your sample doesn't provide enough evidence against the claim.

      What's the cutoff point between having or not having enough support for Ho? Most people find 0.05 to be a good cutoff point for accepting or rejecting Ho; p-values less than 0.05 show reasonable doubt that Ho is true. Your cutoff point is called the alpha (α) level.
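The five steps above can be sketched in a few lines of code. This is a minimal illustration for a large-sample test of one mean or proportion, not a formula from this book; the function name and the example numbers (a claimed value of 100, a sample mean of 103, a standard error of 1.5) are hypothetical, chosen just for demonstration.

```python
import math

def normal_cdf(z):
    # Standard normal CDF, written with math.erf so no extra libraries are needed
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def z_test_p_value(sample_stat, claimed_value, std_error, alternative):
    """Steps 3-4: convert the sample statistic to a standard score,
    then find the p-value for the chosen alternative hypothesis."""
    z = (sample_stat - claimed_value) / std_error   # distance in standard units
    if alternative == "less":
        p = normal_cdf(z)                   # area to the left of z
    elif alternative == "greater":
        p = 1 - normal_cdf(z)               # area to the right of z
    else:                                   # "not equal": double the one-tail area
        p = 2 * (1 - normal_cdf(abs(z)))
    return z, p

# Hypothetical example: Ho says the mean is 100; the sample mean is 103
# with a standard error of 1.5, and Ha is the not-equal-to alternative.
z, p = z_test_p_value(103, 100, 1.5, "not equal")
# z = 2.0 and p is about 0.0455, below the usual alpha of 0.05,
# so you would reject Ho (step 5).
```

Note how the "not equal" branch doubles the one-tail percentage, exactly as step 4 prescribes.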

TECHNICAL STUFF 

In a case where two populations are being compared, most researchers are interested in comparing the groups according to some parameter, such as the average weight of males versus females, or the proportion of women who oppose an issue compared to the proportion of men. In this case, the hypotheses are set up so you're looking at the difference between the averages or proportions, and the null hypothesis is that the difference is zero (the groups have the same means or proportions). Chapter 15 gives formulas and examples for these hypothesis tests for both the large and small sample size cases.
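As a sketch of the two-population case (not Chapter 15's exact formulas), a large-sample test of the difference between two means might look like this; all numbers and names here are hypothetical.

```python
import math

def normal_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def two_mean_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample test of Ho: the two population means are equal
    (their difference is zero), against a not-equal-to alternative."""
    diff = mean1 - mean2                        # observed difference; Ho claims 0
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the difference
    z = (diff - 0) / se                         # distance from the claim, in standard units
    p = 2 * (1 - normal_cdf(abs(z)))            # two-sided p-value
    return z, p

# Hypothetical numbers: men averaging 180 lbs (sd 30, n = 100) versus
# women averaging 150 lbs (sd 25, n = 100).
z, p = two_mean_z_test(180, 30, 100, 150, 25, 100)
# z is about 7.7, so p is essentially 0 and Ho (equal means) is rejected.
```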

Dealing with other hypothesis tests

Many types of hypothesis tests are done in the world of research. The most common ones are included in Chapter 15 (along with easy-to-use formulas, step-by-step explanations, and examples). But many more types of tests exist, and their results come to you on an everyday basis: in sound bites, press releases, evening news broadcasts, and on the Internet.

While the hypothesis tests that researchers use can be quite varied, the main ideas (such as p-values and how to interpret those results) are the same.

REMEMBER 

The most important element that all hypothesis tests have in common is the p-value. All p-values have the same interpretation, no matter what test is done. So anywhere you see a p-value, you will know that a small p-value means the researcher found a "statistically significant" result, which means the null hypothesis was rejected.

HEADS UP 

You also know, regardless of which hypothesis test someone used, that any conclusions made are subject to the data collection and analysis having been done correctly. Even then, under the best of situations, the data could still be unrepresentative just by chance, or the truth could have been too hard to detect, and the wrong decision could be made. But that's part of what makes statistics such a fun subject: you never know if what you're doing is correct, but you always know that what you're doing is right; does that make sense?

Handling smaller samples: The t-distribution

For means/proportions, in the case where the sample size is small (and by small, I mean dropping below 30 or so), you have less information on which to base your conclusions. Another drawback is that you can't rely on the standard normal distribution (Z-distribution) to compare your test statistic, because the central limit theorem hasn't kicked in yet. (The central limit theorem requires sample sizes that are large enough for the results to average out to a bell-shaped curve; see Chapter 8 for more on this.) You already know you should disregard results that are based on very small sample sizes (especially those with a sample size of 1). So, what do you do in those in-between situations, in which the sample size isn't small enough to disregard and isn't large enough to use the standard normal distribution to weigh your evidence? You use a different distribution, called a t-distribution. (You may have heard of the term t-test before, in terms of hypothesis testing. This is where that term comes from.)

The t-distribution is basically a shorter, fatter version of the standard normal distribution (Z-distribution). The idea is, you should have to pay a penalty for having less information, and that penalty is a distribution that has fatter tails. To make a touchdown (getting into that magic 5% range where Ho is rejected) with a smaller sample size is going to mean having to go farther out, proving yourself more, and having stronger evidence than you normally would if you had a larger sample size. Figure 14-2 compares the standard normal distribution (Z-distribution) to a t-distribution.

Figure 14-2: Comparison of the standard normal (Z-) distribution and the t-distribution.

Each sample size has its own t-distribution. That's because the penalty for having a smaller sample size, like 5, is greater than the penalty for having a larger sample size, like 10 or 20. Smaller sample sizes have shorter, fatter t-distributions than the larger sample sizes. And as you may expect, the larger the sample size is, the more the t-distribution looks like a standard normal distribution (Z-distribution); and the point where they become very similar (similar enough for jazz or government work) is about the point where the sample size is 30. Figure 14-3 shows what different t-distributions look like for different sample sizes and how they all compare to the standard normal distribution (Z-distribution).

Figure 14-3: t-distributions for different sample sizes.

TECHNICAL STUFF 

Each t-distribution is distinguished by something statisticians call degrees of freedom. (Why they call it that is something that goes beyond this book.) When you're testing one population's mean and the sample size is n, the degrees of freedom for the corresponding t-distribution is n − 1. So, for example, if your sample size is 10, you use a t-distribution with 10 − 1 or 9 degrees of freedom, denoted t9, rather than a Z-distribution, to look up your test statistic. (For any test that uses the t-distribution, the degrees of freedom will be given in terms of a formula involving the sample sizes. See Chapter 15 for details.)
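As a quick sketch of the degrees-of-freedom bookkeeping (this uses the third-party SciPy library for the t-distribution; SciPy is an assumption here, not a tool from this book):

```python
from scipy.stats import t

n = 10
df = n - 1          # one-sample test: degrees of freedom = n - 1 = 9

# The 97.5th percentile of t with 9 degrees of freedom (the t9 row of
# Table 14-2) is about 2.262...
critical = t.ppf(0.975, df)

# ...which means the area to the right of 2.262 under t9 is 2.5%.
p_right = 1 - t.cdf(critical, df)   # 0.025
```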

The t-distribution makes you pay a penalty for having a small sample size. What's the penalty? A larger p-value than the one that the standard normal distribution would have given you for the same test statistic. That's because of the fatter tails on the t-distribution; a test statistic far out on the leaner Z-distribution has little area beyond it. But that same test statistic out on the fatter t-distribution has more fat (or area) beyond it, and that's exactly what the p-value represents. A bigger p-value means less chance of rejecting Ho. Having less data should create a higher burden of proof, so p-values do work the way you'd expect them to, after you figure out what you expect them to do!
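The penalty is easy to see numerically. Using SciPy (an assumption; any t-table or statistical software works equally well), the same test statistic yields a larger two-sided p-value under t9 than under the Z-distribution:

```python
from scipy.stats import norm, t

test_stat = 2.0     # the same standardized test statistic in both cases
df = 9              # e.g., a one-sample test with a sample size of 10

p_z = 2 * (1 - norm.cdf(test_stat))    # two-sided p-value from the Z-distribution
p_t = 2 * (1 - t.cdf(test_stat, df))   # two-sided p-value from t9

# p_z is about 0.046 (reject Ho at alpha = 0.05), while the fatter tails of
# t9 leave more area beyond 2.0, so p_t comes out larger (roughly 0.08) and
# Ho would not be rejected.
```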

Because each sample size would have to have its own t-distribution with its own t-table to find p-values, statisticians have come up with one abbreviated table that you can use to get a general feeling for your results (see Table 14-2). Computers can also give you a precise p-value for any sample size.

Table 14-2: t-Distribution

Degrees of Freedom   90th Pctile   95th Pctile   97.5th Pctile   99th Pctile   99.5th Pctile
1                    3.078         6.314         12.706          31.821        63.657
2                    1.886         2.920         4.303           6.965         9.925
3                    1.638         2.353         3.182           4.541         5.841
4                    1.533         2.132         2.776           3.747         4.604
5                    1.476         2.015         2.571           3.365         4.032
6                    1.440         1.943         2.447           3.143         3.707
7                    1.415         1.895         2.365           2.998         3.499
8                    1.397         1.860         2.306           2.896         3.355
9                    1.383         1.833         2.262           2.821         3.250
10                   1.372         1.812         2.228           2.764         3.169
11                   1.363         1.796         2.201           2.718         3.106
12                   1.356         1.782         2.179           2.681         3.055
13                   1.350         1.771         2.160           2.650         3.012
14                   1.345         1.761         2.145           2.624         2.977
15                   1.341         1.753         2.131           2.602         2.947
16                   1.337         1.746         2.120           2.583         2.921
17                   1.333         1.740         2.110           2.567         2.898
18                   1.330         1.734         2.101           2.552         2.878
19                   1.328         1.729         2.093           2.539         2.861
20                   1.325         1.725         2.086           2.528         2.845
21                   1.323         1.721         2.080           2.518         2.831
22                   1.321         1.717         2.074           2.508         2.819
23                   1.319         1.714         2.069           2.500         2.807
24                   1.318         1.711         2.064           2.492         2.797
25                   1.316         1.708         2.060           2.485         2.787
26                   1.315         1.706         2.056           2.479         2.779
27                   1.314         1.703         2.052           2.473         2.771
28                   1.313         1.701         2.048           2.467         2.763
29                   1.311         1.699         2.045           2.462         2.756
30                   1.310         1.697         2.042           2.457         2.750
40                   1.303         1.684         2.021           2.423         2.704
60                   1.296         1.671         2.000           2.390         2.660
Z-values             1.282         1.645         1.960           2.326         2.576
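If you have software handy, you can reproduce any entry of the abbreviated table exactly; here, for instance, is the 95th-percentile column (using SciPy, which is one option and an assumption here, not this book's tool):

```python
from scipy.stats import norm, t

# Reproduce a few rows of the 95th-percentile column of Table 14-2
for df in (1, 9, 30, 60):
    print(df, round(float(t.ppf(0.95, df)), 3))   # 6.314, 1.833, 1.697, 1.671

# As the degrees of freedom grow, the values shrink toward the Z-value
# in the table's bottom row:
z_95 = norm.ppf(0.95)   # about 1.645
```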
