Below is a 2×3 cross table (the row variable is gender, the column variable is smoking status):

          Never smoked   Former smoker   Current smoker   Total
Male      N_{11}         N_{12}          N_{13}           R_{1}
Female    N_{21}         N_{22}          N_{23}           R_{2}
Total     C_{1}          C_{2}           C_{3}            N

The chi-square test is the primary method for testing the independence of two categorical variables.
The null hypothesis is that the two variables are independent (have no relationship).
Under this hypothesis, the expected frequency for each cell can be calculated from the row and column totals. For example, in the table above the expected frequency for cell N_{11} is
E_{11} = R_{1} × C_{1} / N
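The same rule applies to every cell: E_{ij} = R_{i} × C_{j} / N. A minimal sketch in Python follows; the function name `expected_counts` and the counts in `observed` are hypothetical and chosen only for illustration, they do not come from Empower or from any real data set.

```python
def expected_counts(observed):
    """Return expected cell counts under independence: E_ij = R_i * C_j / N."""
    row_totals = [sum(row) for row in observed]        # R_1, R_2, ...
    col_totals = [sum(col) for col in zip(*observed)]  # C_1, C_2, ...
    n = sum(row_totals)                                # N
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical counts (illustration only):
# rows = Male, Female; columns = never smoked, former smoker, current smoker.
observed = [
    [40, 30, 30],
    [60, 25, 15],
]

expected = expected_counts(observed)
for label, row in zip(("Male", "Female"), expected):
    print(label, [round(e, 2) for e in row])
```

The expected counts keep the same row and column totals as the observed table, which is a quick way to check the calculation.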
Then compare the observed frequencies with the expected frequencies and calculate the Χ^{2} statistic.
The value of the test statistic is
Χ^{2} = \sum_{i=1}^{n} \frac{(O_{i} - E_{i})^{2}}{E_{i}}
where
O_{i} = an observed frequency;
E_{i} = an expected (theoretical) frequency, asserted by the null hypothesis;
n = the number of cells in the table;
Χ^{2} = Pearson's cumulative test statistic, which asymptotically approaches a Χ^{2} distribution.
If the Χ^{2} value is so large that a value at least this extreme would occur with low probability under the null hypothesis (conventionally less than 5%), we say the results are "statistically significant" at the ".05 or 5% level", and we reject the null hypothesis and accept the alternative that the two variables are related.
Fisher’s exact test
Fisher’s exact test provides an exact p-value.
The chi-square test is only an approximation, because the sampling distribution of the calculated test statistic is only approximately equal to the theoretical chi-squared distribution. The approximation is inadequate when the sample size is small, or when the data are very unequally distributed among the cells of the table, so that the cell counts predicted under the null hypothesis (the "expected values") are low.
For small, sparse, or unbalanced data, the exact and asymptotic p-values can be quite different and may lead to opposite conclusions concerning the hypothesis of interest.
The exact p-value becomes difficult to calculate with large samples or well-balanced tables. Fortunately, these are exactly the conditions where the chi-square test is appropriate.
Empower will automatically conduct Fisher’s exact test if the sample size is small and/or the observed frequencies are sparse or unbalanced.
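To illustrate what "exact" means here, the sketch below computes a two-sided Fisher p-value for a 2×2 table directly from the hypergeometric distribution, using only the Python standard library. It is not Empower's implementation, and the criteria Empower uses to decide when to run the exact test are not reproduced. The function name `fisher_exact_2x2` and the example counts are hypothetical, and the two-sided rule used (sum the probabilities of all tables no more likely than the observed one) is one common convention, assumed here.

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of every table that is no more likely than the
    # observed one; the small relative tolerance guards against
    # floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Hypothetical small table (illustrative counts only).
p = fisher_exact_2x2([[3, 1], [1, 3]])
print(f"exact two-sided p = {p:.4f}")
```

This sketch handles 2×2 tables only. Larger tables need an enumeration over all tables with the same margins, which is why the exact calculation becomes expensive as the sample grows.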
Below is the sample input window:
Below is the sample output of the above model:
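R*C frequency table and chi-square test

   Cell Contents
|-------------------------|
|                       N |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|

Total Observations in Table:  642

             | HBP
        SNP1 |         0 |         1 | Row Total |
-------------|-----------|-----------|-----------|
           0 |        30 |         3 |        33 |
             |     0.909 |     0.091 |     0.051 |
             |     0.053 |     0.041 |           |
             |     0.047 |     0.005 |           |
-------------|-----------|-----------|-----------|
           1 |       185 |        28 |       213 |
             |     0.869 |     0.131 |     0.332 |
             |     0.325 |     0.384 |           |
             |     0.288 |     0.044 |           |
-------------|-----------|-----------|-----------|
           2 |       354 |        42 |       396 |
             |     0.894 |     0.106 |     0.617 |
             |     0.622 |     0.575 |           |
             |     0.551 |     0.065 |           |
-------------|-----------|-----------|-----------|
Column Total |       569 |        73 |       642 |
             |     0.886 |     0.114 |           |
-------------|-----------|-----------|-----------|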

Statistics for All Table Factors

Pearson's Chi-squared test
Chi^2 = 1.065719     d.f. = 2     p = 0.5869243

Fisher's Exact Test for Count Data
Alternative hypothesis: two.sided
p = 0.5916893
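In the cross table above, each cell contains four numbers: the first is N (the frequency), the second is the row proportion (N / Row Total), the third is the column proportion (N / Col Total), and the last is the overall proportion (N / Table Total).
Χ^{2} is 1.065719 with 2 degrees of freedom, and the p-value is 0.5869243.
Fisher’s exact test was also performed for this example; its p-value is 0.5916893.
Both p-values are well above 0.05, so the null hypothesis that SNP1 and HBP are independent is not rejected.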
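As a cross-check, the Pearson statistic in the sample output can be recomputed from the six observed counts. The sketch below uses only the Python standard library and is not Empower's own code. It relies on one shortcut that is an assumption specific to this table: with exactly 2 degrees of freedom, the upper-tail probability of the chi-squared distribution has the closed form exp(-x / 2). Other degrees of freedom need a general chi-squared routine.

```python
from math import exp

# Observed counts from the sample output above:
# rows are SNP1 = 0, 1, 2; columns are HBP = 0, 1.
observed = [
    [30, 3],
    [185, 28],
    [354, 42],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson statistic: sum over all cells of (O - E)^2 / E,
# with E = row total * column total / N.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi2 += (o - e) ** 2 / e

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed-form upper tail, valid only for df == 2 (assumption stated above).
if df != 2:
    raise ValueError("exp(-x / 2) is the chi-squared tail only for 2 d.f.")
p_value = exp(-chi2 / 2)

print(f"Chi^2 = {chi2:.6f}  d.f. = {df}  p = {p_value:.7f}")
```

The printed statistic and p-value should agree with the reported Chi^2 and p up to rounding. The Fisher p-value is not recomputed here, because an exact test on a 3×2 table requires enumerating all tables with the same margins.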
Empower also outputs a graph: