Analysis of Variance (ANOVA)

 

Analysis of variance (ANOVA) is a commonly used method for analyzing continuous variables, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation.
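As a minimal sketch of this partition, the Python snippet below splits the total sum of squares of a small data set into a between-group and a within-group component and forms the F statistic from them. The group labels and values are hypothetical, chosen only to keep the arithmetic easy to follow; they are not taken from the study discussed on this page.

```python
# Hypothetical toy data: three groups with three measurements each.
groups = {
    "A": [118, 122, 120],
    "B": [121, 125, 123],
    "C": [126, 124, 128],
}

values = [x for xs in groups.values() for x in xs]
grand_mean = sum(values) / len(values)

# Between-group sum of squares: how far each group mean lies from the grand mean.
ss_between = sum(
    len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2 for xs in groups.values()
)

# Within-group (residual) sum of squares: spread of values around their own group mean.
ss_within = sum(
    (x - sum(xs) / len(xs)) ** 2 for xs in groups.values() for x in xs
)

# Total sum of squares: spread of all values around the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in values)

# F statistic: between-group mean square divided by within-group mean square.
df_between = len(groups) - 1
df_within = len(values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print("between:", ss_between, "within:", ss_within, "total:", ss_total)
print("F =", f_stat)
```

For these toy values the between-group and within-group components (54 and 24) add up to the total (78), which is the partition that the ANOVA table reports in its Sum Sq column.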

In its simplest form, ANOVA provides a statistical test of whether the means of several groups are all equal, and therefore generalizes the two-sample t-test to more than two groups. Running multiple two-sample t-tests instead would increase the chance of committing a type I error. For this reason, ANOVA is useful for comparing two, three, or more means with a single test.
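The inflation of the type I error can be illustrated with a short calculation. The sketch below assumes that each pairwise t-test is run at a significance level of 0.05 and, as a simplification, that the tests are independent. Pairwise tests that share groups are not strictly independent, so the figures are only indicative. The function name is hypothetical.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one false positive).

    Assumes independent tests, which is a simplification for illustration.
    """
    n_tests = comb(n_groups, 2)
    return n_tests, 1 - (1 - alpha) ** n_tests


for k in (2, 3, 4):
    n_tests, fwer = familywise_error_rate(k)
    print(f"{k} groups: {n_tests} pairwise tests, family-wise error rate ~ {fwer:.3f}")
```

Under these assumptions, four groups already require six pairwise tests, and the chance of at least one false positive rises to roughly 26%.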

For example, suppose a study measured participants' blood pressure and divided all participants into 4 groups by body mass index (BMI) quartile (Q1-Q4, where Q1 is the 25% of participants with the lowest BMI and Q4 is the 25% with the highest BMI). ANOVA answers whether part of the total variance in systolic blood pressure can be explained by BMI group, in other words, whether different BMI groups have different mean systolic blood pressures.

The ANOVA test requires a continuous outcome variable and a group variable. A second group variable can also be specified.

In the following example, systolic blood pressure (SBP) is analyzed against BMI quartiles (BMI.Q4) and SEX.

Below is the sample input window

 

 

 

Below is the sample output and explanation of the above model:

 

Analysis of Variance

              

(1)    ANOVA with BMI.Q4 and SEX as the explanatory factors

 

 ANOVA for: SBP

                Df Sum Sq Mean Sq F value   Pr(>F)   

factor(BMI.Q4)   3    948  316.05  3.7643  0.01067 * 

factor(SEX)      1   1773 1773.45 21.1226 5.18e-06 ***

Residuals      648  54406   83.96                    

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
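The F value column in the table above can be checked by hand: each Mean Sq is the Sum Sq divided by its Df, and each F value is the factor's Mean Sq divided by the residual Mean Sq. The sketch below repeats that arithmetic in Python with the numbers copied from the table. The variable names are illustrative, and the results match the printed values only up to the rounding of the table entries.

```python
# Mean squares copied from the ANOVA table above.
mean_sq = {
    "factor(BMI.Q4)": 316.05,
    "factor(SEX)": 1773.45,
}
residual_mean_sq = 83.96

# F value = factor mean square / residual mean square.
f_values = {name: ms / residual_mean_sq for name, ms in mean_sq.items()}

for name, f in f_values.items():
    print(f"{name}: F = {f:.4f}")
```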

 

(2)    Tukey multiple comparisons of means (adjusted for multiple comparison)

 

5 observations deleted due to missingness

  Tukey multiple comparisons of means

    95% family-wise confidence level

 

Fit: aov(formula = SBP ~ factor(BMI.Q4) + factor(SEX))

 

$`factor(BMI.Q4)`

         diff        lwr      upr     p adj

1-0 0.7819311 -1.8325533 3.396416 0.8678330

2-0 1.4285073 -1.1818897 4.038904 0.4937959

3-0 3.2484848  0.6500897 5.846880 0.0073405

2-1 0.6465762 -1.9798366 3.272989 0.9210102

3-1 2.4665537 -0.1479307 5.081038 0.0725308

3-2 1.8199776 -0.7904195 4.430375 0.2762500

 

$`factor(SEX)`

         diff      lwr       upr   p adj

2-1 -3.260532 -4.66978 -1.851285 6.6e-06

 

(3)    ANOVA with the SEX and BMI.Q4 joint groups. There are 4 BMI.Q4 groups and 2 SEX groups, giving 8 joint groups in total.

                                

 Two way factorial design MANOVA

                            Df Sum Sq Mean Sq F value    Pr(>F)   

factor(BMI.Q4)               3    948  316.05  3.7514   0.01086 * 

factor(SEX)                  1   1773 1773.45 21.0498 5.378e-06 ***

factor(BMI.Q4):factor(SEX)   3     65   21.57  0.2561   0.85702   

Residuals                  645  54341   84.25                     

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5 observations deleted due to missingness

  Tukey multiple comparisons of means

    95% family-wise confidence level

 

Fit: aov(formula = SBP ~ factor(BMI.Q4) * factor(SEX))

 

$`factor(BMI.Q4)`

         diff        lwr      upr     p adj

1-0 0.7819311 -1.8370989 3.400961 0.8684187

2-0 1.4285073 -1.1864282 4.043443 0.4953204

3-0 3.2484848  0.6455721 5.851398 0.0074781

2-1 0.6465762 -1.9844029 3.277555 0.9213781

3-1 2.4665537 -0.1524763 5.085584 0.0732894

3-2 1.8199776 -0.7949580 4.434913 0.2777422

 

$`factor(SEX)`

         diff       lwr      upr   p adj

2-1 -3.260532 -4.672225 -1.84884 6.9e-06

 

(4)    Below are the Tukey multiple comparisons of means for every pair of the 8 joint groups.

 

$`factor(BMI.Q4):factor(SEX)`

                diff         lwr         upr     p adj

1:1-0:1  0.658280410  -3.4916091  4.80816987 0.9997326

2:1-0:1  1.775362319  -2.6697285  6.22045313 0.9275962

3:1-0:1  2.898097826  -1.6451304  7.44132609 0.5239076

0:2-0:1 -3.894282311  -8.2692292  0.48066460 0.1223193

1:2-0:1 -2.917874396  -7.3097287  1.47397987 0.4688467

2:2-0:1 -1.830060776  -5.9343375  2.27421595 0.8766407

3:2-0:1  0.655832975  -3.3667982  4.67846413 0.9996791

2:1-1:1  1.117081908  -3.3600012  5.59416499 0.9950181

3:1-1:1  2.239817416  -2.3347168  6.81435167 0.8133780

0:2-1:1 -4.552562721  -8.9600111 -0.14511436 0.0371803

1:2-1:1 -3.576154806  -8.0003863  0.84807671 0.2156607

2:2-1:1 -2.488341186  -6.6272454  1.65056302 0.6009814

3:2-1:1 -0.002447436  -4.0604028  4.05550796 1.0000000

3:1-2:1  1.122735507  -3.7211899  5.96666091 0.9968547

0:2-2:1 -5.669644630 -10.3560986 -0.98319066 0.0061617

1:2-2:1 -4.693236715  -9.3954781  0.00900469 0.0508492

2:2-2:1 -3.605423095  -8.0402599  0.82941374 0.2092943

3:2-2:1 -1.119529344  -5.4789160  3.23985728 0.9940497

0:2-3:1 -6.792380137 -11.5720184 -2.01274192 0.0004742

1:2-3:1 -5.815972222 -10.6110911 -1.02085335 0.0059530

2:2-3:1 -4.728158602  -9.2613549 -0.19496232 0.0338709

3:2-3:1 -2.242264851  -6.7016754  2.21714570 0.7915941

1:2-0:2  0.976407915  -3.6595820  5.61239782 0.9982885

2:2-0:2  2.064221535  -2.3003066  6.42874968 0.8391809

3:2-0:2  4.550115286   0.2622744  8.83795622 0.0285525

2:2-1:2  1.087813620  -3.2936621  5.46928933 0.9951688

3:2-1:2  3.573707371  -0.7313830  7.87879777 0.1870209

3:2-2:2  2.485893751  -1.5254037  6.49719115 0.5621058
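The 28 rows above are all the unordered pairs that can be formed from the 8 joint groups. As a rough sketch, the Python snippet below enumerates them using the level codes that appear in the output (BMI.Q4 coded 0 to 3, SEX coded 1 and 2). The row order and the "later-earlier" label format are inferred from the printed table rather than from Empower's documentation.

```python
from itertools import combinations, product

bmi_levels = ["0", "1", "2", "3"]  # BMI.Q4 codes as printed in the output
sex_levels = ["1", "2"]            # SEX codes as printed in the output

# Joint groups labelled "BMI:SEX"; BMI varies fastest, as in the printed table.
joint_groups = [f"{bmi}:{sex}" for sex, bmi in product(sex_levels, bmi_levels)]

# Each comparison is labelled "later-earlier", matching rows such as "1:1-0:1".
pairs = [f"{later}-{earlier}" for earlier, later in combinations(joint_groups, 2)]

print(len(joint_groups), "joint groups,", len(pairs), "pairwise comparisons")
print(pairs[0], "...", pairs[-1])
```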

 

 

 

Empower also outputs the following graphs:

 

(1)    Bar plot and box plot to show the means and distribution for each joint group.

 

 

 

(2)    Two-way interaction plot showing the interaction between BMI quartiles and sex