Power Analysis Using the Simulation Method
Usage of the Empower(R) PowerSimulation Package
Learn by following the exercises
Table of contents
· Power Simulation for Case Control Study
  o The definition of case control study
  o The regression model with interactions
  o The required parameters for simulation
· Power Simulation for Cohort Study (follow up exposed and non-exposed)
  o The definition of cohort study
  o The regression model with interactions
  o The required parameters for simulation
· Power Simulation for Cross-Sectional Study or Cohort Study (follow up one population)
  o The definition of cross-sectional study
  o The regression model with interactions
  o The required parameters for simulation
· Usage of Empower(R) PowerSimulation package
Power analysis by simulation proceeds in three steps:
1. Use the underlying model to generate random data with (a) specified sample sizes, (b) parameter values that express the effect one is trying to detect, and (c) nuisance parameters such as means, variances, and prevalences.
2. Run the regression on the randomly generated data. Save the parameter estimates and p-value.
3. Repeat the above steps many times, say N. The estimated power is the proportion of runs (out of N) with a p-value less than the specified significance level alpha.
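The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code Empower(R) runs: it assumes a design with two equally sized groups (exposed and non-exposed) and a dichotomous outcome, and it uses the Wald test on the log odds ratio of the 2x2 table as a stand-in for the logistic regression test of β_{1}. The function names, the seed, and the parameter values are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def wald_p_value(a, b, c, d):
    """Two-sided Wald p-value for the log odds ratio of a 2x2 table.

    a, b: exposed with / without the outcome; c, d: non-exposed with / without.
    0.5 is added to every cell (an assumption of this sketch) so that an
    empty cell cannot cause a division by zero.
    """
    a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return 2 * (1 - NormalDist().cdf(abs(log_or / se)))


def simulate_p_value(rng, n_per_group, p0, odds_ratio):
    """Steps 1 and 2: generate one data set and test the exposure effect."""
    odds1 = odds_ratio * p0 / (1 - p0)  # outcome odds among the exposed
    p1 = odds1 / (1 + odds1)
    a = sum(rng.random() < p1 for _ in range(n_per_group))  # exposed, Y=1
    c = sum(rng.random() < p0 for _ in range(n_per_group))  # non-exposed, Y=1
    return wald_p_value(a, n_per_group - a, c, n_per_group - c)


def estimate_power(n_rep, alpha, n_per_group, p0, odds_ratio, seed=2011):
    """Step 3: proportion of replicates with a p-value below alpha."""
    rng = random.Random(seed)
    hits = sum(
        simulate_p_value(rng, n_per_group, p0, odds_ratio) < alpha
        for _ in range(n_rep)
    )
    return hits / n_rep


power = estimate_power(n_rep=1000, alpha=0.05, n_per_group=200, p0=0.2, odds_ratio=2.0)
print(f"estimated power: {power:.3f}")
```

Setting odds_ratio=1.0 in the same call gives the rejection rate under no effect, which should stay near alpha and is a quick sanity check on any simulation of this kind.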
Power Simulation for Case Control Study
A case control study compares subjects who have a disease or outcome of interest (cases) with subjects who do not (controls). It looks back retrospectively to compare how frequently the exposure is present in each group (if the exposure is a dichotomous status, such as smoking status), or the level of the exposure in each group (if the exposure is a continuous measurement, such as body mass index), to determine the relationship between the risk factor and the disease.
The outcome Y is the case or control status, which is dichotomous, so a logistic regression model is applied:
Log(p/(1-p)) = β_{0} + β_{1}*X
p/(1-p) is the odds of the outcome (disease), and X is the exposure.
If X is dichotomous (0 or 1), e^{β_{1}} is the odds ratio of disease comparing the exposed (X=1) with the non-exposed (X=0). If X is continuous (such as age in years), e^{β_{1}} is the odds ratio of disease per one-unit change in the exposure.
3. Regression model with interaction term of two exposures:
Log(p/(1-p)) = β_{0} + β_{1}*X_{1} + β_{2}*X_{2} + β_{3}*X_{1}*X_{2}
e^{β_{1}} is the effect (odds ratio) of X_{1} at X_{2}=0, which is called the main effect of X_{1}.
e^{β_{2}} is the effect (odds ratio) of X_{2} at X_{1}=0, which is called the main effect of X_{2}.
e^{β_{3}} reflects the additional effect (odds ratio) when both X_{1} and X_{2} are present, which is called the interaction effect of X_{1} and X_{2}.
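To make the roles of the three odds ratios concrete, the sketch below evaluates the interaction model for the four combinations of two dichotomous exposures. The function name and the numeric values (baseline odds 0.25, odds ratios 1.2, 1.4 and 1.5) are illustrative assumptions.

```python
import math


def odds_by_cell(baseline_odds, or_x1, or_x2, or_interaction):
    """Odds of the outcome for each (X1, X2) combination under the model."""
    b0 = math.log(baseline_odds)
    b1, b2, b3 = math.log(or_x1), math.log(or_x2), math.log(or_interaction)
    return {
        (x1, x2): math.exp(b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2)
        for x1 in (0, 1)
        for x2 in (0, 1)
    }


odds = odds_by_cell(baseline_odds=0.25, or_x1=1.2, or_x2=1.4, or_interaction=1.5)
for cell in sorted(odds):
    # Odds ratio of each cell relative to the reference cell X1=0, X2=0.
    print(cell, round(odds[cell] / odds[(0, 0)], 3))
```

With both exposures present, the odds ratio relative to the reference cell is the product of the two main effects and the interaction effect, here 1.2 × 1.4 × 1.5 = 2.52. An interaction odds ratio of 1 would leave only the product of the main effects.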
4. Required parameters for simulation
Model with one dichotomous exposure (X):
· Prevalence of X among the general population is required.
· Odds ratio of X, which is the effect to detect, is required.
· Sample size, which is the number of cases and the number of controls, is required.
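One way these three inputs can drive the data generation is sketched below. It assumes that controls carry the general-population prevalence of X (reasonable when the disease is rare), so the exposure odds among cases equal the odds ratio times the exposure odds among controls. This is an assumption of the sketch, not a description of how Empower(R) generates data, and the function names are hypothetical.

```python
import random


def exposure_prob_in_cases(prevalence, odds_ratio):
    """Exposure probability among cases implied by the odds ratio.

    Assumes controls have the general-population prevalence of X.
    """
    odds_controls = prevalence / (1 - prevalence)
    odds_cases = odds_ratio * odds_controls
    return odds_cases / (1 + odds_cases)


def draw_case_control_counts(rng, n_cases, n_controls, prevalence, odds_ratio):
    """Return (exposed cases, exposed controls) for one simulated study."""
    p_cases = exposure_prob_in_cases(prevalence, odds_ratio)
    exposed_cases = sum(rng.random() < p_cases for _ in range(n_cases))
    exposed_controls = sum(rng.random() < prevalence for _ in range(n_controls))
    return exposed_cases, exposed_controls


rng = random.Random(0)  # fixed seed so the draw is reproducible
print(exposure_prob_in_cases(prevalence=0.3, odds_ratio=1.5))
print(draw_case_control_counts(rng, n_cases=200, n_controls=200,
                               prevalence=0.3, odds_ratio=1.5))
```

Each simulated pair of counts defines a 2x2 table on which the logistic regression (or an equivalent test) is run to obtain one p-value.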
Model with one continuous exposure (X):
· Mean and standard deviation of X among the general population are required.
· Odds ratio of X (per unit change), which is the effect to detect, is required.
· Sample size, which is the number of cases and the number of controls, is required.
Model the interaction of a dichotomous X_{1} and a dichotomous X_{2}:
· Prevalence of X_{1} among the general population is required.
· Prevalence of X_{2} among the general population is required.
· Odds ratio of X_{2} with X_{1}, which is the odds of having X_{2} (X_{2}=1) when X_{1}=1 compared to the odds when X_{1}=0, is required. If X_{2} is independent of X_{1}, set this to 1.
· Main effect (odds ratio) of X_{1} is required.
· Main effect (odds ratio) of X_{2} is required.
· Odds ratio for the interaction term, which is the effect to detect, is required.
· Sample size, which is the number of cases and the number of controls, is required.
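The two prevalences and the odds ratio of X_{2} with X_{1} together determine the joint distribution of the two exposures. The sketch below recovers the four cell probabilities by solving the quadratic equation that the odds ratio imposes on P(X_{1}=1, X_{2}=1) when the margins are fixed. The function name is hypothetical, and whether Empower(R) uses this construction internally is not stated in the source.

```python
import math


def joint_probabilities(p1, p2, odds_ratio):
    """Return P(X1=i, X2=j) given both prevalences and the X1-X2 odds ratio."""
    if math.isclose(odds_ratio, 1.0):
        p11 = p1 * p2  # independent exposures
    else:
        # With fixed margins, OR = p11 * p00 / (p10 * p01) is quadratic in p11:
        # (OR - 1) * p11^2 - s * p11 + OR * p1 * p2 = 0
        s = 1 + (p1 + p2) * (odds_ratio - 1)
        disc = s * s - 4 * odds_ratio * (odds_ratio - 1) * p1 * p2
        p11 = (s - math.sqrt(disc)) / (2 * (odds_ratio - 1))
    return {
        (1, 1): p11,
        (1, 0): p1 - p11,
        (0, 1): p2 - p11,
        (0, 0): 1 - p1 - p2 + p11,
    }


cells = joint_probabilities(p1=0.3, p2=0.3, odds_ratio=1.2)
implied_or = cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)])
print({cell: round(prob, 4) for cell, prob in cells.items()})
print(round(implied_or, 6))
```

Sampling (X_{1}, X_{2}) pairs from these four probabilities gives exposures with the requested prevalences and association. Recomputing the odds ratio from the cells, as done in the last lines, is a cheap check that the correct root of the quadratic was taken.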
Model the interaction of a dichotomous X_{1} and a continuous X_{2}:
· Prevalence of X_{1} among the general population is required.
· Mean and standard deviation of X_{2} among the general population are required.
· Effect of X_{1} on X_{2}, which is the difference in the mean of X_{2} between X_{1}=1 and X_{1}=0, is required. If X_{2} is independent of X_{1}, set this to 0.
· Main effect (odds ratio) of X_{1} is required.
· Main effect (odds ratio) of X_{2} is required.
· Odds ratio for the interaction term, which is the effect to detect, is required.
· Sample size, which is the number of cases and the number of controls, is required.
Power Simulation for Cohort Study
A cohort study is a study in which individuals with exposure to a suspected factor (exposed) and individuals without exposure to the factor (non-exposed) are identified and then observed over some period for the occurrence of certain health effects (such as disease), or for changes in the level of a trait measured on a continuous scale (such as systolic blood pressure). The outcomes of interest are measured and related to estimated exposure levels. Cohort studies can be performed either prospectively or retrospectively from historical records.
· If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:
Log(p/(1-p)) = β_{0} + β_{1}*X
p/(1-p) is the odds of the outcome (disease), and X is the exposure.
The effect of X on the outcome (Y) is expressed as an odds ratio, which is e^{β_{1}}.
· If the outcome is a continuous measurement, linear regression is applied:
Y = β_{0} + β_{1}*X
The effect of X on Y is expressed as the regression coefficient, which is β_{1}.
3. Regression model with interaction term of two exposures:
· If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:
Log(p/(1-p)) = β_{0} + β_{1}*X_{1} + β_{2}*X_{2} + β_{3}*X_{1}*X_{2}
e^{β_{1}} is the effect (odds ratio) of X_{1} at X_{2}=0, which is called the main effect of X_{1}.
e^{β_{2}} is the effect (odds ratio) of X_{2} at X_{1}=0, which is called the main effect of X_{2}.
e^{β_{3}} reflects the additional effect (odds ratio) when both X_{1} and X_{2} are present, which is called the interaction effect of X_{1} and X_{2}.
· If the outcome is a continuous measurement, linear regression is applied:
Y = β_{0} + β_{1}*X_{1} + β_{2}*X_{2} + β_{3}*X_{1}*X_{2}
The effects on Y are expressed as regression coefficients:
Regression coefficient β_{1} is the effect of X_{1} at X_{2}=0, which is called the main effect of X_{1}.
Regression coefficient β_{2} is the effect of X_{2} at X_{1}=0, which is called the main effect of X_{2}.
Regression coefficient β_{3} reflects the additional effect when both X_{1} and X_{2} are present, which is called the interaction effect of X_{1} and X_{2}.
4. Required parameters for simulation
Model dichotomous outcome (Y):
· Prevalence of Y among the non-exposed population is required.
· Odds ratio of the exposure (X), which is the effect to detect, is required.
· Sample size, which is the number of the non-exposed and the number of the exposed, is required.
Model continuous outcome (Y):
· Mean and standard deviation of Y among the non-exposed population are required.
· Regression coefficient of X on Y, which is the effect to detect, is required.
· Sample size, which is the number of the non-exposed and the number of the exposed, is required.
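For the continuous-outcome case, the inputs listed above are enough to simulate the whole study. The sketch below assumes Y is normally distributed with the same standard deviation in both groups, and that the mean among the exposed is shifted by the regression coefficient. With a dichotomous X, the least-squares slope equals the difference in group means, and a normal approximation to the t-test is used for the p-value. These are assumptions of the sketch, and the function names and numeric values are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def slope_p_value(y_nonexposed, y_exposed):
    """Two-sided p-value for the slope of Y on a dichotomous X.

    Uses a normal approximation to the t-test, adequate for large samples.
    """
    n0, n1 = len(y_nonexposed), len(y_exposed)
    m0, m1 = fmean(y_nonexposed), fmean(y_exposed)
    ss = sum((y - m0) ** 2 for y in y_nonexposed)
    ss += sum((y - m1) ** 2 for y in y_exposed)
    resid_var = ss / (n0 + n1 - 2)            # pooled residual variance
    se = math.sqrt(resid_var * (1 / n0 + 1 / n1))
    z = (m1 - m0) / se                        # slope estimate / its SE
    return 2 * (1 - NormalDist().cdf(abs(z)))


def cohort_power(n_nonexposed, n_exposed, mean_y, sd_y, beta,
                 alpha=0.05, n_rep=1000, seed=1):
    """Proportion of simulated cohorts in which the slope is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        y0 = [rng.gauss(mean_y, sd_y) for _ in range(n_nonexposed)]
        y1 = [rng.gauss(mean_y + beta, sd_y) for _ in range(n_exposed)]
        hits += slope_p_value(y0, y1) < alpha
    return hits / n_rep


print(cohort_power(n_nonexposed=100, n_exposed=100, mean_y=120, sd_y=15, beta=5))
```

The example values mimic a systolic blood pressure outcome (mean 120, standard deviation 15) with a 5-unit effect of exposure. Changing n_nonexposed and n_exposed shows how power responds to the sample size.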
Model dichotomous outcome (Y), detect the interaction effect of the exposure under study (which is dichotomous X_{1}) and another dichotomous risk factor X_{2}:
· Prevalence of Y among the non-exposed population is required.
· Prevalence of X_{2} among the general population is required.
· Odds ratio of X_{2} with X_{1}, which is the odds of having X_{2} (X_{2}=1) comparing the exposed with the non-exposed, is required. If X_{2} is independent of X_{1}, set this to 1.
· Main effect (odds ratio) of X_{1} is required.
· Main effect (odds ratio) of X_{2} is required.
· Odds ratio for the interaction term, which is the effect to detect, is required.
· Sample size, which is the number of the non-exposed and the number of the exposed, is required.
Model dichotomous outcome (Y), detect the interaction effect of the exposure under study (which is dichotomous X_{1}) and another continuous risk factor X_{2}:
· Prevalence of Y among the non-exposed population is required.
· Mean and standard deviation of X_{2} among the general population are required.
· Effect of X_{1} on X_{2}, which is the difference in the mean of X_{2} comparing the exposed with the non-exposed, is required. If X_{2} is independent of X_{1}, set this to 0.
· Main effect (odds ratio) of X_{1} is required.
· Main effect (odds ratio) of X_{2} is required.
· Odds ratio for the interaction term, which is the effect to detect, is required.
· Sample size, which is the number of the non-exposed and the number of the exposed, is required.
Model continuous outcome (Y), detect the interaction effect of the exposure under study (which is dichotomous X_{1}) and another dichotomous risk factor X_{2}:
· Mean and standard deviation of Y among the non-exposed population are required.
· Prevalence of X_{2} among the general population is required.
· Odds ratio of X_{2} with X_{1}, which is the odds of having X_{2} (X_{2}=1) comparing the exposed with the non-exposed, is required. If X_{2} is independent of X_{1}, set this to 1.
· Main effect (regression coefficient) of X_{1} is required.
· Main effect (regression coefficient) of X_{2} is required.
· Regression coefficient for the interaction term, which is the effect to detect, is required.
· Sample size, which is the number of the non-exposed and the number of the exposed, is required.
Model continuous outcome (Y), detect the interaction effect of the exposure under study (which is dichotomous X_{1}) and another continuous risk factor X_{2}:
· Mean and standard deviation of Y among the non-exposed population are required.
· Mean and standard deviation of X_{2} among the general population are required.
· Effect of X_{1} on X_{2}, which is the difference in the mean of X_{2} comparing the exposed with the non-exposed, is required. If X_{2} is independent of X_{1}, set this to 0.
· Main effect (regression coefficient) of X_{1} is required.
· Main effect (regression coefficient) of X_{2} is required.
· Regression coefficient for the interaction term, which is the effect to detect, is required.
· Sample size, which is the number of the non-exposed and the number of the exposed, is required.
Power Simulation for Cross-Sectional Study or Cohort Study That Follows Up One Population
1. Cross-sectional study or cohort study that follows up one population
Instead of identifying two groups (cases and controls in a case-control study, or exposed and non-exposed in a cohort study), this kind of study identifies a single population, collects both exposure and outcome data, and then conducts association tests to find any relationship between exposures and outcomes.
· If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:
Log(p/(1-p)) = β_{0} + β_{1}*X
p/(1-p) is the odds of the outcome (Y), and X is the exposure.
The effect of X on the outcome (Y) is expressed as an odds ratio, which is e^{β_{1}}.
· If the outcome is a continuous measurement, linear regression is applied:
Y = β_{0} + β_{1}*X
The effect of X on Y is expressed as the regression coefficient, which is β_{1}.
3. Regression model with interaction term of two exposures:
· If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:
Log(p/(1-p)) = β_{0} + β_{1}*X_{1} + β_{2}*X_{2} + β_{3}*X_{1}*X_{2}
e^{β_{1}} is the effect (odds ratio) of X_{1} at X_{2}=0, which is called the main effect of X_{1}.
e^{β_{2}} is the effect (odds ratio) of X_{2} at X_{1}=0, which is called the main effect of X_{2}.
e^{β_{3}} reflects the additional effect (odds ratio) when both X_{1} and X_{2} are present, which is called the interaction effect of X_{1} and X_{2}.
· If the outcome is a continuous measurement, linear regression is applied:
Y = β_{0} + β_{1}*X_{1} + β_{2}*X_{2} + β_{3}*X_{1}*X_{2}
The effects on Y are expressed as regression coefficients:
Regression coefficient β_{1} is the effect of X_{1} at X_{2}=0, which is called the main effect of X_{1}.
Regression coefficient β_{2} is the effect of X_{2} at X_{1}=0, which is called the main effect of X_{2}.
Regression coefficient β_{3} reflects the additional effect when both X_{1} and X_{2} are present, which is called the interaction effect of X_{1} and X_{2}.
4. Required parameters for simulation
Model dichotomous outcome (Y) with dichotomous exposure (X):
· Prevalence of Y among the general population is required.
· Prevalence of X among the general population is required.
· Odds ratio of the exposure X on Y, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
Model dichotomous outcome (Y) with continuous exposure (X):
· Prevalence of Y among the general population is required.
· Mean and standard deviation of X among the general population are required.
· Odds ratio of Y per unit change of X, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
Model continuous outcome (Y) with dichotomous exposure (X):
· Mean and standard deviation of Y among the general population are required.
· Prevalence of X among the general population is required.
· Regression coefficient of the exposure X on Y, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
Model continuous outcome (Y) with continuous exposure (X):
· Mean and standard deviation of Y among the general population are required.
· Mean and standard deviation of X among the general population are required.
· Regression coefficient of Y per unit change of X, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
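In the one-population design with a dichotomous outcome and a dichotomous exposure, the prevalence of Y is given for the whole population rather than for the non-exposed, so the intercept β_{0} has to be chosen so that the model reproduces that overall prevalence. One possible way to do this, shown here as a sketch and not as the method Empower(R) uses, is a bisection search on β_{0}. The function names are hypothetical.

```python
import math


def expit(z):
    """Inverse of the logit: converts log odds to a probability."""
    return 1 / (1 + math.exp(-z))


def solve_intercept(prev_y, prev_x, odds_ratio, tol=1e-12):
    """Find beta_0 so that the marginal prevalence of Y equals prev_y.

    Marginal prevalence = P(X=0) * expit(b0) + P(X=1) * expit(b0 + b1),
    which increases with b0, so bisection converges.
    """
    b1 = math.log(odds_ratio)
    lo, hi = -30.0, 30.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        marginal = (1 - prev_x) * expit(mid) + prev_x * expit(mid + b1)
        if marginal < prev_y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


b0 = solve_intercept(prev_y=0.2, prev_x=0.3, odds_ratio=2.0)
b1 = math.log(2.0)
print("P(Y=1 | X=0):", round(expit(b0), 4))
print("P(Y=1 | X=1):", round(expit(b0 + b1), 4))
print("overall:", round(0.7 * expit(b0) + 0.3 * expit(b0 + b1), 4))
```

Once β_{0} is known, each simulated subject gets X from its prevalence and then Y from the logistic model, which keeps both the requested odds ratio and the requested overall prevalence of Y.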
Model dichotomous outcome (Y), detect the interaction effect of a dichotomous exposure X_{1} and another dichotomous risk factor X_{2}:
· Prevalence of Y among the general population is required.
· Prevalence of X_{1} among the general population is required.
· Prevalence of X_{2} among the general population is required.
· Odds ratio of X_{2} with X_{1}, which is the odds of having X_{2} (X_{2}=1) when X_{1}=1 compared to the odds when X_{1}=0, is required. If X_{2} is independent of X_{1}, set this to 1.
· Main effect (odds ratio) of X_{1} is required.
· Main effect (odds ratio) of X_{2} is required.
· Odds ratio for the interaction term, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
Model dichotomous outcome (Y), detect the interaction effect of a dichotomous exposure X_{1} and another continuous risk factor X_{2}:
· Prevalence of Y among the general population is required.
· Prevalence of X_{1} among the general population is required.
· Mean and standard deviation of X_{2} among the general population are required.
· Effect of X_{1} on X_{2}, which is the difference between the mean of X_{2} when X_{1}=1 and the mean when X_{1}=0, is required. If X_{2} is independent of X_{1}, set this to 0.
· Main effect (odds ratio) of X_{1} is required.
· Main effect (odds ratio) of X_{2} is required.
· Odds ratio for the interaction term, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
Model continuous outcome (Y), detect the interaction effect of a dichotomous exposure X_{1} and another dichotomous risk factor X_{2}:
· Mean and standard deviation of Y among the general population are required.
· Prevalence of X_{1} among the general population is required.
· Prevalence of X_{2} among the general population is required.
· Odds ratio of X_{2} with X_{1}, which is the odds of having X_{2} (X_{2}=1) when X_{1}=1 compared to the odds when X_{1}=0, is required. If X_{2} is independent of X_{1}, set this to 1.
· Main effect (regression coefficient) of X_{1} is required.
· Main effect (regression coefficient) of X_{2} is required.
· Regression coefficient for the interaction term, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
Model continuous outcome (Y), detect the interaction effect of a dichotomous exposure X_{1} and another continuous risk factor X_{2}:
· Mean and standard deviation of Y among the general population are required.
· Prevalence of X_{1} among the general population is required.
· Mean and standard deviation of X_{2} among the general population are required.
· Effect of X_{1} on X_{2}, which is the difference between the mean of X_{2} when X_{1}=1 and the mean when X_{1}=0, is required. If X_{2} is independent of X_{1}, set this to 0.
· Main effect (regression coefficient) of X_{1} is required.
· Main effect (regression coefficient) of X_{2} is required.
· Regression coefficient for the interaction term, which is the effect to detect, is required.
· Sample size, the total number of subjects, is required.
Empower(R) PowerSimulation Package
Empower(R) is a statistical data analyzer, which can be downloaded from http://www.empowerstats.com/
[Screenshot: a sample input screen]
Power by sample size and effect
Sample size               ia (OR:1.2)  ia (OR:1.5)  ia (OR:1.8)  x1 (OR:1.2)  x1 (OR:1.5)  x1 (OR:1.8)  x2 (OR:1.2)  x2 (OR:1.5)  x2 (OR:1.8)
N.ctrl=200, N.case=200    0.053        0.129        0.215        0.093        0.1          0.1          0.265        0.239        0.241
N.ctrl=250, N.case=200    0.051        0.144        0.241        0.105        0.117        0.099        0.293        0.251        0.263
N.ctrl=300, N.case=200    0.06         0.155        0.271        0.11         0.122        0.122        0.314        0.276        0.297
N.ctrl=350, N.case=200    0.061        0.153        0.298        0.119        0.119        0.134        0.321        0.294        0.304
N.ctrl=400, N.case=200    0.062        0.169        0.305        0.125        0.123        0.137        0.335        0.296        0.314
Parameters:
1. Prevalence of exposure X1 among general population: 0.3
2. Prevalence of exposure X2 among general population: 0.3
3. Main effect (odds ratio) of Exposure X1: 1.2
4. Main effect (odds ratio) of Exposure X2: 1.4
5. Odds ratio of Exposure X1 with Exposure X2: 1.2
Power simulation by Empower(R) (www.empowerstats.com) on Thu Jun 23 20:57:45 2011
1. ia: power to detect the interaction of X1 and X2
2. x1: power to detect the main effect of X1
3. x2: power to detect the main effect of X2
[Plot: power to detect the interaction effect]
Sample size:
· For a case-control study, the number of cases and the number of controls are required. One can fix the number of controls and enter up to 5 different numbers of cases, or fix the number of cases and enter up to 5 different numbers of controls, to get the power for each sample size.
· For a cohort study, the number of the exposed and the number of the non-exposed are required. One can fix the number of the non-exposed and enter up to 5 different numbers of the exposed, or fix the number of the exposed and enter up to 5 different numbers of the non-exposed, to get the power for each sample size.
· For a cross-sectional study, one can enter up to 5 different sample sizes to get the power for each sample size.
Effect to detect
· One can enter up to 3 different effect sizes to calculate the power of detecting each effect.
The significance level: alpha
· The default is 0.05. One can change it by altering the “α =” box.
Number of replicates
· The default number of runs is 1000. Power is calculated as the proportion of the 1000 p-values, obtained from 1000 regressions on 1000 randomly generated data sets, that are less than alpha.
· One can change the number of replicates by changing the “replicates” box.
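Because power is estimated as a proportion over a finite number of replicates, the estimate itself carries sampling error. A rough guide, using the standard binomial formula rather than anything specific to Empower(R), is that the standard error of an estimated power p from N replicates is sqrt(p(1-p)/N). The function name below is hypothetical.

```python
import math


def power_standard_error(power, n_rep):
    """Binomial standard error of a power estimated from n_rep replicates."""
    return math.sqrt(power * (1 - power) / n_rep)


# How much an estimated power of 0.215 could vary by chance alone.
for n_rep in (1000, 10000):
    print(n_rep, round(power_standard_error(0.215, n_rep), 4))
```

With the default of 1000 replicates, an estimated power near 0.2 has a standard error of about 0.013, so differences of one or two hundredths between neighbouring settings may be simulation noise. Increasing the number of replicates tenfold shrinks the standard error by a factor of about 3.2.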
Output
· The default output includes the table of power by sample size and effect size. If 3 or more sample sizes are given, a plot of power versus sample size is included in the output file.
· A power is calculated for each setting (sample size and effect to detect). In the output table(s), the columns are labeled to indicate the setting. For example, in the sample output above, ia (OR:1.2) represents the power to detect an interaction (ia) effect of OR=1.2. Besides the power to detect the interaction, the power to detect the main effects of X1 and X2 is also included in the output table. The x1 (OR:1.2) column represents the power to detect the main effect of X1 when the interaction effect is OR=1.2.
· To include the regression coefficients and p-values from each run in the output, check the box “Output simulation model (β and p-value)”.