Power Analysis Using the Simulation Method

------ Usage of Empower(R) PowerSimulation Package

Learn by following the exercises

 

Table of contents

·         Introduction to the simulation method

·         Power Simulation for Case Control Study

o   The definition of case control study

o   The regression model

o   The regression model with interactions

o   The required parameters for simulation

·         Power Simulation for Cohort Study (follow up exposed and non-exposed)

o   The definition of cohort study

o   The regression model

o   The regression model with interactions

o   The required parameters for simulation

·         Power Simulation for Cross-Sectional Study or Cohort Study (follow up one population)

o   The definition of cross-sectional study

o   The regression model

o   The regression model with interactions

o   The required parameters for simulation

·         Usage of Empower(R) PowerSimulation package

o   A sample input screen shot

o   A sample output screen shot

o   Other input options

 

The simulation method:

1.      Use the underlying model to generate random data with (a) specified sample sizes, (b) parameter values that express the effect one is trying to detect, and (c) nuisance parameters such as means, variances, prevalence.

2.      Run the regression on the randomly generated data. Save the parameter estimates and p-value.

3.      Repeat the above steps many times, say, N. The estimated power is the proportion of runs (out of N) with p-value less than the specified significance level alpha.
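The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Empower(R) implementation: it assumes a two-group design with one dichotomous exposure, and the function names `wald_p_value_2x2` and `simulate_power` are hypothetical. For a single dichotomous X, the logistic-regression estimate of β1 is the log odds ratio of the 2x2 table and its Wald standard error is the square root of the sum of the reciprocal cell counts, so no regression library is needed here.

```python
import math
import random


def wald_p_value_2x2(a, b, c, d):
    """Two-sided Wald p-value for the log odds ratio of a 2x2 table.

    a, b: exposed with / without the outcome
    c, d: non-exposed with / without the outcome
    """
    if min(a, b, c, d) == 0:
        # Estimate undefined; this sketch counts the run as non-significant.
        return 1.0
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # erfc(z / sqrt(2)) is the two-sided tail area of the standard normal
    return math.erfc(abs(log_or) / se / math.sqrt(2))


def simulate_power(n_exposed, n_unexposed, p0, odds_ratio,
                   alpha=0.05, n_runs=1000, seed=1):
    """Estimate power as the proportion of runs with p-value below alpha."""
    rng = random.Random(seed)
    odds1 = odds_ratio * p0 / (1 - p0)  # outcome odds among the exposed
    p1 = odds1 / (1 + odds1)
    significant = 0
    for _ in range(n_runs):
        # Step 1: generate random data under the assumed model
        a = sum(rng.random() < p1 for _ in range(n_exposed))
        c = sum(rng.random() < p0 for _ in range(n_unexposed))
        # Step 2: test the effect and keep the p-value
        p_value = wald_p_value_2x2(a, n_exposed - a, c, n_unexposed - c)
        significant += p_value < alpha
    # Step 3: proportion of significant runs out of n_runs
    return significant / n_runs
```

Calling `simulate_power(200, 200, p0=0.2, odds_ratio=1.8)` returns the estimated power for 200 exposed and 200 non-exposed subjects; setting `odds_ratio=1.0` should return a value close to alpha.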

 

Power Simulation for Case Control Study

1.      Case Control Study

A case-control study compares subjects who have a disease or outcome of interest (cases) with subjects who do not (controls). It looks back retrospectively to compare how frequently an exposure is present in each group (if the exposure is dichotomous, such as smoking status), or the level of the exposure in each group (if the exposure is a continuous measurement, such as body mass index), to determine the relationship between the risk factor and the disease.

 

2.      The regression model:

 

The outcome Y is the case or control status, which is dichotomous, so a logistic regression model is applied:

Log(p/(1-p)) = β0 + β1*X

where p/(1-p) is the odds of the outcome (disease) and X is the exposure.

If X is dichotomous (0 or 1), exp(β1) is the odds ratio of disease comparing the exposed (X=1) with the non-exposed (X=0). If X is continuous (such as age in years), exp(β1) is the odds ratio of disease per one-unit change in the exposure.
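The interpretation of the exponentiated coefficient can be checked numerically. The sketch below uses illustrative values for β0 and β1 (they are assumptions, not values from this document) and hypothetical helper names.

```python
import math


def outcome_probability(beta0, beta1, x):
    """Invert the logit: p = 1 / (1 + exp(-(beta0 + beta1 * x)))."""
    return 1 / (1 + math.exp(-(beta0 + beta1 * x)))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)


# Illustrative values only (not taken from the source document)
beta0, beta1 = -2.0, math.log(1.5)

# Dichotomous X: odds ratio comparing X=1 with X=0
or_dichotomous = (odds(outcome_probability(beta0, beta1, 1))
                  / odds(outcome_probability(beta0, beta1, 0)))

# Continuous X: odds ratio for a one-unit change, here from X=2 to X=3
or_per_unit = (odds(outcome_probability(beta0, beta1, 3))
               / odds(outcome_probability(beta0, beta1, 2)))
```

Both ratios equal the exponential of β1 (1.5 with these values), whatever the starting value of X.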

 

3.      Regression model with interaction term of two exposures:

 

Log(p/(1-p)) = β0 + β1*X1 + β2*X2 + β3*X1*X2

exp(β1) is the effect (odds ratio) of X1 at X2=0, which is called the main effect of X1.

exp(β2) is the effect (odds ratio) of X2 at X1=0, which is called the main effect of X2.

exp(β3) reflects the additional effect (odds ratio) when both X1 and X2 are present, which is called the interaction effect of X1 and X2.
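As a small worked example of how the three exponentiated coefficients combine, the following sketch computes the odds ratio of any (X1, X2) combination relative to the reference group X1=0, X2=0. The function name is hypothetical.

```python
def joint_odds_ratio(or_x1, or_x2, or_interaction, x1, x2):
    """Odds ratio of (x1, x2) relative to the reference group X1=0, X2=0.

    Equivalent to exp(beta1*x1 + beta2*x2 + beta3*x1*x2), where
    or_x1 = exp(beta1), or_x2 = exp(beta2), or_interaction = exp(beta3).
    """
    return (or_x1 ** x1) * (or_x2 ** x2) * (or_interaction ** (x1 * x2))
```

With main effects of 1.2 and 1.4 and an interaction odds ratio of 1.5, subjects with both exposures have 1.2 × 1.4 × 1.5 = 2.52 times the odds of the reference group; without the interaction the figure would be 1.68.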

 

4.      Required parameters for simulation

 

Model with one dichotomous exposure (X):

·         Prevalence of X in the general population is required.

·         Odds ratio of X, which is the effect to detect, is required.

·         Sample size, that is, the number of cases and the number of controls, is required.

 

Model with one continuous exposure (X):

·         Mean and standard deviation of X in the general population are required.

·         Odds ratio of X (per unit change), which is the effect to detect, is required.

·         Sample size, that is, the number of cases and the number of controls, is required.
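For the two single-exposure models above, the parameters determine how the exposure is distributed among cases and controls. The sketch below is one way to derive those distributions, under the assumption (not stated in this document) that the general-population values describe the controls and that a continuous X is normally distributed. Under the logistic model, the exposure odds among cases are the odds among controls multiplied by the odds ratio, and a normal X keeps its standard deviation while its mean shifts by β1 times the variance. All function names are hypothetical.

```python
import math
import random


def exposure_prevalence_in_cases(prev_controls, odds_ratio):
    """Dichotomous X: exposure odds in cases = OR * exposure odds in controls."""
    odds_cases = odds_ratio * prev_controls / (1 - prev_controls)
    return odds_cases / (1 + odds_cases)


def exposure_mean_in_cases(mean_controls, sd, odds_ratio_per_unit):
    """Continuous normal X: the mean shifts by beta1 * sd**2, the SD is unchanged."""
    return mean_controls + math.log(odds_ratio_per_unit) * sd ** 2


def draw_case_control_sample(n_cases, n_controls, mean, sd,
                             odds_ratio_per_unit, seed=1):
    """Draw one simulated data set of continuous exposures."""
    rng = random.Random(seed)
    mean_cases = exposure_mean_in_cases(mean, sd, odds_ratio_per_unit)
    cases = [rng.gauss(mean_cases, sd) for _ in range(n_cases)]
    controls = [rng.gauss(mean, sd) for _ in range(n_controls)]
    return cases, controls
```

For example, with an exposure prevalence of 0.3 among controls and an odds ratio of 1.5, the prevalence among cases is 9/23, about 0.39.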

 

Model the interaction of a dichotomous X1 and a dichotomous X2:

·         Prevalence of X1 in the general population is required.

·         Prevalence of X2 in the general population is required.

·         Odds ratio of X2 with X1, which is the odds of having X2 (X2=1) when X1=1 compared with the odds when X1=0, is required. If X2 is independent of X1, set this to 1.

·         Main effect (odds ratio) of X1 is required.

·         Main effect (odds ratio) of X2 is required.

·         Odds ratio for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the number of cases and the number of controls, is required.
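To generate correlated dichotomous exposures from the parameters above, the two prevalences and the odds ratio of X2 with X1 have to be turned into the probability of X2=1 within each level of X1. The sketch below is one possible way to do this, by bisection; it is an assumption about how the parameters could be used, not a description of the Empower(R) internals, and the function name is hypothetical.

```python
def conditional_prevalences(prev_x1, prev_x2, or_x2_x1, tol=1e-12):
    """Return (q0, q1) = (P(X2=1 | X1=0), P(X2=1 | X1=1)).

    q0 and q1 satisfy two conditions:
      odds(q1) = or_x2_x1 * odds(q0)
      (1 - prev_x1) * q0 + prev_x1 * q1 = prev_x2
    """
    def q1_given(q0):
        odds = or_x2_x1 * q0 / (1 - q0)
        return odds / (1 + odds)

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The marginal prevalence of X2 increases with q0, so bisect on it
        marginal = (1 - prev_x1) * mid + prev_x1 * q1_given(mid)
        if marginal < prev_x2:
            lo = mid
        else:
            hi = mid
    q0 = (lo + hi) / 2
    return q0, q1_given(q0)
```

When the odds ratio is 1, both conditional prevalences equal the overall prevalence of X2, which matches the independence case described above.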

 

Model the interaction of a dichotomous X1 and a continuous X2:

·         Prevalence of X1 in the general population is required.

·         Mean and standard deviation of X2 in the general population are required.

·         Effect of X1 on X2, which is the difference in mean X2 between X1=1 and X1=0, is required. If X2 is independent of X1, set this to 0.

·         Main effect (odds ratio) of X1 is required.

·         Main effect (odds ratio) of X2 is required.

·         Odds ratio for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the number of cases and the number of controls, is required.
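When X2 is continuous, the overall mean and standard deviation together with the mean difference between X1 groups fix the distribution of X2 within each group. The sketch below works this out under two assumptions that are not stated in this document: the given mean and standard deviation describe the whole population (both X1 groups combined), and both groups share the same within-group standard deviation. The function name is hypothetical.

```python
import math


def x2_distribution_by_x1(prev_x1, mean_x2, sd_x2, mean_difference):
    """Return (mean of X2 when X1=0, mean when X1=1, within-group SD).

    Overall mean     = mean0 + prev_x1 * mean_difference
    Overall variance = within variance
                       + prev_x1 * (1 - prev_x1) * mean_difference**2
    """
    mean0 = mean_x2 - prev_x1 * mean_difference
    mean1 = mean0 + mean_difference
    within_var = sd_x2 ** 2 - prev_x1 * (1 - prev_x1) * mean_difference ** 2
    if within_var <= 0:
        raise ValueError("mean difference is too large for the overall SD")
    return mean0, mean1, math.sqrt(within_var)
```

With a mean difference of 0 the function returns the overall mean for both groups and the overall standard deviation, which is the independence case.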

 

Power Simulation for Cohort Study

1.      Cohort study

 

A cohort study is a study in which individuals exposed to a suspected factor (exposed) and individuals not exposed to the factor (non-exposed) are identified and then observed over some period for the occurrence of certain health effects (such as disease), or for changes in the level of a trait measured on a continuous scale (such as systolic blood pressure). The outcomes of interest are measured and related to the estimated exposure levels. Cohort studies can be performed either prospectively or retrospectively from historical records.

 

2.      The regression model:  

 

·         If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:

Log(p/(1-p)) = β0 + β1*X

where p/(1-p) is the odds of the outcome (disease) and X is the exposure.

The effect of X on the outcome (Y) is expressed as an odds ratio, which is exp(β1).

·         If the outcome is a continuous measurement, linear regression is applied:

Y = β0 + β1*X

The effect of X on Y is expressed as the regression coefficient, which is β1

 

3.      Regression model with interaction term of two exposures:

 

·         If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:

 

Log(p/(1-p)) = β0 + β1*X1 + β2*X2 + β3*X1*X2

exp(β1) is the effect (odds ratio) of X1 at X2=0, which is called the main effect of X1.

exp(β2) is the effect (odds ratio) of X2 at X1=0, which is called the main effect of X2.

exp(β3) reflects the additional effect (odds ratio) when both X1 and X2 are present, which is called the interaction effect of X1 and X2.

 

·         If the outcome is a continuous measurement, linear regression is applied:

 

Y = β0 + β1*X1 + β2*X2 + β3*X1*X2

The effects of X1 and X2 on Y are expressed as regression coefficients:

Regression coefficient β1 is the effect of X1 at X2=0, which is called the main effect of X1.

Regression coefficient β2 is the effect of X2 at X1=0, which is called the main effect of X2.

Regression coefficient β3 reflects the additional effect when both X1 and X2 are present, which is called the interaction effect of X1 and X2.

 

4.      Required parameters for simulation

Model with dichotomous outcome (Y):

·         Prevalence of Y in the non-exposed population is required.

·         Odds ratio of the exposure (X), which is the effect to detect, is required.

·         Sample size, that is, the number of the non-exposed and the number of the exposed, is required.

 

Model with continuous outcome (Y):

 

·         Mean and standard deviation of Y in the non-exposed population are required.

·         Regression coefficient of X on Y, which is the effect to detect, is required.

·         Sample size, that is, the number of the non-exposed and the number of the exposed, is required.
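For the continuous-outcome model above, a power simulation can be sketched as follows. With a single dichotomous exposure, the least-squares estimate of β1 is the difference between the two group means, and its standard error comes from the pooled residual variance. This sketch makes assumptions beyond the text: Y is normally distributed with the same standard deviation in both groups, and the p-value uses a normal approximation to the t distribution, which is reasonable only for moderately large samples. Function names are hypothetical.

```python
import math
import random
import statistics


def slope_p_value(y_unexposed, y_exposed):
    """Two-sided p-value for beta1 in Y = beta0 + beta1*X with X in {0, 1}."""
    n0, n1 = len(y_unexposed), len(y_exposed)
    beta1_hat = statistics.fmean(y_exposed) - statistics.fmean(y_unexposed)
    pooled_var = ((n0 - 1) * statistics.variance(y_unexposed)
                  + (n1 - 1) * statistics.variance(y_exposed)) / (n0 + n1 - 2)
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    # Normal approximation to the t distribution (assumption of this sketch)
    return math.erfc(abs(beta1_hat) / se / math.sqrt(2))


def simulate_power_continuous(n_unexposed, n_exposed, mean0, sd, beta1,
                              alpha=0.05, n_runs=1000, seed=1):
    """Proportion of simulated data sets in which beta1 is significant."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_runs):
        y0 = [rng.gauss(mean0, sd) for _ in range(n_unexposed)]
        y1 = [rng.gauss(mean0 + beta1, sd) for _ in range(n_exposed)]
        significant += slope_p_value(y0, y1) < alpha
    return significant / n_runs
```

For instance, `simulate_power_continuous(100, 100, mean0=120, sd=15, beta1=5)` estimates the power to detect a 5-unit difference in the outcome with 100 subjects per group.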

 

Model with dichotomous outcome (Y), detecting the interaction effect of the exposure under study (dichotomous X1) and another dichotomous risk factor X2:

 

·         Prevalence of Y in the non-exposed population is required.

·         Prevalence of X2 in the general population is required.

·         Odds ratio of X2 with X1, which compares the odds of having X2 (X2=1) in the exposed with the odds in the non-exposed, is required. If X2 is independent of X1, set this to 1.

·         Main effect (odds ratio) of X1 is required.

·         Main effect (odds ratio) of X2 is required.

·         Odds ratio for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the number of the non-exposed and the number of the exposed, is required.

 

Model with dichotomous outcome (Y), detecting the interaction effect of the exposure under study (dichotomous X1) and another continuous risk factor X2:

 

·         Prevalence of Y in the non-exposed population is required.

·         Mean and standard deviation of X2 in the general population are required.

·         Effect of X1 on X2, which is the difference in mean X2 between the exposed and the non-exposed, is required. If X2 is independent of X1, set this to 0.

·         Main effect (odds ratio) of X1 is required.

·         Main effect (odds ratio) of X2 is required.

·         Odds ratio for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the number of the non-exposed and the number of the exposed, is required.

 

Model with continuous outcome (Y), detecting the interaction effect of the exposure under study (dichotomous X1) and another dichotomous risk factor X2:

 

·         Mean and standard deviation of Y in the non-exposed population are required.

·         Prevalence of X2 in the general population is required.

·         Odds ratio of X2 with X1, which compares the odds of having X2 (X2=1) in the exposed with the odds in the non-exposed, is required. If X2 is independent of X1, set this to 1.

·         Main effect (regression coefficient) of X1 is required.

·         Main effect (regression coefficient) of X2 is required.

·         Regression coefficient for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the number of the non-exposed and the number of the exposed, is required.

 

Model with continuous outcome (Y), detecting the interaction effect of the exposure under study (dichotomous X1) and another continuous risk factor X2:

 

·         Mean and standard deviation of Y in the non-exposed population are required.

·         Mean and standard deviation of X2 in the general population are required.

·         Effect of X1 on X2, which is the difference in mean X2 between the exposed and the non-exposed, is required. If X2 is independent of X1, set this to 0.

·         Main effect (regression coefficient) of X1 is required.

·         Main effect (regression coefficient) of X2 is required.

·         Regression coefficient for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the number of the non-exposed and the number of the exposed, is required.

 

Power Simulation for Cross-Sectional Study or Cohort Study (follow up one population)

 

1.      Cross-sectional study or cohort study that follows up one population

 

Instead of identifying two groups (cases and controls in a case-control study, or exposed and non-exposed in a cohort study), this kind of study identifies a single population, collects data on both exposures and outcomes, and then conducts association tests to find any relationship between the exposures and the outcomes.

 

2.      The regression model:  

 

·         If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:

Log(p/(1-p)) = β0 + β1*X

where p/(1-p) is the odds of the outcome (Y) and X is the exposure.

The effect of X on the outcome (Y) is expressed as an odds ratio, which is exp(β1).

·         If the outcome is a continuous measurement, linear regression is applied:

Y = β0 + β1*X

The effect of X on Y is expressed as the regression coefficient, which is β1.

 

3.      Regression model with interaction term of two exposures:

 

·         If the outcome is a dichotomous status (0=no, 1=yes), logistic regression is applied:

Log(p/(1-p)) = β0 + β1*X1 + β2*X2 + β3*X1*X2

exp(β1) is the effect (odds ratio) of X1 at X2=0, which is called the main effect of X1.

exp(β2) is the effect (odds ratio) of X2 at X1=0, which is called the main effect of X2.

exp(β3) reflects the additional effect (odds ratio) when both X1 and X2 are present, which is called the interaction effect of X1 and X2.

·         If the outcome is a continuous measurement, linear regression is applied:

Y = β0 + β1*X1 + β2*X2 + β3*X1*X2

The effects of X1 and X2 on Y are expressed as regression coefficients:

Regression coefficient β1 is the effect of X1 at X2=0, which is called the main effect of X1.

Regression coefficient β2 is the effect of X2 at X1=0, which is called the main effect of X2.

Regression coefficient β3 reflects the additional effect when both X1 and X2 are present, which is called the interaction effect of X1 and X2.

 

4.      Required parameters for simulation

 

Model with dichotomous outcome (Y) and dichotomous exposure (X):

·         Prevalence of Y in the general population is required.

·         Prevalence of X in the general population is required.

·         Odds ratio of the exposure X on Y, which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.
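In this design the prevalence of Y is given for the whole population rather than for the non-exposed, so the intercept β0 is not known directly. One way to obtain it, sketched below, is to search for the β0 at which the prevalence of Y averaged over exposed and non-exposed subjects equals the specified value. This is an illustrative assumption about how the parameters can be combined, and `solve_intercept` is a hypothetical name.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1 / (1 + math.exp(-z))


def solve_intercept(prev_y, prev_x, odds_ratio, tol=1e-12):
    """Find beta0 so that the marginal prevalence of Y equals prev_y.

    Marginal prevalence = (1 - prev_x) * expit(beta0)
                          + prev_x * expit(beta0 + beta1)
    """
    beta1 = math.log(odds_ratio)
    lo, hi = -30.0, 30.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        marginal = (1 - prev_x) * expit(mid) + prev_x * expit(mid + beta1)
        # The marginal prevalence increases with beta0, so bisect on it
        if marginal < prev_y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With β0 in hand, each simulated subject can be assigned X with probability `prev_x` and then Y with probability `expit(beta0 + beta1 * x)`.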

 

Model with dichotomous outcome (Y) and continuous exposure (X):

·         Prevalence of Y in the general population is required.

·         Mean and standard deviation of X in the general population are required.

·         Odds ratio of Y per unit change in X, which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.

 

Model with continuous outcome (Y) and dichotomous exposure (X):

 

·         Mean and standard deviation of Y in the general population are required.

·         Prevalence of X in the general population is required.

·         Regression coefficient of the exposure X on Y, which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.

 

Model with continuous outcome (Y) and continuous exposure (X):

 

·         Mean and standard deviation of Y in the general population are required.

·         Mean and standard deviation of X in the general population are required.

·         Regression coefficient of X on Y (per unit change in X), which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.

 

Model with dichotomous outcome (Y), detecting the interaction effect of a dichotomous exposure X1 and another dichotomous risk factor X2:

 

·         Prevalence of Y in the general population is required.

·         Prevalence of X1 in the general population is required.

·         Prevalence of X2 in the general population is required.

·         Odds ratio of X2 with X1, which is the odds of having X2 (X2=1) when X1=1 compared with the odds when X1=0, is required. If X2 is independent of X1, set this to 1.

·         Main effect (odds ratio) of X1 is required.

·         Main effect (odds ratio) of X2 is required.

·         Odds ratio for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.

 

Model with dichotomous outcome (Y), detecting the interaction effect of a dichotomous exposure X1 and another continuous risk factor X2:

 

·         Prevalence of Y in the general population is required.

·         Prevalence of X1 in the general population is required.

·         Mean and standard deviation of X2 in the general population are required.

·         Effect of X1 on X2, which is the difference in mean X2 between X1=1 and X1=0, is required. If X2 is independent of X1, set this to 0.

·         Main effect (odds ratio) of X1 is required.

·         Main effect (odds ratio) of X2 is required.

·         Odds ratio for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.

 

Model with continuous outcome (Y), detecting the interaction effect of a dichotomous exposure X1 and another dichotomous risk factor X2:

 

·         Mean and standard deviation of Y in the general population are required.

·         Prevalence of X1 in the general population is required.

·         Prevalence of X2 in the general population is required.

·         Odds ratio of X2 with X1, which is the odds of having X2 (X2=1) when X1=1 compared with the odds when X1=0, is required. If X2 is independent of X1, set this to 1.

·         Main effect (regression coefficient) of X1 is required.

·         Main effect (regression coefficient) of X2 is required.

·         Regression coefficient for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.

 

Model with continuous outcome (Y), detecting the interaction effect of a dichotomous exposure X1 and another continuous risk factor X2:

 

·         Mean and standard deviation of Y in the general population are required.

·         Prevalence of X1 in the general population is required.

·         Mean and standard deviation of X2 in the general population are required.

·         Effect of X1 on X2, which is the difference in mean X2 between X1=1 and X1=0, is required. If X2 is independent of X1, set this to 0.

·         Main effect (regression coefficient) of X1 is required.

·         Main effect (regression coefficient) of X2 is required.

·         Regression coefficient for the interaction term, which is the effect to detect, is required.

·         Sample size, that is, the total number of subjects, is required.

 

 

Empower(R) PowerSimulation Package

 

Empower(R) is a statistical data analyzer, which can be downloaded from http://www.empowerstats.com/

A sample input screen shot:

 

A sample output screen shot:

Power by sample size and effect

Sample size             ia (OR:1.2)  ia (OR:1.5)  ia (OR:1.8)  x1 (OR:1.2)  x1 (OR:1.5)  x1 (OR:1.8)  x2 (OR:1.2)  x2 (OR:1.5)  x2 (OR:1.8)
N.ctrl=200, N.case=200  0.053        0.129        0.215        0.093        0.1          0.1          0.265        0.239        0.241
N.ctrl=250, N.case=200  0.051        0.144        0.241        0.105        0.117        0.099        0.293        0.251        0.263
N.ctrl=300, N.case=200  0.06         0.155        0.271        0.11         0.122        0.122        0.314        0.276        0.297
N.ctrl=350, N.case=200  0.061        0.153        0.298        0.119        0.119        0.134        0.321        0.294        0.304
N.ctrl=400, N.case=200  0.062        0.169        0.305        0.125        0.123        0.137        0.335        0.296        0.314

Parameters: 
 1. Prevalence of exposure X1 among general population: 0.3 
 2. Prevalence of exposure X2 among general population: 0.3 
 3. Main effect (odds ratio) of Exposure X1: 1.2 
 4. Main effect (odds ratio) of Exposure X2: 1.4 
 5. Odds ratio of Exposure X1 with Exposure X2: 1.2 
 Power simulation by Empower(R) (www.empowerstats.com) on Thu Jun 23 20:57:45 2011 
 1. ia: power for detect the interaction of X1 and X2 
 2. x1: power for detect the main effect of X1 
 3. x2: power for detect the main effect of X2 

 

Plot of the power to detect the interaction effect:

Other options

 

Sample size:

 

·         For a case-control study, the number of cases and the number of controls are required. One can fix the number of controls and enter up to 5 different numbers of cases to obtain the power for each sample size, or fix the number of cases and enter up to 5 different numbers of controls.

 

·         For a cohort study, the number of the exposed and the number of the non-exposed are required. One can fix the number of the non-exposed and enter up to 5 different numbers of the exposed to obtain the power for each sample size, or fix the number of the exposed and enter up to 5 different numbers of the non-exposed.

 

·         For a cross-sectional study, one can enter up to 5 different sample sizes to obtain the power for each.

 

Effect to detect

 

·         One can enter up to 3 different effect sizes to calculate the power to detect each of them.
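The sample size and effect options above define a grid of settings, with one power estimate per combination. The sketch below enumerates such a grid for a case-control study with a fixed number of cases. The limits of 5 sample sizes and 3 effects come from the text; the function name and the dictionary keys are hypothetical, not Empower(R) identifiers.

```python
from itertools import product

MAX_SAMPLE_SIZES = 5  # limit stated in the text
MAX_EFFECTS = 3       # limit stated in the text


def build_settings(n_controls_list, n_cases, odds_ratios):
    """List every (sample size, effect) combination to be simulated."""
    if len(n_controls_list) > MAX_SAMPLE_SIZES:
        raise ValueError("at most 5 different sample sizes are allowed")
    if len(odds_ratios) > MAX_EFFECTS:
        raise ValueError("at most 3 different effects are allowed")
    return [
        {"n_controls": n_ctrl, "n_cases": n_cases, "odds_ratio": odds_ratio}
        for n_ctrl, odds_ratio in product(n_controls_list, odds_ratios)
    ]


# Values taken from the sample output: 5 control sizes, 200 cases, 3 odds ratios
settings = build_settings([200, 250, 300, 350, 400], 200, [1.2, 1.5, 1.8])
```

Five sample sizes and three effects give 15 settings, which correspond to the 5 rows and 3 interaction columns of the sample output table.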

 

The significance level: alpha

·         The default is 0.05; one can change it by altering the “α =” box.

 

Number of replicates

·         The default number of runs is 1000. Power is calculated as the proportion of the 1000 p-values, obtained from 1000 regressions on 1000 randomly generated data sets, that are less than alpha.

·         One can change the number of replicates by changing the “replicates” box.
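Because each run either is or is not significant, a simulated power is a binomial proportion, and its precision depends on the number of replicates. The sketch below computes the standard error and an approximate 95% interval. This is a general statistical property rather than a feature of the package output, and the function names are hypothetical.

```python
import math


def power_standard_error(power, n_runs):
    """Binomial standard error of a simulated power estimate."""
    return math.sqrt(power * (1 - power) / n_runs)


def power_interval(power, n_runs, z=1.96):
    """Approximate 95% interval (normal approximation), clipped to [0, 1]."""
    half_width = z * power_standard_error(power, n_runs)
    return max(0.0, power - half_width), min(1.0, power + half_width)
```

With 1000 replicates the standard error is at most about 0.016 (reached when the power is 0.5), so small differences between neighbouring settings can be simulation noise. Quadrupling the number of replicates halves the standard error.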

 

Output

·         The default output includes a table of power by sample size and effect size. If 3 or more sample sizes are given, a plot of power versus sample size is included in the output file.

·         A power is calculated for each setting (sample size and effect to detect). In the output table(s), the columns are labeled to indicate the setting. For example, in the sample output above, ia (OR:1.2) represents the power to detect an interaction (ia) effect of OR=1.2. Besides the power to detect the interaction, the power to detect the main effects of X1 and X2 is also included in the output table. The x1 (OR:1.2) column represents the power to detect the main effect of X1 at the setting of interaction effect OR=1.2.

·         To include the regression coefficients and p-values from each run in the output, check the box “Output simulation model (β and p-value)”.