STATISTICAL APPLICATIONS FOR PATIENT ORIENTED PRIMARY HEALTHCARE RESEARCH

Computing Sample Size for a Cohort Comparison Study

Computing sample size for a prospective multiple cohort comparison

The following webulator can be used to estimate the sample size for a prospective multiple cohort comparison study. The essential elements required for the computation are described below. The sample size computations are based on Fleiss (1981). The webulator was initially written using SAS with Perl and HTML. The SAS program was originally written by Dr. P.N. Corey (University of Toronto) and includes a continuity correction that was not included in Fleiss' original calculations. However, since SAS is a licensed program, I rewrote the program in PHP. Unlike the original SAS program, which used a probit function to accept any alpha and beta value, this program uses fixed values for alpha and beta, which the user can enter based on the following tables.

Conversion table for alpha and percent confidence to Z
alpha (α)	Confidence	Z_alpha
0.10	90%	1.64
0.09	91%	1.70
0.08	92%	1.75
0.07	93%	1.81
0.06	94%	1.88
0.05	95%	1.96
0.04	96%	2.05
0.03	97%	2.17
0.02	98%	2.33
0.01	99%	2.58

Conversion table to create the Z beta term
One-tailed probability	beta (β)	Power	Z_beta
0.05	0.10	90%	1.64
0.06	0.12	88%	1.55
0.07	0.14	86%	1.48
0.08	0.16	84%	1.41
0.09	0.18	82%	1.34
0.10	0.20	80%	1.28
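
The Z values in the two tables above can be reproduced with the inverse of the standard normal cumulative distribution. The short Python sketch below is an illustration only, not the webulator's PHP code, and the function names z_alpha and z_beta_tabulated are hypothetical. It assumes, as the tabulated values imply, that Z_alpha is taken at the two-sided tail probability alpha/2 and that Z_beta is taken at the listed one-tailed probability (beta/2).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_alpha(alpha: float) -> float:
    """Two-sided critical value: the upper tail area is alpha / 2."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def z_beta_tabulated(beta: float) -> float:
    """Z at the one-tailed probability beta / 2, as the Z_beta table lists it."""
    return STD_NORMAL.inv_cdf(1 - beta / 2)


# Print a few rows for comparison with the tables.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  Z_alpha={z_alpha(alpha):.2f}")
for beta in (0.10, 0.20):
    print(f"beta={beta:.2f}  Z_beta={z_beta_tabulated(beta):.2f}")
```

The same functions accept values that fall between the table rows, which is the flexibility the probit function gave the original SAS program.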

The alpha level – also referred to as the level of statistical significance. It is the threshold against which the result of the statistical test is compared to determine if there is something happening in the research question under investigation (i.e. the drug worked, the neighbourhoods differed, more symptoms were reported, the light is brighter, the sound was louder, etc.). The alpha level is also the probability of committing a Type I error (rejecting the null hypothesis when in fact it is true). The typical value for the alpha level is 0.05, with results reported as significant when p<0.05.
The beta level – often associated with statistical power, since Power = 1 - beta. The beta value is the probability of making a Type II statistical error (i.e. failure to reject the null hypothesis when in fact it is false).

According to Cohen (described in Fleiss, 1981), given that committing a Type I error is four times as serious as committing a Type II error, a researcher should set the beta value to 4 x alpha. That is, when a researcher states an alpha value of 0.05, the corresponding beta value should be set to 4 x 0.05 = 0.20.

A beta value of 0.20 therefore indicates the researcher's willingness to accept a 20% chance of missing an effect that actually exists. In terms of power, a beta value of 0.20 represents a statistical power level of 0.80 or 80%. The typical value for the beta level is 0.20.
The ratio – in the computation of sample size for a prospective multiple cohort design the researcher may be faced with cohorts of different sizes. The ratio term describes the relative sizes of the cohort of interest and the control or comparison cohorts. For example, the researcher may wish to consider that the group of interest is half as large as the control group, in which case the ratio value is presented as 0.5:1. Similarly, the researcher may consider a ratio of 2:1 or 3:1 for the group of interest and the control group. In some situations the researcher may consider ratios as high as 5:1, 10:1, or even 20:1.
P₁: (expected proportion in the group of interest) – in the computation of sample size for a prospective multiple cohort design the researcher may have access to previous research that indicates the expected proportion of the outcome for individuals within a given cohort, or the expected proportion of individuals who are considered exposed or who present with a given characteristic in a study.

For example, in previous research on the epidemiology of injuries in ice hockey, Montelpare, Pelletier and Stark (1996) reported injury rates that ranged from 17% to 68%, with an average proportion of injured players among those who body check of about 43% (a proportion value of 0.43).

In computing the sample size for a prospective multiple cohort design the researcher should enter a decimal value to represent the expected proportion (i.e. outcome proportion) in their study.
P₀: (expected proportion in the CONTROL group) – in the computation of sample size for a prospective multiple cohort design the researcher may also have access to previous research that indicates the expected proportion of the outcome for individuals within the control group, that is, the not exposed or not at risk cohort. If you are unsure about this value, enter 0.50, as this would be considered an unbiased level of expected exposure or risk. However, the webulator is set up to take any value for P₀ between 0 and 1.

Sample Size Webulator #4 – Prospective Multiple Cohort Comparison
Enter the Alpha value (e.g. 0.05)
Enter the Beta value (e.g. 0.20)
Enter the number that represents the ratio for the "control cohort participants" when compared to the "cohort of interest". In other words, enter the number of control group participants for each participant from the group of interest (e.g. enter either 0.5, 2, 3, 5, 10, or 20). In the formulas used by Fleiss this estimate is represented by the letter m.
Enter the expected proportion of the event of interest within the subjects at risk in the population. This term is labelled P₁ and the program will compute a Q₁ value (Q₁ = 1 - P₁).
Enter the expected proportion of the event of interest within the cohorts of not at risk individuals in the population (i.e. the controls). This term is labelled P₀ and the program will compute a Q₀ value (Q₀ = 1 - P₀). If the value is unknown, enter 0.5.
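
As an illustration of how the five inputs above could be combined, the following Python sketch applies a normal-approximation sample size formula for two independent proportions with unequal group sizes, followed by a commonly cited continuity correction. Whether this matches the webulator term for term is an assumption: the page does not reproduce its PHP code or its exact correction, so the function name cohort_sample_size, its parameters, and the correction used here are illustrative only. By default Z_beta follows the page's table (one-tailed probability beta/2); the hypothetical switch halve_beta=False uses the probability beta itself, as many textbooks do.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohort_sample_size(alpha, beta, m, p1, p0, halve_beta=True):
    """Return (n_interest, n_control) for comparing two cohort proportions.

    alpha, beta : error probabilities, e.g. 0.05 and 0.20
    m           : control participants per participant of interest
    p1, p0      : expected proportions in the cohort of interest and controls
    halve_beta  : hypothetical switch; True mimics the page's Z_beta table
    """
    if not (0 < alpha < 1 and 0 < beta < 1):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    if m <= 0:
        raise ValueError("m must be positive")
    if not (0 < p1 < 1 and 0 < p0 < 1):
        raise ValueError("p1 and p0 must lie strictly between 0 and 1")
    if p1 == p0:
        raise ValueError("p1 and p0 must differ")

    normal = NormalDist()
    z_a = normal.inv_cdf(1 - alpha / 2)
    z_b = normal.inv_cdf(1 - (beta / 2 if halve_beta else beta))

    q1, q0 = 1 - p1, 1 - p0
    p_bar = (p1 + m * p0) / (m + 1)  # pooled proportion weighted by group size
    q_bar = 1 - p_bar
    delta = abs(p1 - p0)

    # Uncorrected size of the cohort of interest.
    n_raw = (z_a * sqrt((m + 1) * p_bar * q_bar)
             + z_b * sqrt(m * p1 * q1 + p0 * q0)) ** 2 / (m * delta ** 2)

    # Continuity correction (assumed form, not confirmed by the page).
    n_cc = n_raw / 4 * (1 + sqrt(1 + 2 * (m + 1) / (n_raw * m * delta))) ** 2

    n_interest = ceil(n_cc)
    return n_interest, ceil(m * n_interest)


# Example: expected proportion 0.43 in the cohort of interest, 0.50 in controls.
print(cohort_sample_size(alpha=0.05, beta=0.20, m=1, p1=0.43, p0=0.50))
```

The function returns the size of the cohort of interest first and the control cohort second, both rounded up to whole participants. Recruiting more controls per participant (a larger m) lowers the number needed in the cohort of interest.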


For more information, please contact:

Professor William J. Montelpare, Ph.D.,
Margaret and Wallace McCain Chair in Human Development and Health,
Department of Applied Human Sciences, Faculty of Science,
Health Sciences Building, University of Prince Edward Island,
550 Charlottetown, PE, Canada, C1A 4P3
(o) 902 620 5186

Visiting Professor, School of Healthcare, University of Leeds,
Leeds, UK, LS2 9JT
e-mail wmontelpare@upei.ca
Copyright © 2002--ongoing [University of Prince Edward Island]. All rights reserved.