In the previous article, we introduced the concepts of power and type I and type II errors and gave an example of the required steps for sample-size calculations for comparing 2 means. In this article, we will perform a sample calculation for comparison of 2 proportions. Let us briefly remind ourselves of the information we need before we proceed with the example:
The research question.
The principal outcome measure of the trial.
π₁, the anticipated proportion on the standard or control treatment.
π₂, the anticipated proportion on the alternative treatment and hence the minimum clinically important difference (π₂ – π₁) between treatment arms that we would like to detect.
The degree of certainty that we want to detect the treatment difference (power) and the level of significance (type I error).
The required steps are the same as for sample-size calculations for comparing 2 means except that, when we use proportions, we do not need a standard deviation, and we use a different formula.
We are interested in conducting a trial in which we will compare overall lingual retainer failure over a 24-month period between retainers bonded with conventional acid etching vs self-etching primers. The sequence of constructing this study will be as follows.
We must decide what difference between the groups is clinically important and therefore worth detecting. Selecting a difference that is too small to be clinically important will increase the required sample size to impractical levels. Let us assume that we consider an absolute difference of 15 percentage points in the proportion of failures to be clinically important and that if self-etching primers, which are less moisture sensitive, can achieve this reduction in failures over 24 months, they might be worth a second look.
We must make assumptions regarding the expected proportion of failures in the control arm, which will be the conventional acid etching group in this example. Two sources that can help us determine this proportion are previously published studies and a pilot of the trial that we are designing. Lie Sam Foek et al found that the proportion of failures was around 38% for the entire follow-up period with conventional etching. If we want to detect an absolute reduction of 15 percentage points in failures in the self-etching group, we will assume a failure proportion of 23% (38% − 15%) in that group over the entire follow-up period.
We decide on an alpha level of 0.05, or 5%, and a power of 0.90, or 90%.
So far, we have the proportion of failures in the control arm (38%), the minimum difference to be detected (15 percentage points less, ie, 23% in the self-etching arm), and the desired significance and power levels. To carry out this calculation, we will use the formula described by Pocock, which assumes independently distributed outcomes, equal numbers of participants per arm, no losses to follow-up, and no continuity correction.
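The setup above can be checked numerically. The following is a minimal sketch, not the authors' own calculation: it assumes the formula takes the commonly cited form n = [π₁(1 – π₁) + π₂(1 – π₂)] / (π₁ – π₂)² × (z for α/2 + z for β)², with a 2-sided significance level. The function name and its parameters are hypothetical and chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.90):
    """Participants needed per arm to compare 2 proportions.

    Assumptions (labeled, not taken from the article): 2-sided alpha,
    equal arm sizes, no continuity correction, no losses to follow-up.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = standard_normal.inv_cdf(power)           # about 1.28 for power = 0.90
    multiplier = (z_alpha + z_beta) ** 2              # about 10.5 for these settings
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = variance_sum / (p1 - p2) ** 2 * multiplier
    return ceil(n)  # round up to whole participants


# Control arm 38% failures, self-etching arm 23% failures
n_per_arm = sample_size_two_proportions(0.38, 0.23)
print(n_per_arm)       # participants per arm
print(2 * n_per_arm)   # total participants
```

Under these assumptions, the sketch gives 193 participants per arm. Halving the difference to be detected (for example, 38% vs 30.5%) roughly quadruples the requirement, which illustrates why choosing too small a difference quickly makes a trial impractical.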