Thank you for your interest in our work. Our study was not a “trial,” as the letter suggested, but rather an observational prospective longitudinal study; an a priori determination of sample power was therefore not required. Nonetheless, it is good practice to determine the sample size in advance if there is a standardized assessment measure of “oral impact.”

To date, no standardized assessment measures of the oral impacts of orthodontic appliances exist to guide sample size calculation (based on means and standard deviations, or on the prevalence of the impact). In this study, an ad hoc approach to oral impact assessment was used, considering 13 factors (common oral impacts associated with orthodontic treatment). Significant differences were observed in 9 of the 13 oral impacts between those with labial and those with lingual appliances. The fact that significant differences were observed across most oral impacts provides evidence that the sample had adequate statistical power; the issue of sample power becomes important when there is no significant difference. Contrary to what the letter implied, a sample size calculation cannot in itself determine whether there are significant differences.

In our study, differences in the oral impacts experienced were assessed via area-under-curve analyses. If one were to estimate sample size based on the findings of our study, where the mean oral impact among those with lingual appliances was 37.4 (SD, 9.6), and hypothesize a 20% difference in oral impact among those with labial appliances, then a sample size of 60 (30 per group) would have 85.4% power; ie, about 85% of studies with a sample of 60 (30 per group) would be expected to yield a significant difference, rejecting the null hypothesis that the populations have equal oral impact. It is recommended that, when sample size calculations are conducted, a power of at least 80% be set.
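For readers who wish to check the arithmetic behind this estimate, the following is a minimal sketch only. It assumes a two-sided comparison of two independent means at a significance level of 0.05, a common SD of 9.6 in both groups, and a normal approximation; none of these choices is stated above, and the function and parameter names are hypothetical. It is not necessarily the procedure used to obtain the 85.4% figure.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(mean_ref, sd, rel_diff, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Assumptions (not taken from the letter): normal approximation,
    equal group sizes, and the same SD in both groups.
    """
    z = NormalDist()
    delta = rel_diff * mean_ref            # hypothesized difference in means
    se = sd * sqrt(2.0 / n_per_group)      # standard error of the difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    shift = delta / se                     # standardized difference
    # Probability of falling in either rejection region under the alternative.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Values quoted in the text: mean 37.4, SD 9.6, 20% difference, 30 per group.
power = two_sample_power(mean_ref=37.4, sd=9.6, rel_diff=0.20, n_per_group=30)
print(f"Approximate power: {power:.1%}")
```

Under these assumptions the approximation gives a power of roughly 85%, in line with the figure quoted above. A calculation based on the noncentral t distribution would typically give a slightly lower value.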

Although we too support prestudy sample size calculation, without a standardized assessment measure and evidence from previous studies, all that can be determined is a “guesstimate” rather than a sample power estimate. We hope that our clarifications will be useful to those involved in similar studies in the future.

