Although the writer has the right to choose the title and content of his letter, it seems as if the purpose of the letter is not to allow the authors of this article to answer his questions. Rather, it exhibits a sharp judgment, implying the invalidation of our work, which doesn’t seem fair to us.
The study groups were matched according to bone age, which was determined by the method of Greulich and Pyle. This was clearly defined in the fourth paragraph of the “Material and methods” section. In addition, the need to match the groups according to bone age rather than chronologic age was mentioned briefly in the “Discussion” (paragraph 3, p. 183) with supporting references.
The mean and standard deviation values for bone age and chronologic age of each subgroup were given in Table II. The finding of no statistically significant difference was clearly stated in the first paragraph of the “Results” section (p. 182), and pair-wise comparisons were used to verify it. Because the P values for these pair-wise comparisons were all >0.05, and the differences between groups were therefore not statistically significant, they were not given in the table.
We partially agree with this criticism. The assumption of normality is especially critical when constructing reference intervals for variables and should be taken seriously to draw accurate and reliable conclusions. With large enough sample sizes (>40), however, violation of the normality assumption should not cause major problems in parametric tests. Even so, it is important to choose the statistical tests with care. We therefore preferred a nonparametric test (the Mann-Whitney U test) after having rejected the normality assumption for our data. The t test for 2 independent samples yielded the same conclusions as the Mann-Whitney U test for our data. With this evidence, we decided to use the Mann-Whitney U test so as to report the more reliable results in our article.
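For readers who wish to see how such a rank-based comparison works, the following is a minimal sketch of a two-sided Mann-Whitney U test in Python. It is an illustration only and is not the software or code used in our study. It assumes the normal approximation with a tie correction and no continuity correction, which is reasonable only for moderately large samples. The function name `mann_whitney_u` and the sample values are hypothetical.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test for 2 independent samples.

    Uses the normal approximation with a tie correction and no
    continuity correction (an assumption of this sketch).
    Returns (U, z, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda pair: pair[0])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j+1.
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        # All pooled values are identical: no evidence of a difference.
        return min(u1, u2), 0.0, 1.0

    z = (u1 - mean_u) / math.sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return min(u1, u2), z, p


# Hypothetical measurements for an affected group and a control group.
affected = [1, 2, 3, 4, 5]
control = [6, 7, 8, 9, 10]
u_stat, z_score, p_value = mann_whitney_u(affected, control)
print(u_stat, round(z_score, 3), round(p_value, 4))
```

Because the test uses only the ranks of the pooled observations, it does not rely on the data being sampled from a Gaussian population, which is the reason we preferred it once normality had been rejected.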
We tested the data both in the whole sample and in each subgroup with the Kolmogorov-Smirnov and Shapiro-Wilk tests, and we found that the data were not sampled from a Gaussian population. So we do not agree that assessing the sample’s normality is incorrect.
The criticism about using an ANOVA test design might be accurate but, as is well known, once an overall difference has been found, pair-wise comparisons should still be made to understand the reason for that difference. In that sense, the claim that the pair-wise comparisons are inaccurate might not be true; rather, it is a matter of choice.
In our article, “matched” means “constructing 2 groups whose data would be comparable with each other.” As is well known in the orthodontic literature, to see the differences in an affected group (here, a group with a chronic disease), an unaffected control group is needed. The data of these groups are independent of each other, and the presence of 1 group cannot alter the data of the other. So we believe that the use of independent paired comparisons was correct.
We applied the Bonferroni correction to adjust for the several independent statistical tests performed simultaneously on our data set, and we did the same for the Fisher exact tests. We applied the correction to reduce the chances of obtaining false-positive results (type I errors).
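To make the adjustment concrete, a minimal sketch of the Bonferroni correction in Python is given here. It is an illustration only: the function name `bonferroni`, the significance level of 0.05, and the P values are hypothetical and are not taken from our data. Each raw P value is multiplied by the number of tests performed and capped at 1, and a result is called significant only if the adjusted value does not exceed the chosen level.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment for a family of simultaneous tests.

    Returns the adjusted P values (raw value times the number of
    tests, capped at 1) and a list of booleans marking which tests
    remain significant at the given alpha.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, significant


# Hypothetical raw P values from 4 simultaneous comparisons.
raw_p = [0.004, 0.020, 0.030, 0.300]
adjusted_p, significant = bonferroni(raw_p)
print(adjusted_p)
print(significant)
```

In this hypothetical family of 4 tests, only the first comparison remains significant after adjustment, although 3 of the raw P values are below 0.05. This shows how the correction guards against type I errors.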
The reader might be right in the sense that the word “interaction” looks like a technical statistical term. However, it was used only to describe the effects of disease, bone age, and sex considered at the same time.