Authors' response

We were pleased that our study, “Adverse effects of lingual and buccal orthodontic techniques: A systematic review and meta-analysis” (Ata-Ali F, Ata-Ali J, Ferrer-Molina M, Cobo T, De Carlos F, Cobo J. Am J Orthod Dentofacial Orthop 2016;149:820-9), has been of interest to you. We wish to thank you for your comments and for allowing us to clarify some of the issues raised.

We agree with Lombardo et al that the results of their study could appear complicated and somewhat contradictory. Furthermore, we agree that for obvious reasons, a systematic review cannot detail all aspects of a single study.

With regard to the second part of the letter, we wish to point out that some aspects of the study need further consideration or correction. First, the sample size per group was small, which has implications for both type I and type II errors. The authors decided to use a long series of classical nonparametric tests, but this was not a strictly correct strategy, because such tests are not robust when used to analyze data exhibiting heteroscedasticity. Tables 3 and 4 show some standard deviations to be very different between the compared groups. The most extreme case corresponds to the salivary flow rate at T2, with SD = 0.46 for the lingual group and SD = 1.22 for the labial group. This suggests different levels of variability between the two groups.
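As a rough illustration of the size of this discrepancy, the short Python sketch below computes the ratio of the two reported standard deviations and the corresponding variance ratio. Only the two SD values come from the study. The variable names are ours, and the calculation is a descriptive check, not a formal test of homogeneity of variances.

```python
# Standard deviations of salivary flow rate at T2, as reported in the study.
sd_lingual = 0.46
sd_labial = 1.22

# Ratio of the larger SD to the smaller one.
sd_ratio = sd_labial / sd_lingual

# Variances are squared SDs, so the variance ratio is the squared SD ratio.
variance_ratio = sd_ratio ** 2

print(round(sd_ratio, 2), round(variance_ratio, 2))
```

The labial group's SD is about 2.65 times that of the lingual group, which corresponds to a variance roughly 7 times larger.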

In addition, classical nonparametric tests are only suitable in the context of a simple, 1-way approach, and not for factorial designs involving interactions. The design proposed by the authors is clearly factorial, involving a between-subjects factor (technique) and a within-subjects factor (time). A Brunner-Langer model for longitudinal data therefore would be the most suitable statistical method, providing 3 P values: for the technique effect, the time effect, and their interaction. This last effect is crucial to the investigation: Does the plaque index (or gingival bleeding or saliva flow) evolve with the same pattern in both techniques over time? Instead of this approach, the authors used a large number of separate statistical tests, which inflates rather than controls the type I error. (Was this taken into account? Was a Bonferroni correction or a similar adjustment applied?)
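To make the concern about multiple testing concrete, the following Python sketch shows how the probability of at least one false-positive result grows with the number of tests, and what per-test threshold a Bonferroni correction would impose. The numbers of tests used here (10 and 20) are hypothetical and are not taken from the study. The formula also assumes independent tests, which is a simplification for repeated measurements on the same patients.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


# Hypothetical numbers of tests, chosen only for illustration.
for n_tests in (10, 20):
    fwer = familywise_error_rate(n_tests)
    threshold = bonferroni_threshold(n_tests)
    print(n_tests, round(fwer, 4), threshold)
```

Under these assumptions, 10 uncorrected tests at the 0.05 level already carry a familywise error rate of about 0.40, and 20 tests about 0.64.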

Because the outcomes were measured on different scales (continuous, ordinal), the Brunner-Langer approach would be ideal, since it is valid for each of them.

Last, it is more powerful than the tests used in their study. It is very difficult to establish significant differences when comparing 10 patients with 10 patients using the Fisher exact test.
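The limited power of the Fisher exact test with 10 patients per group can be illustrated with a minimal Python sketch. The function below is our own implementation of one common definition of the two-sided P value (the sum of the probabilities of all tables that are no more likely than the observed one, given fixed margins). The event counts are hypothetical and do not come from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def weight(k):
        # Unnormalized hypergeometric weight of a table with k events in row 1.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Add up all tables that are no more likely than the observed one.
    # Integer weights are compared so that ties are handled exactly.
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    return extreme / comb(total, col1)


# Hypothetical outcomes: events / non-events in two groups of 10 patients.
p_7_vs_3 = fisher_exact_two_sided(7, 3, 3, 7)  # 70% vs 30%
p_8_vs_2 = fisher_exact_two_sided(8, 2, 2, 8)  # 80% vs 20%
print(round(p_7_vs_3, 3), round(p_8_vs_2, 3))
```

In this hypothetical example, a difference of 70% versus 30% yields P of about 0.18, and only a difference as large as 80% versus 20% reaches P of about 0.02.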

