The aim of this systematic review was to evaluate the treatment effects on maxillary growth of removable functional appliances that advance the mandible to a more forward position in patients with Class II malocclusion.
Sixteen electronic databases and reference lists of studies were searched up to April 2015. Only randomized clinical trials and prospective controlled clinical trials investigating Class II growing patients treated with removable functional appliances were included. Two authors independently accomplished study selection, data extraction, and risk of bias assessment. All pooled analyses of data were based on random-effects models. Statistical heterogeneity was evaluated.
In total, 14 studies were included (5 randomized clinical trials, 9 prospective controlled clinical trials) that collected data from 765 patients (405 treated, 360 untreated controls). The mean differences in treatment effect of functional appliances, relative to the untreated controls, were −0.61° per year (95% CI, −0.69° to −0.25°) for SNA angle, −0.61 mm per year (95% CI, −0.90 to −0.32 mm) for anterior maxillary displacement, and +0.07° per year (95% CI, −0.17° to +0.32°) for maxillary plane rotation.
Removable functional appliances in Class II growing patients have a slight inhibitory effect on the sagittal growth of the maxilla in the short term, but they do not seem to affect rotation of the maxillary plane.
Patients treated with functional appliances exhibit a slight maxillary inhibition.
Removable functional appliances do not affect the maxillary plane orientation.
Randomized controlled trials are needed on this topic to improve the quality of evidence.
Class II malocclusion is the most prevalent sagittal skeletal discrepancy. Different skeletal characteristics can contribute to the formation of a Class II malocclusion: mandibular skeletal retrusion, sagittal maxillary hyperplasia, or a posterior position of the glenoid fossa.
Removable functional appliances (RFAs) are used in growing children to correct a Class II malocclusion by advancing the mandible forward, thus stimulating sagittal mandibular growth. Some randomized clinical trials (RCTs) and 3 meta-analyses including prospective studies showed that functional appliances can increase mandibular growth in a statistically significant manner. Nevertheless, this increment seems insufficient to determine a clinically significant effect on the skeletal Class II resolution.
However, the effects of functional appliances are not limited to the mandible. In many cases, the clinical trials performed to evaluate the efficacy of functional appliances reported, as a secondary outcome, the effect of these appliances on the maxilla. A number of controlled clinical trials have shown that functional appliances significantly restrict anterior maxillary growth (−1.5 mm); other trials showed a clinically nonsignificant inhibition (−0.16 mm), whereas some trials showed sagittal growth stimulation (+0.25 mm). It is important to elucidate the effects of functional appliances on sagittal maxillary growth because an inhibition effect on the maxilla could positively contribute to the clinical correction of the Class II discrepancy.
So far, various meta-analyses have been performed to evaluate the effect of functional appliances in Class II patients. However, no meta-analysis has specifically assessed the effects of RFAs on sagittal maxillary growth.
The aim of this systematic review and meta-analysis was to evaluate, with an evidence-based approach, the current bibliographic data from published RCTs and prospective controlled clinical trials (pCCTs) investigating, in the short term, the treatment effects on maxillary growth of RFAs that advance the mandible to a more forward position in patients with Class II malocclusion.
Material and methods
This systematic review and meta-analysis was conducted according to the guidelines of the Cochrane Handbook for Systematic Reviews of Interventions (version 5.1.0) and is reported according to the PRISMA statement. A survey of articles published up to April 2015 about the effects of functional appliances for the treatment of Class II malocclusion was performed by means of several electronic databases (Supplementary Table I). Gray literature in electronic databases for conference abstracts, dissertations, and unpublished literature was also searched; in addition, we inquired about the current status of conference proceedings. An appropriately adjusted search strategy for each database was performed (Supplementary Table I). The reference lists of the articles eligible for inclusion were also manually reviewed. Systematic reviews and meta-analyses on this subject were also identified and their reference lists scanned for additional trials.
The search strategy was not restricted by language, publication year, or status. Translations into English were arranged for 3 articles (from Turkish, Chinese, and German). Any specific data not provided in the article were obtained from the authors.
The review protocol of this meta-analysis was registered in the National Institute for Health Research database (www.crd.york.ac.uk/PROSPERO; protocol CRD42013006135).
Selection of studies
The eligible studies included RCTs and pCCTs with the following characteristics reported below according to the PICO format: related human clinical trials, studies of growing patients with Class II malocclusion with no craniofacial deformity (population), orthodontic treatment conducted with a Class II RFA (intervention), a comparable untreated control group (control group), results analyzed by lateral cephalometric analyses before and immediately after treatment (outcomes measured), analyzed treatment effects that were not confounded by additional and concomitant procedures (orthognathic surgery, extractions, fixed appliances, and so on), and lateral cephalometric maxillary measurements.
Duplicate reports were excluded, and articles reporting interim outcomes or updates were considered only once. Articles were excluded if they did not meet the inclusion criteria, did not relate to this topic, or were related but had a different aim. Abstracts, laboratory studies, descriptive studies, individual case reports, series of cases, reviews, studies of adults, retrospective longitudinal studies, and meta-analyses were also excluded. Furthermore, RCTs and pCCTs that included patients with the following characteristics were excluded: previous or concomitant treatment for Class II malocclusion, congenital syndromes, periodontal diseases, orofacial inflammatory conditions, and tooth agenesis.
Two review authors (A.L.G. and L.R.) screened all titles and abstracts retrieved from the database searches. They reviewed the full texts of the potentially relevant articles and abstracts. The eligibility of the trials was assessed independently, and any disagreement was resolved after consulting another author (R.N.). The level of agreement between the 2 reviewers was assessed by Cohen kappa statistics.
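Interreviewer agreement of this kind can be illustrated with a minimal sketch of the Cohen kappa calculation. The screening decisions below are hypothetical and are not data from this review; the function name `cohen_kappa` is likewise only illustrative.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical title/abstract screening decisions (not data from this review)
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects the raw agreement proportion for the agreement expected by chance, so it is lower than the simple percentage of matching decisions whenever one category dominates.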
Two authors (A.L.G. and L.R.) independently extracted study characteristics (study design, type of appliance, sample size, age, sex, setting, appliance features, observation period, time of daily appliance wear, evaluated cephalometric parameters, and follow-up) and outcomes from the selected studies using predefined piloted data extraction forms. Any disagreements were resolved by discussion with another author (R.N.). Cohen kappa statistics were used to assess the agreement between the 2 authors. For the evaluation of maxillary sagittal dimensions, the SNA angle was registered along with all linear cephalometric parameters that evaluated the horizontal displacement of the anterior maxilla compared with a vertical reference line of the cranial base (A-point to OLp [line perpendicular to the occlusal line through sella point], A-point to N perpendicular, Ss [subspinale, the deepest point on the anterior contour of the maxillary alveolar projection determined by a tangent perpendicular to occlusal line] to OLp, ANS, and A-point horizontal displacement).
For the evaluation of maxillary plane rotation, the following angular cephalometric measurements were registered as outcomes: Frankfort horizontal/maxillary plane, sella-nasion/maxillary plane, and nasion-sella line/NL (nasal line).
Assessment of risk of bias
Two authors (G.M. and R.N.) independently performed a qualitative evaluation to assess the risk of bias of the included RCTs using the Cochrane Collaboration’s tool for assessing risk of bias with the software Review Manager (version 5.2; Nordic Cochrane Centre, Cochrane Collaboration, Copenhagen, Denmark, 2012) as guided by the Cochrane Handbook for Systematic Reviews of Interventions. Any disagreement on the risk of bias assessment was resolved after consulting another author (G.C.). The level of agreement between the 2 investigators was assessed with Cohen kappa statistics. For each article, the following domains were examined: (1) sequence generation; (2) allocation concealment; (3) blinding of participants, personnel, and outcome assessors; (4) incomplete outcome data; (5) selective outcome reporting; and (6) other sources of bias. The risk of bias for each domain was judged as low, high, or unclear risk. Each RCT was assigned an overall risk of bias rating: low risk (low for all key domains), high risk (high for ≥1 key domain), or unclear risk (unclear for ≥1 key domain).
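The overall rating rule can be sketched as a small function. Two points are assumptions of this sketch and are not stated in the text: all 6 domains are treated as key domains, and a high-risk judgement takes precedence over an unclear one when both occur in the same trial.

```python
def overall_risk(domain_judgements):
    """Combine per-domain judgements ('low', 'high', 'unclear') into one rating.

    Assumption: every domain passed in is a key domain, and 'high' outranks
    'unclear' when both are present.
    """
    judgements = [j.strip().lower() for j in domain_judgements]
    unknown = set(judgements) - {"low", "high", "unclear"}
    if unknown:
        raise ValueError(f"unexpected judgement(s): {sorted(unknown)}")
    if "high" in judgements:
        return "high"
    if "unclear" in judgements:
        return "unclear"
    return "low"


# Hypothetical judgements for the 6 domains of one trial
example = ["unclear", "unclear", "unclear", "high", "unclear", "unclear"]
print(overall_risk(example))
```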
The risk of bias of the included nonrandomized pCCTs was independently assessed by 2 authors (G.M. and R.N.) using the Downs and Black scale, as suggested by the Cochrane Handbook for Systematic Reviews of Interventions. The Downs and Black scale consists of 27 questions evaluating (1) reporting, 10 questions; (2) external validity, 3 questions; (3) internal validity or bias, 7 questions; (4) internal validity or confounding or selection bias, 6 questions; and (5) power, 1 question. According to this scale, answers were scored from 0 to 1 point, except for 1 item in the reporting domain (question number 5) that was scored from 0 to 2 points. In the original method of the Downs and Black scale, the last question was scored from 0 to 5 points; we simplified the assessment of this question by scoring this answer at 0 or 5 points, giving 5 points for a preliminary power analysis calculation. Consequently, the total maximum score that a pCCT could receive was 32 points, with a higher score indicating higher methodologic quality. Any disagreement on the risk of bias assessment was resolved after consulting another author (G.C.). The level of agreement between the 2 review authors was assessed with Cohen kappa statistics.
Data were considered suitable for pooling if the retrieved studies reported equivalent lateral cephalometric measurements. The data extracted from each trial were preliminarily annualized to minimize heterogeneity related to the observation period variability. Data were analyzed with the Review Manager software. For each evaluated continuous outcome, the mean differences and their corresponding 95% confidence intervals (95% CIs) were used to summarize and combine the data. A random-effects model was applied as the primary method to estimate all pooled data, since it was considered more appropriate in view of the differences in the samples and settings.
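The two steps described above, annualizing each trial's data and pooling with a random-effects model, can be sketched as follows. Several points are assumptions rather than the authors' documented procedure: the annualization is taken to be a linear rescaling of a mean change to 12 months, the between-study variance is estimated with the DerSimonian-Laird method, and all function names and numbers are hypothetical.

```python
import math


def annualize(change, months):
    """Rescale a mean change observed over `months` to 12 months (assumes linear change)."""
    return change * 12.0 / months


def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Mean difference (treated minus control) and its variance."""
    md = mean_t - mean_c
    var = sd_t ** 2 / n_t + sd_c ** 2 / n_c
    return md, var


def pool_random_effects(effects, variances):
    """Pooled estimate, 95% CI, and tau^2 with the DerSimonian-Laird estimator (assumed)."""
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical trial: -0.9 degrees of change over 18 months becomes -0.6 per year
yearly = annualize(-0.9, 18)

# Hypothetical annualized mean differences and their variances for 3 trials
effects = [-0.8, -0.3, -0.6]
variances = [0.04, 0.02, 0.09]
pooled, ci, tau2 = pool_random_effects(effects, variances)
print(f"pooled = {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, tau^2 = {tau2:.3f}")
```

When the estimated between-study variance is 0, the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.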
Assessment of heterogeneity
Clinical heterogeneity was evaluated by examining the types of participants and the interventions for the outcome in each included study. For all analyses, heterogeneity was assessed with the I² index, which is an indicator of true heterogeneity in percentages. A value of 0% indicates no observed heterogeneity, and greater values show increasing heterogeneity, with 25% indicating low, 50% moderate, and 75% high heterogeneity.
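As an illustration, the index can be computed from Cochran's Q and its degrees of freedom. This is a minimal sketch with hypothetical inputs; treating the 25%, 50%, and 75% benchmarks as lower bounds of labelled bands is an assumption made only for this example.

```python
def i_squared(effects, variances):
    """I-squared (%) from Cochran's Q with inverse-variance weights."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching inputs for at least two studies")
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    # Share of the total variation attributable to between-study differences
    return max(0.0, (q - df) / q) * 100.0


def describe(i2):
    """Label an I-squared value; the band edges are an assumption of this sketch."""
    if i2 >= 75:
        return "high"
    if i2 >= 50:
        return "moderate"
    if i2 >= 25:
        return "low"
    return "none or negligible"


# Hypothetical effects and variances: Q = 2 with 1 degree of freedom
value = i_squared([0.0, 1.0], [0.25, 0.25])
print(f"I-squared = {value:.0f}% ({describe(value)})")
```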
Assessment of publication bias
Publication bias was preliminarily evaluated by visual inspection of funnel plot asymmetry. Publication bias was also statistically investigated with specific tests: those of Begg and Mazumdar (rank correlation method) and Egger et al (weighted regression). The rank correlation method evaluates the relationship between the standardized treatment effect and the variance of the treatment effect using the Kendall tau. The weighted regression method treats the standardized treatment effect as the criterion and the precision of effect size estimation (inverse of its standard error) as the predictor in a regression model, estimated with observations weighted by the inverse of their variances.
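The two tests can be sketched in a few lines, following the descriptions above. This is an illustration under stated assumptions, not the software used in the review: the rank correlation uses Kendall tau-a without a tie correction, the effects are standardized as deviations from the inverse-variance pooled estimate, P values are omitted, and all inputs are hypothetical.

```python
import math


def egger_regression(effects, std_errors):
    """Intercept and slope from regressing effect/SE on 1/SE, weighted by 1/variance.

    An intercept far from 0 suggests funnel plot asymmetry. Requires at least
    two studies with different standard errors.
    """
    x = [1.0 / se for se in std_errors]                  # precision (predictor)
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect (criterion)
    w = [1.0 / se ** 2 for se in std_errors]             # inverse-variance weights
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def kendall_tau(a, b):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs; ties count as neither."""
    n = len(a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def begg_tau(effects, std_errors):
    """Rank correlation between standardized effects and their variances."""
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Deviation from the pooled estimate, scaled by the SD of that deviation
    standardized = [(y - fixed) / math.sqrt(v - 1.0 / sw)
                    for y, v in zip(effects, variances)]
    return kendall_tau(standardized, variances)


# Hypothetical data in which less precise studies report larger effects
effects = [0.1, 0.5, 1.0]
std_errors = [0.1, 0.2, 0.4]
intercept, slope = egger_regression(effects, std_errors)
tau = begg_tau(effects, std_errors)
print(f"Egger intercept = {intercept:.2f}, Begg tau = {tau:.2f}")
```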
Post hoc subgroup analyses were performed to evaluate the source of heterogeneity. In particular, monoblock vs dual-block designs of functional appliances and early vs late treatments were compared. The cutoff used to differentiate early and late treatment was the mean age of 11 years at the beginning of treatment. The significance level of subgroup analyses was set at P <0.1.
To evaluate the influence of the study quality on the overall estimates of the intervention’s effect, a sensitivity analysis was carried out, comparing the anterior maxillary displacement outcomes between RCTs and pCCTs.
Supplementary Table I presents the performed electronic searches, providing the following information for each search: electronic database, date of search, search strategy, and number of retrieved items. Among the 3122 initially identified articles, 2516 remained after the removal of duplicates. A total of 2325 articles were excluded on the basis of title and abstract; of the remaining 191 articles, 177 were excluded after evaluation of their full texts. Therefore, 14 studies were selected for qualitative and quantitative final synthesis. Figure 1 shows the flow diagram for the selection of studies, and the excluded articles with the reason for exclusion are shown in Supplementary Table II.
The characteristics of the 14 prospective trials included in the meta-analysis are summarized in Table I. All selected clinical trials evaluated treatment with RFAs in growing patients with a Class II malocclusion; most took place in a university setting or an educational institution. Five prospective studies were designed as RCTs, and 9 were pCCTs. The total number of treated patients was 405, whereas the overall control sample consisted of 360 untreated subjects. All studies included both male and female participants, except for one study that included only girls. In one study, the authors did not specify the numbers of male and female participants. The ages of the patients varied across the studies; 3 trials had samples with a mean age of 8 to 9 years, and 3 trials included older patients with a mean age of 12 to 13 years. The majority of the trials had a sample with an age range between 10 and 11 years.
|Study||Study design||Type of appliance||Sample size||Mean age ± SD (y)||Sex||Setting||Observation period (mo)||Time of daily appliance wear (h/d)||Cephalometric parameters||Follow-up|
|Courtney et al (1996)||RCT||Frankel-2, Harvold||Frankel-2, 13; Harvold, 12; controls, 17||Overall sample, 12.63 ± 0.98||Frankel-2: 7 M, 6 F; Harvold: 7 M, 5 F; controls: 11 M, 6 F||New Zealand, Dunedin University of Otago||Treated and controls, 18.00||Not reported||SNA (°), A-point/horizontal (mm), S-N/ANS-PNS (°)||No|
|Jakobsson (1967)||RCT||Activator||Treated, 20; controls, 20||Treated and controls, 8.5||Treated and controls: 33 M, 27 F||Sweden, Solna Karolinska Institutet||Treated and controls, 18.00||13-14||A-point/RF (mm), ANS/RF (mm), PNS/RF (mm)||No|
|Martina et al (2013)||RCT||Sander bite jumping||Treated, 23; controls, 23||Treated, 10.9 ± 1.3; controls, 10.5 ± 1.2||Treated: 15 M, 8 F; controls: 11 M, 12 F||Italy, Naples Federico II University||Treated, 18.0; controls, 12.0||14||Ss point/OLp (mm)||No|
|O’Brien et al (2003)||RCT||Twin-block||Treated, 89; controls, 85||Treated, 9.7 ± 0.98; controls, 9.8 ± 0.94||Treated: 48 M, 41 F; controls: 46 M, 39 F||United Kingdom, multicenter study||Treated and controls, 15.00||Full time (24)||A-point/OLp (mm)||Yes|
|Tulloch et al (1997)||RCT||Bionator||Treated, 53; controls, 61||Treated, 9.4 ± 1.0; controls, 9.4 ± 1.2||Treated: 30 M, 23 F; controls: 35 M, 26 F||United States, University of North Carolina||Treated and controls, 15.00||Not reported||SNA (°), A/N perp (mm)||Yes|
|Baysal and Uysal (2014)||pCCT||Twin-block||Treated, 20; controls, 20||Treated, 13.0 ± 1.32; controls, 12.17 ± 1.47||Treated: 10 M, 10 F; controls: 9 M, 11 F||Turkey, Erciyes University||Treated, 16.2 ± 7.5; controls, 15.5 ± 3.1||Full time (24)||SNA (°), A-point/OLp (mm)||No|
|Bilgic et al (2015)||pCCT||Activator||Treated, 20; controls, 20||Treated, 12.7 ± 1.5; controls, 13.8 ± 1.4||Treated: 11 M, 9 F; controls: 11 M, 9 F||Turkey, Diyarbakir Dicle University||Treated, 6.0; controls, 6.0||Full time (24)||SNA (°), A/RP (mm), S-N/PP (°)||No|
|Illing et al (1998)||pCCT||Bionator, Twin-block||Bionator, 18; Twin-block, 16; controls, 20||Bionator, 11.8 ± 1.5; Twin-block, 11.5 ± 1.5; controls, 11.2 ± 1.7||Bionator: 9 M, 9 F; Twin-block: 6 M, 10 F; controls: 13 M, 7 F||United Kingdom, London Royal Hospital||Treated and controls, 9.0||Full time (24)||SNA (°), Art/ANS (mm), S-N/maxillary plane (°)||No|
|Kumar et al (1996)||pCCT||Bionator||Treated, 16; controls, 8||Treated, 9.77 ± 1.59; controls, 9.64 ± 1.23||Treated and controls: 24 F||India, New Delhi Institute of Medical Sciences||Treated, 9.69 ± 1.14; controls, 9.26 ± 2.81||Full time (24)||SNA (°)||No|
|Lund and Sandler (1998)||pCCT||Twin-block||Treated, 36; controls, 27||Treated, 12.4; controls, 12.1||Treated: 19 M, 17 F; controls: 13 M, 14 F||United Kingdom, Chesterfield Royal Hospital||Treated and controls, 14.45||Not reported||SNA (°), S-N/maxillary plane (°)||No|
|Ozturk and Tankuter (1994)||pCCT||Activator||Treated, 17; controls, 19||Treated, 9.86 ± 0.48; controls, 10.12 ± 0.48||Treated: 8 M, 9 F; controls: 9 M, 10 F||Turkey, Istanbul University||Treated and controls, 17.95||Not reported||SNA (°), NSL/NL (°)||No|
|Quintao et al (2006)||pCCT||Twin-block||Treated, 19; controls, 19||Treated, 9.5 ± 10; controls, 9.9 ± 13||Treated: 12 M, 7 F; controls: 12 M, 7 F||Brazil, Rio de Janeiro State University||Treated and controls, 12.00||Full time (24)||SNA (°)||No|
|Tumer and Gultan (1999)||pCCT||Activator, Twin-block||Activator, 13; Twin-block, 13; controls, 13||Activator, 11.9 ± 1.23; Twin-block, 11.5 ± 0.99; controls, 11.9 ± 1.16||Not reported||Turkey, Ankara University||Treated, 10.00; controls, 7.00||Monoblock, 16; Twin-block, 24||SNA (°), S-N/ANS-PNS (°)||No|
|Uner et al (1989)||pCCT||Activator||Treated, 11; controls, 11||Treated, 10.39 ± 1.03; controls, 10.31 ± 1.50||Treated: 1 M, 10 F; controls: 5 M, 6 F||Turkey, Ankara University||Treated, 9.09 ± 3.39; controls, 9.91 ± 5.07||14||SNA (°), S-N/ANS-PNS (°)||No|
The appliance features were heterogeneous among the selected studies: 6 trials evaluated the effects of the Twin-block, 5 trials evaluated the effects of the activator, and 3 trials evaluated the effects of the bionator. The following appliances were used in only 1 trial: Sander, Fränkel, and Harvold. The times of daily wear of the appliance varied across studies from 12 to 24 hours per day, and the observations varied from 6 to 18 months. No article reported long-term results (ie, at the end of growth). Seven trials evaluated a skeletal maturation index before treatment; only 2 trials presented a follow-up after phase 2 treatment conducted with fixed appliances.
Risk of bias assessment
Only 1 RCT was at high risk of bias (Table II). In 2 RCTs, the risk of bias was unclear, and only 2 RCTs had a low risk of bias.
|Study||Sequence generation||Allocation concealment||Blinding of participants, personnel, and outcomes||Incomplete outcome data||Selective outcome reporting||Other risk of bias||Overall risk of bias|
|Courtney et al (1996)||Unclear||Unclear||Unclear||High risk||Unclear||Unclear||High risk|
|Jakobsson (1967)||Unclear||Unclear||Unclear||Unclear||Low risk||Low risk||Unclear|
|Martina et al (2013)||Low risk||Low risk||Low risk||Low risk||Low risk||Low risk||Low risk|
|O’Brien et al (2003)||Low risk||Low risk||Low risk||Low risk||Low risk||Low risk||Low risk|
|Tulloch et al (1997)||Low risk||Unclear||Unclear||Low risk||Low risk||Low risk||Unclear|
According to the Downs and Black scale, the methodologic quality of the 9 pCCTs was medium to low, with an average score of 16 of 32 (Table III). Interreviewer agreement for study selection, data extraction, and risk of bias assessment was high, with kappa values of 0.979, 0.958, and 0.913, respectively.
|Study||Reporting||External validity||Internal validity (bias)||Internal validity (confounding)||Power||Total|
|Baysal and Uysal (2014)||8 of 11||1 of 3||4 of 7||3 of 6||5 of 5||21 of 32|
|Bilgic et al (2015)||8 of 11||1 of 3||3 of 7||2 of 6||5 of 5||19 of 32|
|Illing et al (1998)||8 of 11||1 of 3||4 of 7||4 of 6||0 of 5||17 of 32|
|Kumar et al (1996)||7 of 11||1 of 3||4 of 7||2 of 6||0 of 5||14 of 32|
|Lund and Sandler (1998)||8 of 11||1 of 3||4 of 7||2 of 6||0 of 5||15 of 32|
|Ozturk and Tankuter (1994)||7 of 11||1 of 3||3 of 7||2 of 6||0 of 5||13 of 32|
|Quintao et al (2006)||8 of 11||1 of 3||4 of 7||2 of 6||0 of 5||15 of 32|
|Tumer and Gultan (1999)||8 of 11||1 of 3||3 of 7||3 of 6||0 of 5||15 of 32|
|Uner et al (1989)||8 of 11||1 of 3||4 of 7||2 of 6||0 of 5||15 of 32|
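The scoring arithmetic behind the table above can be checked with a short sketch. It assumes the columns follow the domain order given in the methods (reporting, external validity, bias, confounding, power); the scores are transcribed from the table rows, and the variable names are illustrative.

```python
# Maximum points per domain under the modified Downs and Black scoring
# described in the methods (assumed column order of the table)
MAX_POINTS = {
    "reporting": 11,          # 10 questions, one of them scored 0 to 2
    "external validity": 3,
    "bias": 7,
    "confounding": 6,
    "power": 5,               # scored 0 or 5 in this review
}

# Domain scores transcribed from the table, in the same order as MAX_POINTS
scores = {
    "Baysal and Uysal": (8, 1, 4, 3, 5),
    "Bilgic et al": (8, 1, 3, 2, 5),
    "Illing et al": (8, 1, 4, 4, 0),
    "Kumar et al": (7, 1, 4, 2, 0),
    "Lund and Sandler": (8, 1, 4, 2, 0),
    "Ozturk and Tankuter": (7, 1, 3, 2, 0),
    "Quintao et al": (8, 1, 4, 2, 0),
    "Tumer and Gultan": (8, 1, 3, 3, 0),
    "Uner et al": (8, 1, 4, 2, 0),
}

max_total = sum(MAX_POINTS.values())
totals = {study: sum(points) for study, points in scores.items()}
mean_total = sum(totals.values()) / len(totals)

# No domain score may exceed its maximum
for study, points in scores.items():
    assert all(p <= m for p, m in zip(points, MAX_POINTS.values())), study

print(f"maximum = {max_total}, mean total = {mean_total:.1f}")
```

Only the 2 trials that reported a preliminary power calculation received the 5 points for the power domain, which accounts for most of the spread in the totals.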
Assessment of publication bias
Visual inspection of the funnel plots (Figs 2 and 3) indicated no publication bias. In addition, the rank correlation test of Begg and Mazumdar indicated no publication bias, showing P values of 0.118 and 0.458 for SNA angle and anterior maxillary displacement, respectively. The regression test of Egger et al showed P values of 0.333 and 0.264 for the same outcomes. However, although the rank correlation test and the regression test showed no significant results, this does not necessarily guarantee the absence of publication bias. Because of the small number of trials in this meta-analysis, it is possible that the power of these tests could be low.
Quantitative data synthesis
The mean difference in treatment effect of functional appliances, relative to the untreated controls for SNA angle, was −0.61° per year (95% CI, −0.69° to −0.25°; P <0.01; I² = 81%). These data were derived from a synthesis of 11 clinical trials with totals of 277 treated patients and 235 controls (Fig 4).