Comparison of vacuum-formed and Hawley retainers: A systematic review


Hawley retainers (HRs) and vacuum-formed retainers (VFRs) are the 2 most commonly used retainers in orthodontics. However, the basis for selection of an appropriate retainer is still a matter of debate among orthodontists. In this systematic review, we evaluated the differences between VFRs and HRs.


Electronic databases (PubMed, EMBASE, Cochrane Library, ISI Web of Science, LILACS, and Pro-Quest) were searched with no language restriction. The relevant orthodontic journals and reference lists were checked for all eligible studies. Two article reviewers independently screened the retrieved studies, extracted the data, and evaluated the quality of the primary studies.


A total of 89 articles were retrieved in the initial search. However, only 7 articles met the inclusion criteria. Some evidence suggested no difference between HRs and VFRs with respect to changes in intercanine and intermolar widths after orthodontic retention. In terms of occlusal contacts, cost effectiveness, patient satisfaction, and survival time, there was insufficient evidence to support the use of VFRs over HRs.


Additional high-quality, randomized, controlled trials concerning these retainers are necessary to determine which retainer is better for orthodontic procedures.

Retention is a critical phase of orthodontic treatment. After active orthodontic tooth movement, the teeth might be in an inherently unstable position and have a tendency to return to their pretreatment positions. Currently, the influences of the periodontal and gingival tissues, unstable positions of teeth, and continued skeletal growth are considered to be the major causes of relapse after removal of fixed appliances. To address this problem, retainers are used to prevent the teeth from returning to their former positions until gingival and periodontal reorganization and skeletal growth are essentially completed.

Although many types of retainers are available, the Hawley retainer (HR) and the vacuum-formed retainer (VFR) are the 2 most commonly used clinical retainers. The HR was designed by Charles Hawley in 1919, has been used for nearly a century, and has become the most popular removable retention appliance. The alternative removable retainer is an invisible retainer that was designed in 1971 and has been referred to by the following names: VFR, clear overlay retainer, and Essix retainer. In this review, for simplicity, we refer to any invisible retainer as a VFR, regardless of the name used in the original report. The most compelling potential advantages attributed to this invisible retainer are not only its durability and esthetic qualities, but also its small size and cleanability. Consequently, the use of VFRs has increased exponentially in recent years. In the United Kingdom, VFRs have become the most commonly used retainers in National Health Service, hospital, and private practices. However, there is little clinical evidence to support the use of VFRs over conventional HRs.

Several published studies have attempted to compare VFRs with HRs. Rowland et al conducted a prospective, randomized clinical trial and showed that VFRs were more effective than HRs in retaining the correction of the maxillary and mandibular labial segments. Demir et al also found that VFRs were more efficient in retaining the anterior mandibular teeth during a 1-year retention period. However, a recent retrospective, randomized, double-blind comparison study reported no statistically or clinically significant differences in the effectiveness of HRs and VFRs in maintaining specific arch-form features after orthodontic treatment. Other studies have compared these 2 appliances in terms of their cost-effectiveness, patient satisfaction, survival time, and occlusal contacts during retention. However, the pertinent results were inconclusive, and some were unreliable; such studies could bias a clinician’s understanding and mislead clinical practice. Thus, we conducted a critical systematic review to evaluate and compare the significant effects of VFRs and conventional HRs. This systematic review might provide clinical evidence to help an orthodontist decide which retainer is appropriate for a particular patient.

Material and methods

Randomized or quasi-randomized, controlled clinical trials were included in this review.

Patients who had maxillary retainers, mandibular retainers, or both were included. There was no restriction regarding the type of active orthodontic treatment. The patients had to be followed for at least 6 months after completing their orthodontic treatment. However, patients with severe craniofacial deformities, cleft lip or palate, and poor periodontal status were excluded.

For this review, VFRs or HRs were selected as the final retainers for the patients after active orthodontic treatment. Additionally, the retainers had to cover the teeth, at least from first molar to first molar.

The primary outcomes included Little’s index of irregularity, intercanine width, intermolar width, and arch length related to the effectiveness of HRs and VFRs.

Secondary outcomes, including cost-effectiveness, patient satisfaction, survival time, and occlusal contacts for these 2 appliances, were extracted and collected.

Adverse effects on the periodontal health of the teeth, such as gingival and periodontal diseases, were also evaluated.

The following electronic databases were searched with no language restriction: Cochrane Central Register of Controlled Trials (CENTRAL; issue 1 of 12, January 2013), MEDLINE via PubMed (1960 to February 2013), EMBASE (1980 to February 2013), ISI Web of Science (1986 to February 2013), and LILACS (February 22, 2013). The search strategies are shown in Appendix I.

In addition, the Pro-Quest Dissertation and Thesis database and Pro-Quest Science Journals were searched, with no limits set for the publication date.

A manual search was performed of these journals: American Journal of Orthodontics and Dentofacial Orthopedics, Angle Orthodontist, European Journal of Orthodontics, and Journal of Orthodontics (all from 1980 to 2012).

In addition, the conference proceedings and abstracts from the British Orthodontic Conference and the European Orthodontic Conference were searched. The reference lists of potential clinical trials were checked to identify any additional studies, and an additional search to update the results was undertaken in July 2013.

Two review authors (H.M. and Y.J.) independently screened the studies identified by the search strategies for relevance to this systematic review. Then the eligible studies were used independently for data extraction. Any disagreement between the 2 reviewers was resolved by discussion with another review author (J.H.) on the team.

Data extraction was also performed independently by 2 reviewers (H.M. and Y.J.), and disagreements were resolved by discussion with a third reviewer (J.H.). Data from the included studies were entered on a customized data collection form for details, including study design, study participants’ characteristics, course of interventions, and outcome measures.

In addition, if any ambiguity or lack of data was discovered in the articles, we attempted to contact the authors by mail to obtain more information.

Two reviewers (C.H. and M.L.) assessed the risk of bias in each included study independently. Disagreements were resolved by discussion with a third review author (J.H.), so that a consensus could be reached. This assessment followed the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions (version 5.1.0). Six specific domains were assessed: sample size calculation, random sequence generation, allocation concealment, blinding of measurement assessment, reporting of withdrawals, and the use of an intention-to-treat analysis. The overall risk of bias in each study was assessed using the following judgments: low, moderate, and high. Studies were categorized according to the following.

  1. Low risk of bias (plausible bias unlikely to seriously alter the results), if 5 or more domains were considered adequate.

  2. Moderate risk of bias (plausible bias that raises some doubt about the results), if 3 or 4 domains were recorded with “yes.”

  3. High risk of bias (plausible bias that seriously weakens confidence in the results), if the study recorded “yes” in fewer than 3 domains.
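
The counting rule above can be sketched in a few lines of Python. This is only an illustration of the thresholds as stated in the text, not the reviewers' actual procedure; the function name, the domain keys, and the example judgments are hypothetical, and the sketch assumes that only an explicit "yes" counts as an adequate domain.

```python
from typing import Mapping

# The 6 domains named in the text (key spellings are an assumption).
DOMAINS = (
    "sample_size_calculation",
    "random_sequence_generation",
    "allocation_concealment",
    "blinding_of_measurement",
    "withdrawals_reported",
    "intention_to_treat",
)


def classify_risk_of_bias(judgments: Mapping[str, str]) -> str:
    """Return 'low', 'moderate', or 'high' from per-domain judgments.

    Assumption: only an explicit "yes" counts as adequate; "no",
    "unclear", or a missing domain does not.
    """
    adequate = sum(
        1
        for domain in DOMAINS
        if judgments.get(domain, "unclear").strip().lower() == "yes"
    )
    if adequate >= 5:
        return "low"
    if adequate >= 3:
        return "moderate"
    return "high"


# Hypothetical study with 3 adequate domains.
example = {
    "sample_size_calculation": "Yes",
    "random_sequence_generation": "Unclear",
    "allocation_concealment": "Unclear",
    "blinding_of_measurement": "Yes",
    "withdrawals_reported": "Yes",
    "intention_to_treat": "Unclear",
}
print(classify_risk_of_bias(example))
```

A judgment reached by discussion between reviewers can differ from a mechanical count, so a function like this is best treated as a consistency check rather than a replacement for the assessment.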

Statistical analysis

Clinical heterogeneity was assessed by examining the participant types, interventions, and outcomes of each study. Ideally, a meta-analysis would have been performed if studies with similar comparisons reported comparable outcome measures. Risk ratio values would have been calculated along with 95% confidence intervals (CIs) for dichotomous data. Mean differences and 95% CIs would have been used for continuous data. Before each meta-analysis, chi-square and I-square (I²) tests for homogeneity would have been undertaken. Heterogeneity would have been considered to exist with a P < 0.10 or an I² value greater than 50%. The random-effects model would have been used to incorporate heterogeneity among studies; otherwise, the fixed-effects model would have been used. When necessary, a sensitivity analysis would have been conducted to test the robustness of the findings for the meta-analysis.
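
For readers unfamiliar with these quantities, the following Python sketch shows one common way to compute them. It is not the review's analysis (no meta-analysis was ultimately performed): the function names and all input numbers are hypothetical, the risk ratio CI uses the standard log-scale normal approximation, and the heterogeneity statistics use inverse-variance weights. The chi-square P value is left as an optional input because computing it needs a chi-square distribution function that the standard library does not provide.

```python
import math


def risk_ratio_ci(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of group A vs group B with an approximate 95% CI.

    Uses the usual standard error of log(RR); assumes no zero cells.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def heterogeneity(effects, variances):
    """Inverse-variance pooled effect, Cochran's Q, its df, and I² (%).

    `effects` can be mean differences or log risk ratios, one per study.
    """
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, df, i2


def choose_model(i2, p_value=None):
    """Apply the thresholds from the text: P < 0.10 or I² > 50%."""
    if i2 > 50 or (p_value is not None and p_value < 0.10):
        return "random-effects"
    return "fixed-effect"


# Hypothetical dichotomous outcome: 10/100 events vs 20/100 events.
rr, lower, upper = risk_ratio_ci(10, 100, 20, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")

# Hypothetical mean differences (mm) and their variances from 3 studies.
pooled, q, df, i2 = heterogeneity([0.10, 0.25, 0.60], [0.02, 0.03, 0.02])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%, model: {choose_model(i2)}")
```

Note that the pooled value returned here is the fixed-effect estimate; a random-effects estimate would additionally require a between-study variance term, which this sketch does not compute.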


During the initial search, 89 articles were deemed potentially relevant to the review; 58 were rejected, including duplicates. Then we assessed the titles and abstracts of 33 articles, of which 19 were excluded. The primary reasons for rejection are shown in the PRISMA flowchart (Appendix II). The full texts of 14 articles were assessed extensively, and 7 studies were finally excluded. Five studies were cross-sectional studies that lacked a comparator, and 2 articles were systematic reviews (Table I). Therefore, only 7 articles met the inclusion criteria, and their characteristics are summarized in Table II.

Table I
Characteristics of excluded studies
Study | Reason for exclusion
5 studies | These were cross-sectional studies
2 studies | These were systematic reviews
1 study | This study had high clinical heterogeneity with respect to the intervention groups

Table II
Summary of studies included in the systematic review
Study | Methods | Participants | Interventions | Outcomes
Rowland et al, 2007 | RCT; subjects observed for 6 mo after debond | 397 patients: male, 156; female, 241; mean ages, 15 y (1.5) for VFR and 14.8 y (1.8) for HR | VFR: 201 patients, 24 h/d for the first week, then 12 h/d; HR: 196 patients, 24 h/d for 3 mo, then 12 h/d for another 3 mo | Changes in intercanine and intermolar widths; Little’s index of incisor irregularity for both groups
Barlin et al, 2011 | RCT; observed for 2, 6, and 12 mo after debond | 82 patients: male, 29; female, 53; mean age for all subjects, 14.9 y (2.7) | VFR: 40; HR: 42; all subjects wearing retainers 24 h/d for 12 mo except for cleaning | Changes in intercanine and intermolar widths; Little’s index of incisor irregularity and arch lengths for both groups
Demir et al, 2012 | CCT; retention time was 1 y | 42 patients: male, 12; female, 30; mean ages, 13.8 y (3.1) for VFR and 12.9 y (2.5) for HR | VFR: 22; HR: 20; all subjects wearing retainers 24 h/d for 12 mo except during meals | Irregularity index, intercanine width, and lengths of both arches
Hichens et al, 2007 | RCT; subjects observed for 6 mo after debond | 397 patients: male, 156; female, 241; mean ages, 15 y (1.5) for VFR and 14.8 y (1.8) for HR | VFR: 201 patients, 24 h/d for the first week, then 12 h/d; HR: 196 patients, 24 h/d for 3 mo, then 12 h/d for another 3 mo | Cost-effectiveness and patient satisfaction between the retainer groups
Sun et al, 2011 | RCT; retention time was 12 mo | 120 participants: male, 60; female, 60; mean age of all subjects, 14.7 y | VFR: 59; HR: 61; wearing retainers full time, except during meals | Survival times of the 2 types of retainers
Sauget et al, 1997 | CCT; retention time was 3 mo | 30 participants: male, 11; female, 19; mean ages, 19.6 y for VFR and 18.8 y for HR | VFR: 13, wearing retainers full time for first 3 days (except during meals) and nightly thereafter; HR: 17 (except 2 had maxillary HR only), wearing retainers full time except during meals | Occlusal contacts of the maxillary and mandibular teeth with the 2 types of retainers
Xu et al, 2011 | RCT; retention time was 12 mo | 45 participants: male, 16; female, 29; mean ages, 13.6 y for VFR and 15.2 y for HR | VFR: 25, wearing full time, except during meals and brushing; HR: 20, wearing nightly, but combined with mandibular fixed lingual retainers | Changes in overbite, overjet, intercanine and intermolar widths, Little’s index, and calculus index scores for both groups in the mandible
RCT, Randomized controlled trial; CCT, controlled clinical trial.

All 7 included studies were parallel group studies. Five studies were randomized, controlled trials (RCTs), and 2 were controlled clinical trials. The changes in intercanine and intermolar widths and Little’s index of incisor irregularity were evaluated in 4 studies. However, 1 study compared VFRs and HRs combined with mandibular fixed lingual retainers; the measurements were made only in the mandible. In view of the high heterogeneity in the intervention groups, this study was excluded (Table I). Therefore, only 3 studies with high homogeneity were found among the selected articles and used for the data aggregation and analysis.

Two of the included articles were published by the same research group. Although these articles had the same sample of subjects, the authors used different analyses and obtained different outcomes. The main outcomes of the study of Hichens et al were the cost-effectiveness and patient satisfaction between the retainer groups.

Detailed assessments of the bias risk from the included studies are shown in Table III. After we assessed the quality of these studies according to the assessment criteria, we deemed the overall bias risk of 3 studies to be low; another 3 trials were considered to have a moderate bias risk; and the remaining study was classified as having a high bias risk.

Table III
Assessment of risk of bias
Study | Design | Sample size calculation | Random sequence generation | Allocation concealment | Blinding of measurement | Withdrawals reported | ITT | Risk of bias
Rowland et al, 2007 | RCT | Yes | Yes | Yes | Yes | Yes | Yes | Low
Barlin et al, 2011 | RCT | No | Yes | No | Unclear | Yes | No | Moderate
Demir et al, 2012 | CCT | Yes | Unclear | Unclear | Yes | Yes | Unclear | Moderate
Hichens et al, 2007 | RCT | Yes | Yes | Yes | No | Yes | Yes | Low
Sun et al, 2011 | RCT | Yes | Yes | Yes | Yes | Yes | Yes | Low
Sauget et al, 1997 | CCT | No | No | No | Yes | | | Moderate
Xu et al, 2011 | RCT | No | Unclear | Unclear | Unclear | Yes | No | High
Yes, Low risk of bias; no, high risk of bias; unclear, unclear or unknown risk of bias.
ITT, Intention to treat; CCT, controlled clinical trial.

Of all the included studies, the authors of 4 trials conducted an a priori sample size calculation. In these studies, the generation of a random sequence was considered adequate because they used a computer-generated random sequence, a blocked randomization method, or a throw of dice. However, only 1 study used adequate allocation concealment, in which both patients and researchers were blinded to randomization during the participants’ enrollment. Although we contacted the authors of 2 studies with regard to this issue, we remained unclear about the allocation concealment in these studies.

In this review, only the blinding of outcome assessment was evaluated. Because the retainers were visible to both patients and orthodontists, blinding of participants and personnel was not feasible in the trials. Four articles were judged to have a low risk of bias for outcome blinding. Although no blinding was used in 1 study, the outcomes and outcome measurements were not likely to have been influenced by the lack of blinding. In 6 studies, the number of dropouts and the reasons for withdrawal were clearly described, but an intention-to-treat analysis was conducted in only 3 trials. There were no withdrawals in 1 study.

The stability of the orthodontic patients’ teeth with HRs and VFRs after active orthodontic tooth movement was reported in 4 studies. The details of the outcomes and conclusions are described in Table IV. Three studies reported no significant differences in the ability of HRs and VFRs to retain the dentition in terms of intercanine and intermolar widths. No significant differences in arch lengths were reported in 2 studies.

Table IV
Effectiveness of HRs and VFRs
Study Arch Intercanine width Intermolar width Arch length Little’s index Conclusions
Rowland et al, 2007 Maxillary

Effective ( P = 0.013)
Effective ( P <0.01)
VFRs were more effective than HRs at holding corrections of the maxillary and mandibular labial segments
Demir et al, 2012 Maxillary
No statistically significant difference was found in the effectiveness between these 2 retainers
Barlin et al, 2011 Maxillary
There was no statistically or clinically significant difference in the measured arch width, arch length, or modified Little’s index over a 12-mo period
