In this study, we compared the pretreatment conditions, treatment characteristics, and orthodontic outcomes of 3 groups of subjects selected for the American Board of Orthodontics (ABO) phase III clinical examination. One group was selected retrospectively by graduating residents just before their graduation. The 2 prospective groups were treated at separate institutions. The students at 1 institution were not aware that these patients would be potential ABO cases (prospective, blinded), but the students at the second institution were aware that these subjects would serve as their pool of potential patients for the ABO examination (prospective, unblinded). In addition to comparing the 3 groups, all cases were categorized as passing or failing based on their total objective grading system (ABO-OGS) score to assess the ABO-OGS criteria that were the most challenging to meet.
Chart histories and orthodontic dental casts (pretreatment and posttreatment) were collected for 133 subjects. Information regarding demographics, initial malocclusion type, treatment modality, treatment duration, appointment frequency, and missed appointments was collected from chart histories. Pretreatment dental casts were evaluated by using the discrepancy index; the index of complexity, outcome, and need; and the peer assessment rating. Posttreatment dental casts were evaluated with the peer assessment rating and the ABO-OGS.
The only significant pretreatment characteristic with predictive power for a favorable orthodontic outcome was Angle Class I malocclusion (odds ratio of 3.1 for passing the ABO-OGS compared with Class II subjects). The prospective unblinded group received more extraction and headgear therapy than did the other groups. The retrospective group had significantly lower total ABO-OGS posttreatment scores and a higher passing rate compared with the prospective groups.
Angle Class I malocclusions appear to have some advantage for achieving passing ABO-OGS scores, as does the retrospective selection of cases. Successful board certification appears difficult to accomplish based on a prospective model for orthodontic graduate residents. New graduate candidates might be at a disadvantage compared with traditional candidates because they often cannot take advantage of the posttreatment settling phase. Alignment, marginal ridges, and occlusal contacts appear to be where most points are deducted in the evaluation of ABO-OGS certification cases.
Since its inception in 1929, the American Board of Orthodontics (ABO) has striven to certify as many practicing orthodontists as possible and elevate the standards of the practice of orthodontics. When the percentages of board-certified orthodontists were 13% to 17% in the late 1970s, the board began efforts to increase the numbers of board-certified orthodontists. In 2001, the ABO began actively pursuing the idea and the feasibility of certifying graduating orthodontic residents in the resident clinical outcomes study, or the pilot study (PS). This led to a 4-year collaborative project between the ABO and 16 American orthodontic graduate programs accredited by the Commission on Dental Accreditation. This project investigated whether orthodontic residents could provide start-to-finish treatment for 6 patients with ABO-quality results.
At its inception, this PS was designed so that cases would be prospectively identified at the time of patient assignment, and residents, faculty, and patients at the participating programs would be aware of this PS designation. Orthodontic program directors at the participating institutions were asked to prospectively designate 12 patients for each incoming 2002 resident, 6 of whom would be presented to the board after treatment. Participating orthodontic residents would treat the patients, from banding to debanding, and be eligible to present the cases to the ABO to earn a 10-year time-limited certificate. Sixteen graduate orthodontic programs in the United States agreed to participate in the PS. Only 1 orthodontic program agreed to participate under the requirement that all persons involved in the treatment would be blinded to the designation of the prospectively selected PS patients. This method was chosen to prevent any differential treatment of the PS subjects. The subjects at the other 15 programs were treated in an unblinded fashion, so that residents, faculty, and patients knew they were participating in the PS.
During the PS, the study protocol was altered. The participant was allowed to present 6 cases for certification that included only 1 from the previously selected 12 that were prospectively designated. An incentive of a 15-year time-limited certificate was offered to residents presenting all prospectively selected PS cases, and 12% of PS examinees successfully earned this 15-year certificate.
The PS concluded in February 2006, when 50 participating orthodontic residents attended the ABO Clinical Examination. Forty-five candidates successfully obtained ABO certification, for a pass rate of 90%. This compared with 33 traditional candidates passing the examination at a rate of 85%. There was a mean difference of 2.38 ABO-OGS points for passing cases between the resident and the traditional examinees in the PS. The board concluded that the cases presented had sufficient complexity, with an average discrepancy index (DI) score of 16.96 for the student cases, compared with an average DI score of 21.84 for the regular examinees. The PS participants presented 422 cases, of which 58% were from the original prospectively selected PS group.
The board believed that this result positively affirmed that residents could treat to ABO standards during their orthodontic graduate programs. As a result, the board has now instituted new certification guidelines for recent graduates. The new certification process has provided the impetus for graduate programs to self-evaluate their patient populations and the quality of orthodontic treatment provided in each residency program.
In this study, we aimed to determine, with the aid of the DI; the peer assessment rating (PAR); the index of complexity, outcome, and need (ICON); and the ABO objective grading system (ABO-OGS), whether there were significant differences in pretreatment conditions, treatment characteristics, and orthodontic treatment outcomes between ABO cases selected by using the 3 methods. We also determined whether any pretreatment characteristics had predictive value for the quality of the orthodontic treatment outcome. Additionally, we categorically compared the ABO points deducted for failing vs passing cases, to determine which intraoral locations were the most difficult for the entire study sample.
Material and methods
All study procedures were approved by the institutional review board at the University of Washington. The sample consisted of complete chart histories and dental casts (pretreatment and posttreatment) of all subjects. The sample comprised 3 groups. Groups 1 (retrospectively selected) and 2 (prospective blinded) were collected from the retention archives of a graduate orthodontic program participating in the PS. Group 3 (prospective unblinded) was collected from another graduate orthodontic program participating in the PS.
No exclusion criteria were defined to prevent patients from participating in the study, as was the case with the initial PS guidelines. Patients were included irrespective of age, sex, race, or orthodontic problem if they were nonsyndromic and comprehensive orthodontic treatment was planned. The ABO stipulated that “cases should be representative of a cross section of clinical problems and of adequate difficulty to represent the resident’s ability to diagnose and treat orthodontic patients” in the original PS guidelines. When records were collected for this study, subjects were excluded if they were still in active orthodontic treatment, had incomplete records, had transferred treatment outside the assigned graduate clinic, or had never started treatment after the prospective PS designation.
According to these criteria, 49 records were initially collected for group 1, which included subjects who were retrospectively selected by the graduating residents (classes of 2003-2006) to be used in a simulated ABO examination. Two subjects were excluded from this group because of incomplete records, leaving 47 subjects in group 1. Group 2, the prospective blinded group, was prospectively selected to be part of the PS, and their study participation was concealed from all persons involved in the orthodontic treatment. Group 2 initially contained 57 subjects, but 16 were excluded based on the exclusion criteria, leaving 41 subjects. Group 3 records were gathered from another institution that treated patients in a prospective unblinded fashion. All persons involved (faculty, residents, patients) were aware of the PS designation. Group 3 started with 50 subject records, but 5 were excluded, leaving 45 prospective unblinded subjects. For the reasons outlined above, 23 subjects were excluded from the entire subject sample. All subject materials were deidentified and labeled with an identification number to facilitate investigator blinding.
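The exclusion arithmetic in this paragraph can be checked with a short sketch. The dictionary layout and variable names are illustrative only; the counts are those reported in the text.

```python
# Records collected and excluded per group, as reported in the text.
collected = {"group 1": 49, "group 2": 57, "group 3": 50}
excluded = {"group 1": 2, "group 2": 16, "group 3": 5}

# Subjects retained in each group after applying the exclusion criteria.
retained = {g: collected[g] - excluded[g] for g in collected}

total_retained = sum(retained.values())
total_excluded = sum(excluded.values())

print(retained)
print(total_retained, total_excluded)
```

The totals agree with the 133 subjects analyzed and the 23 subjects excluded from the entire sample.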
Subject records consisted of chart histories and pretreatment and posttreatment dental casts. Chart histories were reviewed to gather information about demographics, initial malocclusion, treatment type, treatment duration, frequency of appointments, and missed appointments during active orthodontic treatment. In a few cases, phase 1 treatment had been provided before the PS. For these patients, only information from the comprehensive phase of treatment was collected. This protocol is consistent with the ABO’s evaluation of 2-phase patients in traditional board examinations.
Pretreatment dental casts were scored by using the DI, PAR, and ICON by 2 calibrated, independent examiners. Posttreatment dental casts were scored by 2 examiners using the PAR, ICON, and ABO-OGS. The radiographic component of the ABO-OGS index was excluded, because many patients had no posttreatment panoramic radiographs. Additionally, numerous studies have questioned the usefulness of panoramic radiography to assess root parallelism because of inherent image distortion, especially in premolar extraction sites. The ABO-OGS scores were adjusted based on average PS radiographic deductions to account for this exclusion. The pretreatment and posttreatment casts from each site were combined, deidentified, assigned identification numbers, and measured in random order. Two investigators measured all dental casts independently, and the mean score was used unless significant differences were noted in the scores (weighted PAR, 5 points; weighted ICON, 9 points; ABO-OGS, 4 points). When differences were greater than these values, the dental casts were rescored by consensus, and the consensus score was used. Twenty-one (15.8%) cases had to be rescored by consensus.
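The rule for combining the 2 examiners' scores might be sketched as follows. This is a minimal illustration, not the investigators' procedure in code form: the function name `reconcile` and its `consensus` argument are hypothetical, and a difference exactly equal to the threshold is assumed to be averaged, because the text calls for consensus only when the difference is greater than the stated value.

```python
# Disagreement thresholds (in index points) reported in the text.
THRESHOLDS = {"PAR": 5, "ICON": 9, "ABO-OGS": 4}


def reconcile(index, score_a, score_b, consensus=None):
    """Return the score to analyze for one cast on one index.

    The mean of the two examiners' scores is used unless they differ by
    more than the threshold, in which case a consensus score is required.
    """
    if abs(score_a - score_b) <= THRESHOLDS[index]:
        return (score_a + score_b) / 2
    if consensus is None:
        raise ValueError("difference exceeds threshold; consensus rescoring needed")
    return consensus


# Hypothetical scores (not study data).
print(reconcile("PAR", 10, 14))                    # within 5 points: averaged
print(reconcile("ABO-OGS", 20, 26, consensus=24))  # more than 4 apart: consensus

# Share of the 133 cases that needed consensus rescoring.
print(round(21 / 133 * 100, 1))
```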
To determine intraexaminer error, 10 casts were rescored later by each examiner. Intraexaminer error was evaluated by using the intraclass correlation coefficient for all examiners involved in the study (Table I).
|Table I. Intraclass correlation coefficient, by examiner|
|Examiner 1 (BHS)||Examiner 2 (CJ)||Examiner 3 (SH)|
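The text does not state which form of the intraclass correlation coefficient was used. As one possible illustration, the sketch below computes the one-way random-effects form, often written ICC(1,1), for casts scored twice by the same examiner. The function name and the scores are hypothetical and are not study data.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of rows, one per cast, each holding k repeated scores.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-cast and within-cast mean squares.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical first and repeat scores for 10 casts (not study data).
scores = [(12, 13), (20, 19), (8, 8), (30, 28), (15, 16),
          (22, 22), (5, 6), (18, 17), (25, 26), (10, 10)]
print(round(icc_oneway(scores), 3))
```

Values near 1 indicate that repeat scores on the same cast differ little relative to the spread between casts.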
Pretreatment conditions and treatment characteristics were assessed and compared both qualitatively and quantitatively. These scores were compared with posttreatment conditions as assessed by the PAR, ICON, and ABO-OGS to determine treatment changes and the quality of orthodontic treatment between the groups.
Descriptive statistics (means, standard deviations, and ranges) were calculated for pretreatment DI and ICON, pretreatment and posttreatment PAR, and posttreatment ABO-OGS scores. Descriptive statistics were also calculated for patient demographics, initial malocclusion (type and severity), treatment type, treatment duration, and number of orthodontic appointments. Analysis of variance (ANOVA) was used to test for differences in continuous variables between the groups. Pairwise between-group comparisons were carried out when ANOVA indicated differences between the groups. In these post-hoc analyses, the Bonferroni adjustment to the significance level was used to prevent inflation of the type 1 error rate caused by multiple comparisons. When the sample was divided into passing and failing cases, the average scores were compared with t tests.
Logistic regression was used to determine whether any pretreatment variables could be used as reliable predictors of successful board-quality treatment. A stepwise model-building algorithm was used to identify a subset of available covariates that was highly predictive of successful board-quality treatment.
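To make the link between logistic regression and an odds ratio concrete, the sketch below computes an odds ratio from a 2 x 2 table of passing and failing cases. The counts are hypothetical and chosen only for illustration; they are not the study's data, and the function name is an assumption.

```python
import math


def odds_ratio(pass_a, fail_a, pass_b, fail_b):
    """Odds of passing in group A divided by the odds of passing in group B."""
    return (pass_a / fail_a) / (pass_b / fail_b)


# Hypothetical counts (not study data).
class1_pass, class1_fail = 30, 10
class2_pass, class2_fail = 20, 20

or_value = odds_ratio(class1_pass, class1_fail, class2_pass, class2_fail)

# In a logistic model with a single binary predictor, the fitted
# coefficient equals the natural log of this odds ratio.
beta = math.log(or_value)

print(or_value, round(beta, 3))
```

An odds ratio above 1 means the odds of a passing score are higher in the first group than in the second.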
The statistician was blinded to treatment group identification until the analyses were completed. For all analyses, the level of significance was set at P < 0.05, or at P < 0.017 when the Bonferroni adjustment was applied.
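The adjusted threshold follows from dividing the overall significance level by the number of pairwise comparisons among the 3 groups. A minimal sketch, with illustrative variable names:

```python
from itertools import combinations

alpha = 0.05
groups = ["group 1", "group 2", "group 3"]

# Every pairwise comparison among the 3 groups.
pairs = list(combinations(groups, 2))

# Bonferroni: divide the family-wise alpha by the number of comparisons.
adjusted_alpha = alpha / len(pairs)

print(len(pairs), round(adjusted_alpha, 3))
```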
Results
The 3 treatment groups were similar with respect to demographics and pretreatment characteristics (Table II). All groups had similar sex ratios, with more females than males. There were similar percentages of white patients in the groups. Group 3 had no Asian or Hispanic subjects and more black subjects (26.7%) compared with groups 1 and 2. There were more Class I subjects (53.2%) and fewer Class II subjects (38.3%) in group 1 (retrospectively selected group) compared with the other groups.
|Characteristic||Retrospective||Prospective blinded||Prospective unblinded||Total|
|White||35 (74.5%)||33 (80.5%)||32 (71.1%)||100 (75.2%)|
|Asian||8 (17.0%)||2 (4.9%)||0 (0.0%)||10 (7.5%)|
|Hispanic||1 (2.1%)||4 (9.8%)||0 (0.0%)||5 (3.8%)|
|Black||1 (2.1%)||1 (2.4%)||12 (26.7%)||14 (10.5%)|
|Other||2 (4.3%)||1 (2.4%)||1 (2.2%)||4 (3.0%)|
|Male||21 (44.7%)||17 (41.5%)||18 (40.0%)||56 (42.1%)|
|Female||26 (55.3%)||24 (58.5%)||27 (60.0%)||77 (57.9%)|
|Class I||25 (53.2%)||17 (41.5%)||16 (35.6%)||58 (43.6%)|
|Class II||18 (38.3%)||23 (56.1%)||21 (46.7%)||62 (46.6%)|
|Class III||4 (8.5%)||1 (2.4%)||8 (17.8%)||13 (9.8%)|
|Anterior crossbite||15 (32.0%)||5 (12.2%)||10 (22.2%)||30 (22.6%)|
|Posterior crossbite||6 (12.8%)||9 (22.0%)||4 (8.9%)||19 (14.3%)|
|Deepbite||6 (12.8%)||7 (17.1%)||8 (17.8%)||21 (15.8%)|
|Missing teeth||2 (4.3%)||5 (12.2%)||5 (11.1%)||12 (9.0%)|
|Impactions||2 (4.3%)||2 (4.9%)||4 (8.9%)||8 (6.0%)|
|Extractions||20 (42.6%)||18 (43.9%)||35 (77.8%)||73 (54.9%)|
|Headgear||13 (27.7%)||12 (29.3%)||18 (40.0%)||43 (32.3%)|
|Orthognathic surgery||4 (8.5%)||4 (9.8%)||0 (0.0%)||8 (6.0%)|
There were no significant differences between the groups regarding subject age at initial records, start of treatment, or end of treatment (Table III). The group 3 subjects had a younger average pretreatment age, but this was most likely because several older adults were included in groups 1 and 2. When age medians and ranges were examined, all 3 groups were similar at initial records. Average length of treatment differed significantly between the groups (P = 0.004). With the Bonferroni adjustment for multiple comparisons, group 2 had a significantly longer average treatment (31.3 months) than both group 1 (25.0 months, P = 0.005) and group 3 (25.1 months, P = 0.005). Likewise, there was a significant difference in the number of appointments (P = 0.04). However, after adjustment for multiple comparisons, only the difference between group 2 (28.6 appointments) and group 3 (24.4 appointments, P = 0.017) reached statistical significance. There was no significant difference in the number of missed appointments among the 3 groups. Group 3 had more subjects receiving extraction (77.8%) and headgear (40%) therapy during orthodontic treatment than groups 1 and 2 (Table II). Although it was difficult to quantify, some attending faculty in group 3 were known to use treatment mechanics that included second-order tip-back bends. Attending faculty in groups 1 and 2 did not use this type of treatment.
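|Variable||Retrospective: mean||median||range||SD||n||Prospective blinded: mean||median||range||SD||n||Prospective unblinded: mean||median||range||SD||n||P value|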
|Patient ages (y)|
|Age at initial records||18.2||13.8||8.7-66.5||12.7||47||19.0||13.8||10.9-54.7||10.6||41||16.0||13.3||9.2-43.8||7.9||45||0.386|
|Age at start of treatment||19.2||14.5||10.1-67.0||12.4||47||19.7||14.5||11.1-55.5||10.9||41||16.2||13.5||10.4-43.9||7.9||45||0.256|
|Age at end of treatment||21.3||16.4||13.2-68.6||12.3||47||22.3||17.1||13.7-57.0||10.7||41||18.3||15.7||12.0-44.7||8.1||45||0.175|
|Treatment length (months)||25.0||23.0||13.0-51.0||9.1||47||31.3||28.0||15.0-64.0||11.2||41||25.1||25.0||8.0-58.0||8.9||45||0.004 ∗|
|Number of appointments||25.2||23.0||14.0-48.0||8.0||47||28.6||27.5||16.0-59.0||8.1||41||24.4||24.0||8.0-55.0||7.9||45||0.040 ∗|
|Number of missed appointments||1.4||1.0||0-7.0||2.0||47||2.9||1.0||0-15.0||4.4||41||2.6||1.0||0-10.0||3.2||45||0.061|
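As a rough consistency check on the treatment-length row, a one-way ANOVA F statistic can be rebuilt from group summary statistics alone. The sketch below assumes that the fourth value reported for each group is the standard deviation. It is an illustration and not the study's analysis, which was presumably run on the individual patient data.

```python
# (mean, SD, n) for treatment length in months, taken from the table.
# Treating the fourth value in each group as the SD is an assumption.
groups = [(25.0, 9.1, 47), (31.3, 11.2, 41), (25.1, 8.9, 45)]

k = len(groups)
n_total = sum(n for _, _, n in groups)
grand_mean = sum(mean * n for mean, _, n in groups) / n_total

# Between-group and within-group sums of squares.
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups)
ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_stat = ms_between / ms_within

print(round(f_stat, 2), (k - 1, n_total - k))
```

Under these assumptions, the F statistic comes out at about 5.9 on 2 and 130 degrees of freedom, which is in line with the reported P value of 0.004.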
There was no significant difference between the 3 groups for any pretreatment cast analysis (DI, ICON, or PAR; Table IV). There was, however, a statistically significant difference between the groups for the posttreatment analyses (PAR and ABO-OGS). Group 1 (retrospective group) had significantly lower posttreatment PAR scores than did both prospectively selected groups (group 2, P = 0.001; group 3, P = 0.001). Likewise, for ABO-OGS scores, only retrospective group 1 (16.2) differed significantly from groups 2 (23.1, P = 0.001) and 3 (28.4, P = 0.001).
|Measure||Retrospective: mean||range||SD||n||Prospective blinded: mean||range||SD||n||Prospective unblinded: mean||range||SD||n||P value|
|Posttreatment PAR||2.1||0.0-8.0||1.9||47||5.1||1.0-23.0||4.2||41||4.8||1.0-19.5||4.0||45||0.001 ∗|