Contribution of malocclusion and female facial attractiveness to smile esthetics evaluated by eye tracking


There is disagreement in the literature concerning the importance of the mouth in overall facial attractiveness. Eye tracking provides an objective method to evaluate what people see. The objective of this study was to determine whether dental and facial attractiveness alters viewers’ visual attention in terms of which area of the face (eyes, nose, mouth, chin, ears, or other) is viewed first, viewed the greatest number of times, and viewed for the greatest total time (duration) using eye tracking.


Seventy-six viewers underwent 1 eye tracking session. Of these, 53 were white (49% female, 51% male). Their ages ranged from 18 to 29 years, with a mean of 19.8 years, and none were dental professionals. After being positioned and calibrated, they were shown 24 unique female composite images, each image shown twice for reliability. These images reflected a repaired unilateral cleft lip or 3 grades of dental attractiveness similar to those of grades 1 (near ideal), 7 (borderline treatment need), and 10 (definite treatment need) as assessed in the aesthetic component of the Index of Orthodontic Treatment Need (AC-IOTN). The images were then embedded in faces of 3 levels of attractiveness: attractive, average, and unattractive. During viewing, data were collected for the first location, frequency, and duration of each viewer’s gaze.


Observer reliability ranged from 0.58 to 0.92 (intraclass correlation coefficients) but was less than 0.07 (interrater) for the chin, which was eliminated from the study. Likewise, reliability for the area of first fixation was poor (kappa less than 0.10 for both intrarater and interrater reliabilities), so the area of first fixation was also removed from the data analysis. Repeated-measures analysis of variance showed a significant effect (P <0.001) for level of attractiveness by malocclusion by area of the face. For both number of fixations and duration of fixations, the eyes overwhelmingly were most salient, with the mouth receiving the second most visual attention. At times, the mouth and the eyes were statistically indistinguishable in the number and duration of viewers’ fixations. As the dental attractiveness decreased, the visual attention increased on the mouth, approaching that of the eyes. AC-IOTN grade 10 gained the most attention, followed by both AC-IOTN grade 7 and the cleft. AC-IOTN grade 1 received the least amount of visual attention. Also, lower dental attractiveness (AC-IOTN 7 and AC-IOTN 10) received more visual attention as facial attractiveness increased.


Eye tracking indicates that dental attractiveness can alter the level of visual attention depending on the female models’ facial attractiveness when viewed by laypersons.


  • Eye tracking provides an objective method to evaluate what people view.

  • Seventy-six viewers underwent eye tracking sessions while viewing facial photographs.

  • Data were collected for the first location, frequency, and duration of the viewers’ gaze.

  • For number and duration of fixations, eyes were the most salient feature and the mouth the second.

  • As dental attractiveness decreased, visual attention on the mouth increased.

There is a lack of agreement in the literature concerning the relative importance of the mouth and teeth in overall facial attractiveness. Some investigators have assigned substantial importance to the attractiveness and social benefits of the mouth and teeth. Some investigators have also acknowledged that background attractiveness is an important factor in facial attractiveness and can override dental features. Other investigators have found that the mouth and teeth make limited and defined contributions to facial esthetics and that background or overall facial attractiveness is more important than any one facial feature. Despite the uncertainty concerning the relative importance of dental appearance in facial attractiveness, most people seek orthodontic treatment primarily to improve esthetics.

Several recent studies have presented evidence that individual smile components affect the smile’s attractiveness or acceptability, and that these results are modified by the nationality, ethnicity, sex, and facial attractiveness of the facial image viewed (model). A shortcoming of these studies is that they directed the raters to focus on the mouth or selected oral features of the model. This may bias the results and does not directly address how viewers spontaneously respond to the mouth when viewing a smiling person. This is of concern because self-reported data from those viewing facial images indicated that the eyes and mouth are the most important, whereas others find that other facial features are viewed first and most frequently.

These issues of dental and oral attractiveness fall within the domain of facial perception. Facial recognition is physiologically and anatomically based. Initial face perception begins in specific areas of the brain. It appears that there are differences in facial perceptions based on sex and on the emotions displayed by the faces in the images.

With this in mind, another approach to facial perception is to determine what viewers actually look at when presented with different facial images. Such an approach provides a direct and objective method for evaluating what facial features are important to a viewer. The premise is that features that attract the viewer’s gaze are informative or have salience in determining what is viewed, including its relative attractiveness.

Eye movements involve a series of quick, jerky movements known as saccades between stops known as fixations. It is possible through the use of computer hardware and software to record eye movement with a pupillary-corneal reflection technique called eye tracking. The brain records information only during the fixations of eye movement. By recording the location and duration of fixations, it is possible to learn what features the person considers most pertinent. If a feature is of great interest to the viewer, his or her eyes will be drawn to this particular feature. Similarly, several studies have shown that regions or objects that are considered informative may not influence the initial fixation but do influence the fixation density (number of fixations) and the total duration of the fixations. Objects or regions with high informativeness receive greater fixation density.

If a viewer is presented with a face free of anomalies and with a neutral expression, he or she will tend to fixate on certain features of the face: eg, eyes, nose, and mouth. Racial and ethnic differences between the observers and the images viewed may also be a factor in the ability to make reliable judgments. A recent eye-tracking study found that there are differences between participants of different races, and that visual cuing could alter their gaze.

Despite the widespread use of eye tracking in other disciplines, there has been surprisingly little use of this technology in dental research. Meyer-Marcotty et al used eye tracking to provide evidence that images of subjects with cleft lip and palate are looked at differently than images of subjects without cleft lip and palate. Hickman et al conducted a study to determine the location, order, and duration of viewers’ visual fixations on facial features using eye tracking. The viewers were shown images of patients after orthodontic treatment. They reported that there was no single facial feature that viewers’ eyes preferentially fixated on in well-balanced faces.

Because eye-tracking data have established a hierarchy of visual attention, the purpose of this study was to determine whether dental and facial attractiveness affects viewers’ visual attention and can alter that hierarchy. In other words, when objectively measured by eye tracking, what do people look at in facial images when the dental and facial appearances change from near ideal to average or normal to recognizably poor?

Material and methods

The institutional review board at Ohio State University approved the study. Preliminary steps were necessary to create the composite images with varying levels of facial and dental attractiveness used in this study. This allowed us to pair truly comparable and reliably rated faces and dentitions rather than searching for naturally occurring combinations and having to accept compromises.

Facial images were obtained by seeking subjects (18-30 years of age) on an availability basis on our campus (n = 207). Two frontal facial portraits (1 with lips together showing no teeth, and 1 with the reliable posed smile) of each person were taken using a digital single-lens reflex camera (D60; Nikon, Tokyo, Japan) mounted on a tripod. Models who were judged by the researchers to have a significant distraction (eg, facial tattoo, extreme hairstyle, extreme facial hair, asymmetry, abnormal piercing) were eliminated.

Thirty-six young adults with no professional dental expertise rated 199 facial images with the lips together showing no teeth. This removed the dental variables and showed each viewer the same emotional context, which has been noted as important. They rated the images as “unattractive,” “average attractiveness,” or “attractive”: 1, 2, or 3, respectively, without further definition. The images were projected (approximately 5-times magnification) in random order on a screen. Forty-one images (20.6%) were repeated in random order to determine reliability. The intrarater kappa statistic was 0.66 (95% confidence interval [CI], 0.58-0.74), and the interrater kappa statistic was 0.51 (95% CI, 0.49-0.52).
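As an illustration of the agreement statistic reported above, the following is a minimal sketch of an unweighted Cohen's kappa for two passes of categorical ratings. The source does not state which kappa variant was used (unweighted, weighted, or a multi-rater form), so the choice here is an assumption, and the function name `cohen_kappa` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two equally long sequences of category labels.

    Assumption: this is only one of several kappa variants; the article does
    not say which one produced the reported values.
    """
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("both rating sequences must be non-empty and equally long")
    n = len(first)
    # Proportion of items given the same category in both passes.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal category frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[k] * counts_second[k] for k in counts_first) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

With ratings coded 1, 2, or 3 as in the text, the two arguments would be the first and repeated ratings of the same images by one rater (intrarater) or the ratings of the same images by two raters (interrater).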

The images were then sorted based on mean attractiveness ratings into groups based on attractiveness. The first group had a mean facial attractiveness less than 1.5 (unattractive); the second group had ratings near 2 (1.8-2.4), denoting average attractiveness; and the third group had ratings equal to or greater than 2.5 (attractive).
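The grouping rule described above can be sketched as a small function. The cutoffs are taken from the text; the function name is hypothetical, and returning `None` for ratings that fall between the stated ranges is an assumption, since the text does not say how such images were handled.

```python
from typing import Optional


def attractiveness_group(mean_rating: float) -> Optional[str]:
    """Map a mean rating on the 1-3 scale to a facial attractiveness group."""
    if mean_rating < 1.5:
        return "unattractive"
    if 1.8 <= mean_rating <= 2.4:
        return "average"
    if mean_rating >= 2.5:
        return "attractive"
    # Assumption: ratings between the stated ranges belong to no group.
    return None
```

Under this reading, the gaps between the ranges keep the three groups clearly separated rather than forcing every image into a category.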

The next step involved collecting images of dentitions. These were obtained by searching the records of our university’s orthodontic clinical archive. These images showed only the teeth and associated intraoral structures.

The aesthetic component (AC) of the Index of Orthodontic Treatment Need (IOTN) provides 10 grades of dental attractiveness, which are divided into 3 subgroups: (1) grades 1 through 4 are considered to have little or no treatment need, (2) grades 5 through 7 have a borderline treatment need, and (3) grades 8 through 10 have a definite treatment need. Each of the dental frontal images was assigned an AC-IOTN grade (1-10) by a researcher (M.R.R.). In accordance with the intention of the index, the esthetic grade, and not the malocclusion or position of the teeth, was matched. All images were then evaluated by experienced orthodontists (15 full-time and part-time university faculty) to confirm the grade of dental attractiveness as defined by the AC-IOTN. Each rater was given printed pages of the frontal intraoral images to rate (Fig 1). IOTN “gold-standard” photos were provided for each grade. The raters marked any image that they thought did not match the standard for that grade. They rated all images twice for reliability purposes. The intrarater reliability kappa statistic was 0.72 (95% CI, 0.64-0.80), and the interrater reliability was 0.56 (95% CI, 0.53-0.59).
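The three AC-IOTN subgroups listed above amount to a simple lookup on the grade. A minimal sketch, with a hypothetical function name and the subgroup labels taken from the text:

```python
def ac_iotn_treatment_need(grade: int) -> str:
    """Return the treatment-need subgroup for an AC-IOTN grade from 1 to 10."""
    if not 1 <= grade <= 10:
        raise ValueError("AC-IOTN grades run from 1 to 10")
    if grade <= 4:
        return "little or no treatment need"
    if grade <= 7:
        return "borderline treatment need"
    return "definite treatment need"
```

Grades 1, 7, and 10, the grades used in this study, fall into the three different subgroups.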

Fig 1
Example of a sheet the raters viewed when rating the malocclusions against the standard for the AC-IOTN grade (at the top of the sheet). Raters placed an “X” through any image that they thought was not similar to the standard.

For this study, images of AC-IOTN grades 1, 7, and 10 were paired with the 3 levels of facial attractiveness. Additionally, the 3 facial attractiveness levels were combined with a repaired unilateral cleft lip. This added a fourth oral condition to the 3 grades of dental attractiveness. This was done to establish the validity of the eye-tracking method for the oral region of the face. Based on the work of Shaw, subjects with a repaired unilateral cleft lip were judged as the most unattractive when compared with other oral characteristics studied such as a missing incisor, crowded incisors, prominent incisors, and normal incisors. This cleft assessment was used to verify that it was possible to draw visual attention to the oral area if the other grades of dental attractiveness failed, since it had previously proven to be a significant distractor.

Images of poorly repaired unilateral cleft lips, as judged by 1 researcher (M.R.R.), were selected from university records and cropped to include only the area between subnasale and the cervical gingiva of the maxillary dentition. No other information was associated with these cropped images.

The images of the different grades of dental attractiveness were combined with the images of different levels of facial attractiveness to form composite images (Fig 2) using image processing software (Adobe Photoshop Elements 7.0; Adobe Systems, San Jose, Calif). All unattractive facial images had an attractiveness rating less than 1.40. The average faces’ attractiveness ratings ranged from 1.90 to 2.11, and the attractive faces had ratings greater than 2.60.

Fig 2
Examples of composite images (malocclusion + facial attractiveness) created for this study. Note that the images were not mosaicked during the study.

The 3 grades of dental attractiveness plus the unilateral cleft lips were placed in each of the 3 levels of facial attractiveness. The faces of the models were not bisected and mirrored to produce bilaterally symmetric faces, based on evidence that this was not necessary. This provided 12 possible combinations of dentitions and facial attractiveness. Two different composite images were created for each combination of facial attractiveness and dental attractiveness, resulting in 24 unique images to broaden the viewing possibilities. For the cleft images, the teeth were AC-IOTN grade 1 (no treatment need) to eliminate the variable of dental attractiveness.
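The stimulus count described above follows from a full crossing of the factors. The sketch below enumerates it; the variable names and labels are illustrative only.

```python
from itertools import product

FACIAL_LEVELS = ("unattractive", "average", "attractive")
# Three AC-IOTN grades plus the repaired unilateral cleft lip.
ORAL_CONDITIONS = ("AC-IOTN 1", "AC-IOTN 7", "AC-IOTN 10", "cleft")
# Two different composite images were made for each combination.
VERSIONS = (1, 2)

# 3 facial levels x 4 oral conditions = 12 combinations.
combinations = list(product(FACIAL_LEVELS, ORAL_CONDITIONS))

# 12 combinations x 2 versions = 24 unique composite images.
composites = [(face, oral, version)
              for face, oral in combinations
              for version in VERSIONS]
```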

Software (Experiment Builder; SR Research, Kanata, Ontario, Canada) was used to construct the program to run the study on the eye tracker. Areas of interest were defined on each face to record where the viewer’s gaze paused for 80 ms or longer (a fixation), in essence creating a map of the face (Fig 3). These areas were forehead, hair, eyebrows, eyes, nose, mouth, cleft (if present), cheeks, chin, and ears. A small gap was left between interest areas to ensure accuracy of a fixation. If a fixation landed on a gap between areas of interest, it was recorded as “other.” This eliminated the potential problems of assigning a fixation to an interest area if it landed on the border of 2 areas of interest, and it also dealt with the 0.25° to 0.50° of viewing angle error of the eye tracker.
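The assignment rule described above can be sketched as follows. This is not the Experiment Builder implementation. It assumes rectangular interest areas with made-up pixel coordinates purely for illustration, and all names (`InterestArea`, `assign_area`, `is_fixation`) are hypothetical. Only the 80-ms threshold and the "other" rule for gaps come from the text.

```python
from dataclasses import dataclass
from typing import Sequence

MIN_FIXATION_MS = 80  # gaze pauses of 80 ms or longer count as fixations


@dataclass(frozen=True)
class InterestArea:
    """Axis-aligned rectangle in screen pixels (an assumed shape)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def is_fixation(duration_ms: float) -> bool:
    """Apply the duration threshold stated in the text."""
    return duration_ms >= MIN_FIXATION_MS


def assign_area(x: float, y: float, areas: Sequence[InterestArea]) -> str:
    """Return the interest area containing the point, or "other" for a gap."""
    for area in areas:
        if area.contains(x, y):
            return area.name
    return "other"


# Hypothetical layout with a 10-pixel gap between the two areas.
example_areas = [
    InterestArea("eyes", 100, 200, 400, 260),
    InterestArea("nose", 200, 270, 300, 380),
]
```

Because the areas do not touch, a point in the gap (for example, y = 265 in this layout) matches no area and is recorded as "other", which is how the gap absorbs borderline fixations and small tracker error.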

Fig 3
Example of an image (not used in study) showing the demarcation of interest areas. Viewers did not see the yellow lines.

Seventy-eight viewers were recruited using the Department of Psychology’s Research Experience Program and by posting flyers on the main campus of the university. Dental professionals (including dental and dental hygiene students) were excluded from the study. As an incentive to encourage participation, the viewers from the Research Experience Program received a partial credit hour, and the flyer-recruited viewers received a monetary incentive. The inclusion criteria consisted of the following: 18-30 years of age, able to understand English, no prior or existing neurologic condition, normal or corrected-to-normal color vision (no hard contact lenses), no recent use of alcohol or other drugs, not currently using any medication that might affect cognitive abilities, and not wearing mascara or willing to remove it.

Deception was used to avoid biasing or “priming” the viewers as they looked at the images. They were told that the title of the study was “Visual exploration of faces recorded from an eye-tracking system” and that its purpose was to “help us understand how individuals view other people.” Upon completion of the study, a debriefing form was provided to explain the reason for the deception.

After obtaining consent, the viewers were positioned in the eye tracker (EyeLink 1000; SR Research) and calibrated (Fig 4). At the beginning of the session, they were instructed to simply look at the images shown on the computer screen. The eye-tracking session began by showing the viewers 5 sample images to check the equipment, and then all 24 images in random order. Each image was displayed for 3 seconds. After each image a blank screen with a randomly placed “X” was displayed for 1 second. After all 24 images had been displayed, viewers were shown all 24 images again in a new random order to determine interrater and intrarater reliabilities. As these images were shown, data were being collected concerning what area of the face they looked at first, what area of the face they viewed the most frequently (fixation density), and what area of the face they viewed for the longest total duration (ie, the sum of the durations of all fixations in that area recorded in milliseconds).
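The three measures named above can be derived from an ordered list of fixations for one image. The sketch below assumes each fixation is already labeled with its area of the face. The input format and function names are hypothetical and do not reflect the eye tracker's actual output.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence, Tuple

Fixation = Tuple[str, float]  # (area of the face, duration in ms), in viewing order


def first_fixated_area(fixations: Sequence[Fixation]) -> Optional[str]:
    """Area of the face that was looked at first, or None if nothing was recorded."""
    return fixations[0][0] if fixations else None


def summarize_fixations(fixations: Sequence[Fixation]) -> Dict[str, Dict[str, float]]:
    """Per-area fixation density (count) and total duration (sum of durations, ms)."""
    summary: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"count": 0, "total_ms": 0.0}
    )
    for area, duration_ms in fixations:
        summary[area]["count"] += 1
        summary[area]["total_ms"] += duration_ms
    return dict(summary)


# Made-up fixations for a single 3-second image presentation.
example = [("eyes", 250.0), ("mouth", 300.0), ("eyes", 400.0)]
```

For the made-up example, the eyes receive 2 fixations totaling 650 ms and the mouth 1 fixation of 300 ms, so the eyes lead on both density and total duration.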

Fig 4
An adult (not a study participant) positioned in a table-mounted, eye-tracking device.

After the eye tracking, the viewers were asked to fill out a short questionnaire asking what facial feature they looked at first and then second and for the longest time when meeting someone for the first time. They were also asked for voluntary demographic data including age, sex, and ethnicity. These variables have been cited as influencing viewer ratings. Only white models were used in an attempt to eliminate the variable of ethnicity because different ethnic groups have varied facial features and smile characteristics. Also, ethnicity can be a factor among the raters when viewing faces of their own race and other races.

Statistical analysis

With a nondirectional alpha risk of 0.05 and assuming a standard deviation of 2.54, a sample size of 50 viewers was required to demonstrate a difference of ±1.5 fixations with a power of 0.962. Again, with a nondirectional alpha risk of 0.05 and assuming a standard deviation of 793 ms, a sample size of 50 would yield a power of 0.979 to detect a difference of ±500 ms in total duration time.

Dependent variables (area of first fixation, area of greatest fixation density, and area of maximum total duration) were analyzed using repeated-measures, factorial analysis of variance (ANOVA). Independent variables were grade of dental attractiveness, model facial attractiveness, area of the face, and rater’s sex. Post hoc testing was done using the Tukey-Kramer procedure.

Results


Seventy-eight viewers entered the study, and 76 finished the study. Two viewers were disqualified because of difficulty in calibrating the eye tracker and not being able to keep it calibrated throughout the eye-tracking session. Only white viewers (n = 53) were used in the analysis to remove the known racial and ethnic variabilities. Of these viewers, 49% (26) were female, and 51% (27) were male. Their ages ranged from 18 to 29 years, with a mean of 19.8 years.

Reliability differed among the dependent variables. For the area of first fixation, the kappa values were 0.08 for intrarater reliability and −0.03 for interrater reliability. For this reason, the area of first fixation was dropped from the study.

For fixation density and duration, intrarater viewer reliability ranged from ICC values of 0.61 to 0.92, and interrater reliability for the areas retained ranged from 0.58 to 0.88. Because of poor interrater reliability for the chin (ICC <0.10), it was eliminated from the study (Table I).

Table I
Reliability for total duration of fixations and fixation density by area of the face

| Area | Intrarater ICC | Intrarater LCB | Intrarater UCB | Interrater ICC | Interrater LCB | Interrater UCB |
| --- | --- | --- | --- | --- | --- | --- |
| Fixation duration | | | | | | |
| Mouth | 0.91 | 0.90 | 0.92 | 0.86 | 0.84 | 0.87 |
| Eye | 0.92 | 0.91 | 0.93 | 0.88 | 0.87 | 0.89 |
| Ear | 0.76 | 0.74 | 0.78 | 0.62 | 0.58 | 0.65 |
| Nose | 0.87 | 0.86 | 0.88 | 0.81 | 0.80 | 0.83 |
| Chin | 0.81 | 0.80 | 0.82 | 0.06 | 0.00 | 0.12 |
| Other | 0.83 | 0.82 | 0.84 | 0.65 | 0.62 | 0.68 |
| Fixation density | | | | | | |
| Mouth | 0.91 | 0.90 | 0.92 | 0.86 | 0.84 | 0.87 |
| Eye | 0.91 | 0.90 | 0.92 | 0.85 | 0.83 | 0.86 |
| Ear | 0.72 | 0.70 | 0.74 | 0.58 | 0.54 | 0.62 |
| Nose | 0.83 | 0.82 | 0.84 | 0.72 | 0.69 | 0.75 |
| Chin | 0.61 | 0.58 | 0.63 | −0.18 | −0.23 | −0.13 |
| Other | 0.80 | 0.79 | 0.81 | 0.61 | 0.57 | 0.64 |

Reliability is designated as follows: <0.20, poor; 0.21-0.40, fair; 0.41-0.60, moderate; 0.61-0.80, good; and >0.80, very good.
ICC, Intraclass correlation; LCB, lower confidence boundary 2.5%; UCB, upper confidence boundary 97.5%.
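
The reliability designations in the footnote to Table I can be expressed as a small lookup. The function name is hypothetical. Values that fall exactly on a boundary the footnote leaves open (such as 0.20) are placed in the lower category here, which is an assumption.

```python
def reliability_label(icc: float) -> str:
    """Label an ICC value using the categories from the Table I footnote."""
    # Assumption: boundary values not covered by the footnote (eg, exactly
    # 0.20) are assigned to the lower category.
    if icc <= 0.20:
        return "poor"
    if icc <= 0.40:
        return "fair"
    if icc <= 0.60:
        return "moderate"
    if icc <= 0.80:
        return "good"
    return "very good"
```

Applied to the interrater values, the chin (0.06 and −0.18) is labeled poor, whereas every other area is labeled at least moderate.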

Repeated-measures ANOVA showed a significant effect (P <0.001) for the level of facial attractiveness by dental attractiveness by area of the face (Table II). This was seen for both the total duration of fixations and fixation density (total number of fixations). For both variables, the eyes overwhelmingly were the most salient. Whether for total duration or fixation density, the eyes were viewed more than the mouth, which was viewed more than the nose. The areas of “other” and “ear” rounded out the top 5 response areas (Figs 5 and 6). As the dental attractiveness decreased, visual attention for total duration and fixation density decreased for the eyes and increased on the mouth, sometimes approaching that of the eyes (Figs 7 and 8).

Table II
ANOVA summary for fixation density and fixation duration

| Source of variation | Numerator DF | Denominator DF | F value | P |
| --- | --- | --- | --- | --- |
| Fixation density | | | | |
| Attract | 2 | 102 | 0.08 | 0.9245 |
| IOTN | 3 | 153 | 3.02 | 0.0316 |
| Attract by IOTN | 6 | 306 | 0.6 | 0.7325 |
| Area | 4 | 204 | 1263.61 | <0.0001 |
| Attract by area | 8 | 408 | 3.19 | 0.0016 |
| IOTN by area | 12 | 612 | 21.42 | <0.0001 |
| Attract by IOTN by area | 24 | 1224 | 5.02 | <0.0001 |
| Rater sex | 1 | 51 | 0.24 | 0.6264 |
| Attract by rater sex | 2 | 102 | 0.23 | 0.7943 |
| IOTN by rater sex | 3 | 153 | 0.23 | 0.8785 |
| Attract by IOTN by rater sex | 6 | 306 | 0.73 | 0.6258 |
| Area by rater sex | 4 | 204 | 14.92 | <0.0001 |
| Attract by area by rater sex | 8 | 408 | 0.34 | 0.9481 |
| IOTN by area by rater sex | 12 | 612 | 0.28 | 0.9926 |
| Attract by IOTN by area by rater sex | 24 | 1224 | 0.99 | 0.4733 |
| Fixation duration | | | | |
| Attract | 2 | 102 | 0.02 | 0.9839 |
| IOTN | 3 | 153 | 1.07 | 0.3639 |
| Attract by IOTN | 6 | 306 | 0.95 | 0.462 |
| Area | 4 | 204 | 1223.01 | <0.0001 |
| Attract by area | 8 | 408 | 4.90 | <0.0001 |
| IOTN by area | 12 | 612 | 22.06 | <0.0001 |
| Attract by IOTN by area | 24 | 1224 | 4.17 | <0.0001 |
| Rater sex | 1 | 51 | 0.3 | 0.5833 |
| Attract by rater sex | 2 | 102 | 0.15 | 0.8612 |
| IOTN by rater sex | 3 | 153 | 0.0 | 0.9998 |
| Attract by IOTN by rater sex | 6 | 306 | 0.06 | 0.9991 |
| Area by rater sex | 4 | 204 | 15.54 | <0.0001 |
| Attract by area by rater sex | 8 | 408 | 0.4 | 0.9229 |
| IOTN by area by rater sex | 12 | 612 | 0.98 | 0.4667 |
| Attract by IOTN by area by rater sex | 24 | 1224 | 0.61 | 0.9332 |
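
The degrees of freedom in Table II are consistent with a design of 53 viewers in 2 rater-sex groups, with 3 levels of facial attractiveness, 4 oral conditions (3 AC-IOTN grades plus the cleft), and 5 areas of the face. The sketch below checks this arithmetic using the usual formulas for a design with one between-subjects factor. The source does not state how the degrees of freedom were derived, so this is a consistency check under that assumption, and the names are hypothetical.

```python
from math import prod
from typing import Sequence

# Levels of each within-subject factor, as described in the text.
LEVELS = {"attract": 3, "iotn": 4, "area": 5}
N_VIEWERS = 53      # white viewers included in the analysis
N_SEX_GROUPS = 2    # between-subjects factor: rater sex


def numerator_df(within_factors: Sequence[str]) -> int:
    """Product of (levels - 1) over the within-subject factors in the term.

    Rater sex has 2 levels and contributes a factor of 1, so a term has the
    same numerator DF with or without "by rater sex".
    """
    return prod(LEVELS[f] - 1 for f in within_factors)


def denominator_df(within_factors: Sequence[str]) -> int:
    """Numerator DF times the between-subjects error DF (53 - 2 = 51)."""
    return numerator_df(within_factors) * (N_VIEWERS - N_SEX_GROUPS)
```

For the 3-way interaction, this gives 2 × 3 × 4 = 24 numerator and 24 × 51 = 1224 denominator degrees of freedom, matching the table. An empty factor list corresponds to the rater sex main effect (1 and 51).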