As there is currently no internationally accepted outcome measurement tool for complete bilateral cleft lip and palate (CBCLP), the goal of this prospective study was to develop a numerical evaluation scale that allows reliable scoring of this cleft deformity. Our cohort comprised 121 Indian subjects with CBCLP who underwent surgical repair (mean age at time of surgery 6.53 months) using a modified Millard technique. A panel of three professionals evaluated each subject’s outcome of bilateral cleft lip repair 6 months postoperatively on two-dimensional (2D) full-face photographs in the frontal view and worm’s eye view. A simple two-point rating system was applied to separately analyse a total of 12 components of lip, nose, and scar. The mean scores for the analysed anatomical areas were 2.2 ± 1.01 (max = 4) for nose, 5.4 ± 1.54 (max = 8) for lip and scar combined, and 1.9 ± 1.3 (max = 3) for scar alone, with a total score of 7.7 ± 2.21 (max = 12), indicating a good surgical outcome. The inter-examiner intra-class correlation coefficient (ICC) for nose, lip, scar, and total score was 0.836, 0.889, 0.723, and 0.927, respectively, indicating a strong level of repeatability and reliability that was highly significant (P < 0.001). In conclusion, we developed and tested a scoring system for measuring outcomes in CBCLP that offers simplicity of use, reliability, and reproducibility.

Although there is consensus among healthcare providers that evaluation of nasolabial morphology in cleft lip and palate (CLP) is one of the key elements in measuring treatment success, there is still no commonly accepted rating system. Previous studies on subjects with unilateral cleft lip and palate (UCLP) have used direct and indirect anthropometrics of hard and soft tissues, panel-based evaluations by professionals and laymen, linear and area analyses of two-dimensional (2D) photographs, and, more recently, three-dimensional (3D) images, primarily measuring and comparing the symmetry of the cleft side and the non-cleft side.

Outcome evaluation in bilateral cleft lip and palate (BCLP) poses a greater challenge to the assessor, primarily because there is no normal side for comparison. Moreover, because of the multidimensional complexity of the initial cleft deformity, this patient subgroup is the most likely to require multiple revision surgeries until adulthood.

There is currently no scale available that is “tailored” to the evaluation of the bilateral cleft deformity and that does not depend on expensive technology or on time-consuming, complicated quantitative and anthropometric measurements of angles and landmarks.

To overcome these shortcomings, the aim of our study was to develop and test a binary scoring system and assessment scale that is simple to use, reliable, and reproducible, and that allows the analysis of 12 relevant components of the nasolabial area of the bilateral cleft deformity in a time-efficient manner.

Materials and methods

Between February 2007 and February 2014, a total of 121 consecutive Indian patients (89 males, 32 females) with non-syndromic complete bilateral cleft lip and palate (CBCLP) were selected for this prospective study. All subjects underwent primary lip repair without nasal correction using a modified Millard technique by the age of 12 months (mean age at time of surgery 6.53 months, range 3–12 months), performed by two senior surgeons (K.B. and P.N.S.) at the Smile Train Cleft Palate Centre, Bhagwan Mahaveer Jain Hospital, Bangalore, India. Two-dimensional (2D) full-face photographs in the frontal view and worm’s eye view were acquired 6 months postoperatively by the same experienced photographer using a Canon EOS 500D digital camera (Canon Inc., Tokyo, Japan). Both views were taken with the head in a rest position against a green background. The frontal view was taken with both ears visible to minimize rotation and with minimal nostril show to minimize tilt. The worm’s eye view was taken with the nasal tip projected between the medial canthi and the eyebrows (Fig. 1). The photographic records were stored as JPEG files and further processed with ImageJ software.

Fig. 1. Full-face photographs for the assessment of the bilateral cleft lip repair outcome, taken 6 months postoperatively in (A) frontal view and (B) worm’s eye view.

Reformatted, equal-sized images in the frontal and the worm’s eye view were printed on high-quality photographic paper and distributed among the professional panellists for evaluation. The panel consisted of three former cleft surgery fellows, each with 1 year of postgraduate experience, who were not involved in the treatment of the patients enrolled in this study.

The outcome of bilateral cleft repair was assessed with a binary numerical evaluation scale covering three major evaluation areas (lip, nose, and scar), subdivided into a total of 12 anatomical subunits and items, each of which was analysed separately for 15 seconds and scored. A simple two-point rating system was applied: a score of 1 was given if the parameter was found to be acceptable (normal), and a score of 0 was given if the postoperative result was considered unsatisfactory for that particular parameter (Table 1).

Table 1
Evaluation scale for lip, nose and scar of the postoperative residual bilateral cleft deformity.
Evaluation area | Score 0 | Score 1
Lip
1. Prolabium width (the prolabium width should be normal (3–4 mm from the midline; total width: 6–8 mm) when compared to lip dimension) | Increased | Normal
2. Cupid’s bow (the Cupid’s bow peaks should be on the same horizontal plane) | Absent | Present
3. Vermilion symmetry (equal width of wet and dry vermilion on both sides with smooth continuity) | Absent | Present
4. Vermilion notching (the vermilion should be a continuous smooth curved line without notching) | Present | Absent
5. Premaxillary show (absence of premaxillary show at rest) | Present | Absent
Nose
1. Columella height (the columella height should be adequate (4–5 mm), not reduced or absent) | Reduced | Normal
2. Nostril symmetry (nostrils should be symmetrical on either side) | Absent | Present
3. Bialar width (bialar width should be within the medial canthus of the eyes) | Increased | Normal
4. Nasal tip (a qualitative evaluation of the nasal shape; the nasal tip should be well defined) | Broad | Well defined
Scar
1. Discoloration | Present | Absent
2. Suture marks/spreading of scar | Present | Absent
3. Hypertrophy | Present | Absent

Nose items were scored on the frontal view (nasal tip) and the worm’s eye view (columella height, nostril symmetry, bialar width); lip and scar items were scored on the frontal view. A score of 1 was given if the analysed postoperative component was acceptable (normal) and a score of 0 if it was unsatisfactory. The maximum obtainable score for all subunits was 12.
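The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of how the 12 binary item scores add up to area and total scores; the item identifiers and the function name `score_subject` are hypothetical and are not part of the original study materials.

```python
# Item identifiers are illustrative names for the 12 components in Table 1.
LIP_ITEMS = ("prolabium_width", "cupids_bow", "vermilion_symmetry",
             "vermilion_notching", "premaxillary_show")
NOSE_ITEMS = ("columella_height", "nostril_symmetry", "bialar_width", "nasal_tip")
SCAR_ITEMS = ("discoloration", "suture_marks_spreading", "hypertrophy")
ALL_ITEMS = LIP_ITEMS + NOSE_ITEMS + SCAR_ITEMS


def score_subject(ratings):
    """Sum one rater's binary item scores (1 = acceptable, 0 = unsatisfactory)."""
    for item in ALL_ITEMS:
        # A missing item yields None, which is rejected like any other invalid score.
        if ratings.get(item) not in (0, 1):
            raise ValueError(f"{item}: expected a score of 0 or 1")
    lip = sum(ratings[item] for item in LIP_ITEMS)
    nose = sum(ratings[item] for item in NOSE_ITEMS)
    scar = sum(ratings[item] for item in SCAR_ITEMS)
    return {"lip": lip, "nose": nose, "scar": scar, "total": lip + nose + scar}


# Hypothetical subject: every component acceptable except nasal tip and hypertrophy.
example = {item: 1 for item in ALL_ITEMS}
example["nasal_tip"] = 0
example["hypertrophy"] = 0
print(score_subject(example))
# {'lip': 5, 'nose': 3, 'scar': 2, 'total': 10}
```

With five lip items, four nose items, and three scar items, the area maxima are 5, 4, and 3, which gives the overall maximum of 12.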

Statistical analysis

All statistical analyses were performed with SPSS 16.0 software (Chicago, IL, USA). To evaluate the inter-examiner reliability, that is, the agreement between the three professional raters in scoring with our assessment scale, an intra-class correlation coefficient (ICC) was calculated. Higher ICC values indicate greater inter-examiner reliability: an ICC estimate of 1 indicates perfect agreement, 0 indicates only random agreement, and negative values indicate systematic disagreement. Cicchetti provides commonly cited cut-offs for qualitative ratings of agreement based on ICC values: inter-examiner reliability is poor for ICC values <0.40, fair for values between 0.40 and 0.59, good for values between 0.60 and 0.74, and excellent for values between 0.75 and 1.0.
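For readers who want to reproduce this kind of analysis outside SPSS, the following is a minimal sketch of one common ICC form. The text does not state which ICC model was selected in SPSS, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, as are the function name `icc_2_1` and the example ratings.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per subject and one column per rater.
    Assumption: this ICC form is illustrative; the study does not name its model.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a full subjects x raters table (at least 2 x 2)")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum((scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(n) for j in range(k))
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Undefined (division by zero) if every rating in the table is identical.
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical total scores (0-12) given by three raters to five subjects.
ratings = [
    [10, 9, 10],
    [7, 8, 7],
    [4, 5, 5],
    [12, 11, 12],
    [6, 6, 7],
]
print(round(icc_2_1(ratings), 3))
```

Other ICC forms (for example consistency rather than absolute agreement, or average-rater rather than single-rater estimates) give different values from the same data, so the model should be reported alongside the coefficient.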

Pilot study

Initially, a pilot study was conducted on 10 CBCLP cases for training and calibration purposes and to test the inter-examiner reliability of the assessment scale among the three professional panellists (former cleft surgery fellows). The pilot study yielded statistically significant intra-class correlation coefficient (ICC) values for the inter-examiner reliability of nose, lip, scar, and total scores, which were 0.4977 (P = 0.005), 0.8334 (P = 0.001), 0.3237 (P = 0.047), and 0.8072 (P = 0.001), respectively. Agreement was excellent for lip and total scores, fair for nose scores, and poor for scar scores.
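The qualitative labels used here follow directly from the Cicchetti cut-offs quoted in the statistical analysis. As a small sketch (the function name `cicchetti_category` is hypothetical), the mapping can be written as:

```python
def cicchetti_category(icc):
    """Map an ICC value to the Cicchetti agreement label quoted in the text."""
    if icc > 1.0:
        raise ValueError("ICC cannot exceed 1")
    if icc < 0.40:
        return "poor"        # includes negative values (systematic disagreement)
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# ICC values reported for the pilot study (n = 10).
pilot = {"nose": 0.4977, "lip": 0.8334, "scar": 0.3237, "total": 0.8072}
for area, icc in pilot.items():
    print(area, cicchetti_category(icc))
# nose fair
# lip excellent
# scar poor
# total excellent
```

Applied to the main-study values reported in the abstract (0.836, 0.889, 0.723, and 0.927), the same function returns excellent for nose, lip, and total scores and good for scar scores.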

Although agreement for scar and nose scores was not ideal, we considered these values acceptable for a pilot study, given that the raters were not yet sufficiently familiar with the new assessment scale and that, in a small sample (n = 10), a few discordant ratings can shift the ICC considerably. The larger sample of the main study (n = 121) gives more stable ICC estimates, which, together with the raters’ calibration, might explain the higher ICC values obtained there.


Results

The total and individual grades and scores given for each anatomical component of the residual bilateral cleft lip deformity, assessed 6 months postoperatively, are summarized in Table 2. Secondary revision surgery has so far been needed in only 16 of the 121 cases (13%), generally indicating a good outcome of the primary treatment. Patients who underwent surgical revision had a significantly lower mean ± SD total score (5.3 ± 3.07) than the cohort as a whole (7.7 ± 2.21; maximum score = 12).

Table 2
Summary of total and individual grades and scores given for anatomical components of the residual bilateral cleft lip deformity.
Grades | All components (nose, lip and scar): number of patients (%) | Grades | Nose: number of patients (%) | Grades | Lip and scar: number of patients (%)
Excellent (10–12) | 34 (28%) | Excellent (4) | 11 (9%) | Excellent (7–8) | 34 (28%)
Good (8–9) | 47 (39%) | Good (3) | 42 (35%) | Good (5–6) | 60 (50%)
Fair (6–7) | 16 (13%) | Fair (2) | 43 (35%) | Fair (3–4) | 22 (18%)
Poor (<6) | 24 (20%) | Poor (≤1) | 25 (21%) | Poor (≤2) | 5 (4%)
Total (%) | 121 (100%) | Total (%) | 121 (100%) | Total (%) | 121 (100%)
Mean score ± SD | 7.7 ± 2.21 | Mean score ± SD | 2.2 ± 1.01 | Mean score ± SD | 5.4 ± 1.54
Maximum score | 12 | Maximum score | 4 | Maximum score | 8
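As an illustration of how the grade bands for the total score in Table 2 can be applied, the sketch below maps a total score to its grade and recomputes the reported percentages from the patient counts. The function name `grade_total` is hypothetical; the score bands and counts are those given in the table.

```python
def grade_total(score):
    """Return the Table 2 grade for a total score between 0 and 12."""
    if not 0 <= score <= 12:
        raise ValueError("total score must lie between 0 and 12")
    if score >= 10:
        return "Excellent"
    if score >= 8:
        return "Good"
    if score >= 6:
        return "Fair"
    return "Poor"


# Patient counts per grade for all components, as reported in Table 2.
counts = {"Excellent": 34, "Good": 47, "Fair": 16, "Poor": 24}
n_patients = sum(counts.values())
percentages = {grade: round(100 * c / n_patients) for grade, c in counts.items()}

print(n_patients)    # 121
print(percentages)   # {'Excellent': 28, 'Good': 39, 'Fair': 13, 'Poor': 20}
print(grade_total(8))  # Good
```

A score equal to the cohort mean, rounded to 8, falls in the "Good" band.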
