Predicting extracapsular spread of head and neck cancers using different imaging techniques: a systematic review and meta-analysis

Abstract

This study compared the diagnostic ability of computed tomography (CT), magnetic resonance imaging (MRI), ultrasonography (US), and positron emission tomography/CT (PET/CT) for extracapsular spread (ECS) in head and neck cancers. MEDLINE, EMBASE, China National Knowledge Infrastructure, Chinese Biomedical Literature Database, and Sciencepaper Online databases were searched. The mean sensitivity of CT was 0.77, specificity was 0.85, positive likelihood ratio (LR+) was 4.839, negative likelihood ratio (LR−) was 0.287, diagnostic odds ratio (DOR) was 19.239, area under the summary receiver operating characteristic curve (AUC) was 0.8615, and Q* was 0.7922. The mean sensitivity of MRI was 0.85, specificity was 0.84, LR+ was 4.615, LR− was 0.191, DOR was 60.270, AUC was 0.9454, and Q* was 0.8844. The sensitivity and specificity of PET/CT were both 0.86. The mean sensitivity of US was 0.87 and specificity was 0.75. Overall, CT had the lowest sensitivity (P = 0.0355); specificity was similar for all methods (P = 0.1159). CT and MRI had equivalent summary diagnostic efficacy (AUC and Q*) (P > 0.05). This evidence indicates that CT might have a relatively lower sensitivity when diagnosing ECS, and that CT and MRI may be similarly effective in diagnosing ECS. MRI showed positive trends in diagnosing ECS. Evidence was lacking for PET/CT and US diagnosis. More related studies are required to confirm these inconclusive results.

Extracapsular spread (ECS) is the spread of cancer cells beyond the capsule of a metastatic lymph node into the surrounding tissues. ECS is a common phenomenon in head and neck cancer patients and is considered to be a sign of tumour invasion, metastasis, and a poor prognosis.

ECS can be visualized using multiple imaging modalities, which help to determine tumour staging, plan treatment, and predict the prognosis. Commonly used methods include computed tomography (CT), magnetic resonance imaging (MRI), ultrasonography (US), and positron emission tomography/CT (PET/CT). Although several trials have reported the diagnostic efficacy of these imaging methods, the results have been inconsistent. Furthermore, none of these trials compared all of the imaging methods at one time in terms of diagnostic test accuracy. Therefore, the present systematic review and meta-analysis was conducted to determine and compare the diagnostic efficacies of these commonly used imaging methods (CT, MRI, US, and PET/CT) in detecting ECS secondary to head and neck cancers.

Materials and methods

This systematic review was granted an exemption by the local institutional review board. A protocol was finalized a priori, and the review steps listed below were conducted in compliance with the protocol. The study inclusion, risk of bias assessment, and data extraction were conducted by two independent authors, and any discrepancies were resolved by discussion.

Inclusion criteria

The inclusion criteria were as follows: (1) Study type: studies included were diagnostic test accuracy studies that were designed as cohort studies; (2) Participants: head and neck cancer patients with a diagnosis confirmed by pathology; (3) Index test: different imaging modalities including CT, MRI, PET/CT, and US; (4) Reference standard: pathology; and (5) Outcome: true-positive (TP), false-positive (FP), false-negative (FN), and true-negative (TN), or other data that could help to calculate these four outcomes. All of these data were further used to calculate the sensitivity, specificity, positive likelihood ratio (LR+), and negative likelihood ratio (LR−).
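The four counts determine each of these measures directly. The following is a minimal sketch of that calculation; the function name and the example counts are hypothetical and are not taken from any included study. It applies no continuity correction, so it assumes that none of the denominators is zero.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute accuracy measures from a 2x2 table (index test vs pathology)."""
    sensitivity = tp / (tp + fn)   # share of ECS-positive units that test positive
    specificity = tn / (tn + fp)   # share of ECS-negative units that test negative
    lr_pos = sensitivity / (1 - specificity)   # LR+
    lr_neg = (1 - sensitivity) / specificity   # LR-
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts, for illustration only
measures = diagnostic_measures(tp=40, fp=10, fn=10, tn=40)
print(measures)
```

With these illustrative counts, sensitivity and specificity are both 0.8, LR+ is about 4, and LR− is 0.25.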

Search strategy

The strategy was to collect all related studies using electronic and manual searches. The following databases were searched electronically: MEDLINE (via OVID, 1948 to February 12, 2015), EMBASE (via OVID, 1980 to February 12, 2015), Chinese Biomedical Literature Database (1978 to February 12, 2015), and China National Knowledge Infrastructure (1994 to February 12, 2015). The grey literature (studies that have not been formally published) was also searched via Sciencepaper Online (to February 12, 2015). The search strategies were designed with reference to the Cochrane Handbook for Diagnostic Accuracy Reviews, draft version 0.4, which suggests a combination of medical subject heading (MeSH) terms and free text words. The MeSH terms used included ‘lymphatic metastasis’, ‘head and neck neoplasms’, ‘ultrasonography’, ‘magnetic resonance imaging’, ‘positron emission tomography’, and ‘computed tomography’, and the key words used included ‘extracapsular spread’, ‘extranodal spread’, ‘extracapsular invasion’, and ‘extranodal invasion’.

Study selection

Two reviewers scanned the titles and abstracts in duplicate to identify any possibly eligible studies. The full texts of the possibly eligible studies were retrieved for additional evaluation. For the study scanning and final inclusion phases, consistency between reviewers was assessed via the kappa value.
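The text does not say which kappa statistic was used. Assuming Cohen's kappa for two raters making include/exclude decisions, agreement could be computed as in the sketch below. The function name and the counts are hypothetical and do not reproduce the screening data of this review.

```python
def cohen_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions."""
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n    # raw agreement
    a_include = (both_include + only_a) / n         # reviewer A inclusion rate
    b_include = (both_include + only_b) / n         # reviewer B inclusion rate
    # Agreement expected by chance if the two reviewers decided independently
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (observed - expected) / (1 - expected)


# Hypothetical screening counts, for illustration only
kappa = cohen_kappa(both_include=40, only_a=5, only_b=5, both_exclude=50)
print(round(kappa, 3))
```

Kappa discounts the agreement expected by chance, so it is lower than the raw proportion of matching decisions whenever the reviewers disagree at all.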

Quality assessment

The Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) was used for quality assessment. This includes four domains: patient selection, index test, reference standard, and flow and timing. Each domain was assessed in terms of risk of bias, and the first three domains were also assessed regarding applicability. Signalling questions were included to help determine the risk of bias.

In accordance with the QUADAS-2 instructions, the tool was tailored to the study by omitting two signalling questions: ‘If a threshold was used, was it pre-specified?’ and ‘Did all the patients receive the same reference standard?’

The signalling questions that remained in the QUADAS-2 for this meta-analysis were as follows: (1) Patient selection: Was a consecutive or random sample of patients enrolled? Was a case–control design avoided? Did the study avoid inappropriate exclusions? (2) Index test: Were the index test results interpreted without knowledge of the reference standard results? (3) Reference standard: Was the reference standard likely to correctly classify the target condition? Were the reference standard results interpreted without knowledge of the index test results? (4) Flow and timing: Was an appropriate interval allowed between the index tests and reference standard? Did all the patients receive a reference standard? Were all patients included in the analysis?

All the studies were classified as either ‘A’ low risk of bias, ‘B’ unclear risk of bias, or ‘C’ high risk of bias.

Data extraction

The data extraction form used in this study was similar to that used in a previous systematic review reported by Li et al. The form included the following items: re-evaluation of eligibility, basic information from the study (i.e. authors, title, publication date, and correspondence), participant characteristics (i.e. age, sex, inclusion criteria, tumour types, tumour location, clinical examination of the cervical lymph node, surgery types, number of patients included, and follow-up), study location (i.e. country, source of patients), index test and reference standard (i.e. CT details and pathological diagnosis, diagnostic criteria, blinding and consistency of the radiologists), study design (i.e. study types and study), and outcomes (TP, FP, FN, and TN results).

Meta-analysis

The software Meta-DiSc version 1.4 (the Unit of Clinical Biostatistics team of the Ramón y Cajal Hospital, Madrid, Spain) and Stata version 11.0 (StataCorp LP, College Station, TX, USA) were used to perform the meta-analysis. The studies were pooled when no significant clinical or methodological heterogeneity was found. Slight heterogeneity was detected by meta-regression when the number of studies included exceeded 10. Considering the nature of human cervical lymph nodes, the unit chosen for analysis was a single lymph node, neck level, or individual patient. The measures for diagnostic efficacy were sensitivity, specificity, LR+, LR−, and the diagnostic odds ratio (DOR). A summary receiver operating characteristic (SROC) curve was drawn for each meta-analysis, and the area under the curve (AUC) and Q* (the point on the curve where sensitivity and specificity are equal) were calculated. P < 0.05 was considered to be statistically significant.
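To illustrate how the DOR and Q* relate for a single 2x2 table, the sketch below computes the DOR from the four counts and then derives Q* under the assumption of a symmetric SROC curve (constant DOR along the curve). That assumption is made here for illustration only; the text does not state the SROC model that was fitted, and pooled estimates from a meta-analysis need not satisfy this relation. The function names and counts are hypothetical.

```python
import math


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN); assumes no zero cells."""
    return (tp * tn) / (fp * fn)


def q_star_symmetric(dor):
    """Point where sensitivity equals specificity on a symmetric SROC curve.

    If sensitivity = specificity = q, then DOR = (q / (1 - q)) ** 2,
    which gives q = sqrt(DOR) / (1 + sqrt(DOR)).
    """
    root = math.sqrt(dor)
    return root / (1 + root)


# Hypothetical counts, for illustration only
dor = diagnostic_odds_ratio(tp=40, fp=10, fn=10, tn=40)
q_star = q_star_symmetric(dor)
print(dor, q_star)
```

In this example sensitivity and specificity are both 0.8, so the derived Q* equals 0.8, as the definition requires.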

When comparing different imaging strategies, GraphPad Prism 5 software was used (GraphPad Software, San Diego, CA, USA). The Z-test was used for pair-wise comparisons, with the following formula: $Z = (\mathrm{VAL}_1 - \mathrm{VAL}_2)/\sqrt{\mathrm{SE}_1^2 + \mathrm{SE}_2^2}$, where the variable VAL is the mean sensitivity, specificity, AUC, or Q*, and SE is the standard error of the corresponding variable. For comparisons among all imaging modalities, one-way analysis of variance (ANOVA) was used.
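The pair-wise comparison can be sketched as follows. The input values are hypothetical and are not the pooled estimates or standard errors of this review. The two-sided P value from the standard normal distribution is also an assumption, since the text does not state whether one-sided or two-sided testing was used.

```python
import math


def z_test(val1, se1, val2, se2):
    """Compare two independent estimates using their standard errors."""
    z = (val1 - val2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided P value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical estimates and standard errors, for illustration only
z, p = z_test(val1=0.90, se1=0.03, val2=0.80, se2=0.04)
print(round(z, 2), round(p, 4))
```

Here Z is 2.0 and the two-sided P value is roughly 0.046, which would fall just below the 0.05 threshold used in this review.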

Because the included studies presented their outcomes in different ways, the unit of analysis in this systematic review was divided into ‘neck/node level’ (NL), in which the neck level or lymph node was taken as the unit when detecting ECS, and ‘patient level’ (PL), in which an individual patient was taken as the unit when detecting ECS. The two were kept separate because NL analyses count many more units than the actual number of patients included, which could influence the diagnostic efficacy.

Results

Search and study inclusion

A total of 3971 records were retrieved in the initial search. After screening, 3929 records were excluded and 42 remained for further evaluation. After the full texts had been retrieved and a more detailed assessment performed, 15 studies (15 search records) were finally included (Fig. 1). The consistency between the two reviewers during the study scanning phase and the final inclusion phase was acceptable, with kappa values of 0.89 and 1.00, respectively.

Fig. 1
Flow diagram of study inclusion.

Characteristics of the studies included

A total of 15 studies with 1155 patients and 957 neck levels were involved. All patients underwent CT, MRI, US, or PET/CT and were accounted for in the meta-analysis. The details are listed in Table 1.

Table 1
Characteristics of the studies included in this meta-analysis.
Study ID | Country | Study type | Number (M/F) | Age, years, mean (range) | Tumour location | Unit | Imaging used
Kimura 2008 | Japan | RS | 109 (89/20) | 66 (56–76) | HP, OP, larynx, oral floor, tongue, gingiva, cheek, NP, palate, unknown sites | NL (n = 140) | MRI
King 2004 | China | RS | 17 (16/1) | 62.4 (50–85) | HP, HP/OP, larynx, tongue, oral cavity | NL (n = 51) | MRI
Lodder 2013 | Netherlands | RS | 39 (25/14) | 63 (46–85) | Tongue, HP, OP, gingiva, thyroid, maxillary sinus, parotid gland, submandibular gland, NP, palate, larynx, nasal cavity, unknown site | NL (n = 60) | MRI
Steinkamp 2002 | Germany | RS | 69 | 58.2 | Tongue, floor of mouth, retromolar trigone, cheek, gingiva | NL (n = 79) | MRI
Sumi 2011 | Japan | RS | 43 (37/6) | 62 (37–82) | Larynx, HP, OP, oral floor, tongue, gingiva, buccal mucosa, nasopharynx, palate, unknown sites | NL (n = 54) | MRI
Dhanda 2014 | UK | RS | 83 | | Tongue, floor of mouth, cheek, gingiva | PL (n = 83) | MRI
Carvalho 1991 | UK | RS | 28 | | Thyroid, larynx, NP, HP, OP, tongue, parotid gland, submaxillary gland, maxillary sinus, superior segmental oesophagus, mammary gland | NL (n = 21) | CT
King 2004 | China | RS | 17 (16/1) | 62.4 (50–85) | HP, HP/OP, larynx, tongue, oral cavity | NL (n = 51) | CT
Luo 1997 | China | RS | 60 (35/25) | 16–75 | Thyroid, HP, larynx, oral cavity, sinus, OP, parotid gland | NL (n = 101) | CT
Souter 2009 | New Zealand | RS | 127 | | Tongue, larynx, NP, oral cavity, pharynx, unknown site | NL (n = 149) | CT
Steinkamp 1999 | Germany | RS | 165 (136/29) | 57.5 | Head and neck | NL (n = 97) | CT
Chai 2013 | USA | RS | 100 (79/21) | 62 (37–89) | Head and neck | PL (n = 100) | CT
Url 2013 | Austria | RS | 49 (44/5) | 60 (49–71) | Oral cavity, OP, larynx, skin, unknown site | PL (n = 49) | CT
Luo 1997 | China | RS | 60 (35/25) | 16–75 | Thyroid | NL (n = 76) | US
Steinkamp 2003 | Germany | RS | 97 | 58.2 | Thyroid, larynx, NP, HP, OP, tongue, parotid gland, submaxillary gland, maxillary sinus, superior segmental oesophagus | NL (n = 97) | US
Chun 2014 | Korea | RS | 89 (80/9) | 62.5 (32–91) | Larynx | NL (n = 62) | PET/CT
Joo 2013 | Korea | RS | 80 (55/25) | 54 (23–83) | Tongue, floor of mouth, retromolar trigone, cheek, gingiva | NL (n = 71) | PET/CT
CT, computed tomography; F, female; HP, hypopharynx; M, male; MRI, magnetic resonance imaging; NL, neck/node level; NP, nasopharynx; OP, oropharynx; PET/CT, positron emission tomography/computed tomography; PL, patient level; RS, retrospective study; US, ultrasonography.

Risk of bias of the studies included

The risk of bias and applicability of the studies included were assessed using the QUADAS-2 tool. All of the studies had good applicability. The results of the risk of bias assessment showed that one study had a high risk of bias and the remaining 14 studies had an unclear risk of bias (Table 2).

Table 2
Risk of bias in the studies included.
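Q1 to Q9 are the signalling questions a.
Study ID | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Risk of bias b | Applicability (high or low)
Carvalho 1991 | U | Y | U | Y | Y | U | Y | Y | Y | B | High
Chai 2013 | Y | Y | Y | Y | Y | U | N | Y | Y | C | High
Chun 2014 | U | Y | Y | Y | Y | U | Y | Y | Y | B | High
Dhanda 2014 | U | Y | Y | Y | Y | U | U | Y | Y | B | High
Joo 2013 | Y | Y | Y | Y | Y | U | U | Y | Y | B | High
Kimura 2008 | Y | Y | Y | Y | Y | U | U | Y | Y | B | High
King 2004 | U | Y | U | Y | Y | U | U | Y | Y | B | High
Lodder 2013 | Y | Y | Y | Y | Y | U | U | Y | Y | B | High
Luo 1997 | U | Y | U | Y | Y | U | Y | Y | Y | B | High
Souter 2009 | Y | Y | Y | Y | Y | U | U | Y | Y | B | High
Steinkamp 1999 | U | Y | U | U | Y | U | U | Y | Y | B | High
Steinkamp 2002 | U | Y | U | U | Y | U | U | Y | Y | B | High
Steinkamp 2003 | U | Y | U | U | Y | U | U | Y | Y | B | High
Sumi 2011 | U | Y | U | U | Y | U | U | Y | Y | B | High
Url 2013 | U | Y | U | Y | Y | U | U | Y | Y | B | High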
N, no; U, unknown; Y, yes.

a Patient selection: (1) Was a consecutive or random sample of patients enrolled? (2) Was a case–control design avoided? (3) Did the study avoid inappropriate exclusions? Index test: (4) Were the index test results interpreted without knowledge of the reference standard results? Reference standard: (5) Was the reference standard likely to correctly classify the target condition? (6) Were the reference standard results interpreted without knowledge of the index test results? Flow and timing: (7) Was an appropriate interval allowed between the index tests and reference standard? (8) Did all the patients receive a reference standard? (9) Were all patients included in the analysis?

b A, low risk of bias; B, unclear risk of bias; C, high risk of bias.
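The review does not state how the nine answers were combined into an overall grade. One rule that is consistent with every row of Table 2 is: any ‘N’ gives grade C, otherwise any ‘U’ gives grade B, and all ‘Y’ gives grade A. The sketch below encodes that assumed rule; the function name is hypothetical, and the two example rows are copied from Table 2.

```python
def overall_risk_of_bias(answers):
    """Grade a study from its signalling-question answers ('Y', 'N' or 'U').

    Assumed rule (inferred from Table 2, not stated by the authors):
    any 'N' -> 'C' (high), else any 'U' -> 'B' (unclear), else 'A' (low).
    """
    if "N" in answers:
        return "C"
    if "U" in answers:
        return "B"
    return "A"


# Answers for questions 1-9, copied from Table 2
chai_2013 = "Y Y Y Y Y U N Y Y".split()
joo_2013 = "Y Y Y Y Y U U Y Y".split()

print(overall_risk_of_bias(chai_2013), overall_risk_of_bias(joo_2013))
```

Under this rule, Chai 2013 is graded C because of the ‘N’ for question 7, and Joo 2013 is graded B, matching the table.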

Evaluation of diagnostic ability

MRI

Six studies reported MRI results. For the neck/node level (NL), MRI had a mean sensitivity of 0.85 (95% confidence interval (CI) 0.80–0.89), specificity of 0.84 (95% CI 0.77–0.90), LR+ of 4.615 (95% CI 2.255–9.447), LR− of 0.191 (95% CI 0.072–0.509), and DOR of 60.270 (95% CI 9.314–390.00) (Fig. 2). The AUC was 0.9454 and the Q* was 0.8844. For the patient level (PL), only one study was included; the sensitivity was 0.08 (95% CI 0.02–0.22) and the specificity was 1.00 (95% CI 0.92–1.00).

Fig. 2
Results of the meta-analysis of magnetic resonance imaging for the diagnosis of extracapsular spread (data are presented as the mean and 95% confidence interval). (A) Sensitivity; (B) specificity; (C) positive likelihood ratio; (D) negative likelihood ratio; (E) diagnostic odds ratio.

The effectiveness of different MRI diagnostic criteria was also assessed (Table 3). Fourteen criteria were gathered. The criterion ‘short-axis diameter >15 mm’ had the highest sensitivity of 0.93. The criteria ‘infiltration of adjacent planes’, ‘time–signal intensity curve (TIC) (>44% nodal area with type 2 TIC pattern)’, and ‘short-axis diameter >25 mm’ had specificities of 100%. The criteria ‘shaggy margin (CET1WI)’, ‘TIC (>44% nodal area with type 2 TIC pattern)’, and ‘short-axis diameter >25 mm’ had the highest accuracies (89%). TIC, which can be obtained from MRI images processed using the software ImageJ (National Institutes of Health, Bethesda, MD, USA) and Mathematica (Wolfram Research, Champaign, IL, USA), was classified automatically on the basis of the increment ratio, the time to peak enhancement (T_peak), and the washout ratio (WR) into four types (types 1–4). The type 2 TIC pattern, which is used as one of the comprehensive diagnostic criteria for ECS, has the following characteristics: increment ratios greater than 20% and peak times equal to or longer than 120 s.
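The type 2 definition and the ‘>44% nodal area’ criterion can be sketched as below. The per-pixel representation, the function names, and the example values are hypothetical. The thresholds that separate the other three TIC types, including the washout ratio, are not given in the text and are therefore not modelled.

```python
def is_type2_tic(increment_ratio, t_peak_s):
    """Type 2 pattern as described: increment ratio > 20% and T_peak >= 120 s."""
    return increment_ratio > 0.20 and t_peak_s >= 120


def meets_tic_criterion(pixel_curves, area_threshold=0.44):
    """True if more than 44% of the nodal area shows the type 2 pattern.

    pixel_curves: list of (increment_ratio, t_peak_s) pairs, one per pixel
    of the node (an assumed data layout, for illustration only).
    """
    type2_pixels = sum(is_type2_tic(ratio, t_peak) for ratio, t_peak in pixel_curves)
    return type2_pixels / len(pixel_curves) > area_threshold


# Hypothetical node with four pixels; two of them show the type 2 pattern
node = [(0.30, 150), (0.25, 120), (0.10, 200), (0.40, 60)]
print(meets_tic_criterion(node))
```

In this example half of the pixels are type 2, which exceeds the 44% threshold, so the node would be read as positive under this criterion.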
