Chapter 5. How to Appraise and Use an Article about Diagnosis
Romina Brignardello-Petersen, D.D.S., M.Sc., Ph.D.; Alonso Carrasco-Labra, D.D.S., M.Sc., Ph.D.; Michael Glick, D.M.D.; Gordon H. Guyatt, M.D., M.Sc.; and Amir Azarpazhooh, D.D.S., M.Sc., Ph.D.
In This Chapter:
Clinical Questions of Diagnosis
What Study Design Best Addresses Questions About Diagnosis?
How Serious Is the Risk of Bias?
Introduction
In the previous chapters in this book, we introduced the process of evidence-based dentistry (EBD)1 and explained how to search for evidence to inform clinical practice2 and how to use articles about therapy or prevention3 and harm.4 In this chapter, we explain how to use an article to inform clinical decisions regarding questions of diagnosis. We introduce and describe the basic concepts needed to understand diagnostic test studies, and we explain how to use the concepts to appraise such studies critically. In subsequent chapters we will describe how to use other types of study designs.
Box 5.1. Clinical Scenario
During a meeting of the clinicians in your practice, a colleague raises the idea of acquiring a new laser device to help with the diagnosis of caries. This clinician explains that a sales representative described how this device was excellent for detecting noncavitated occlusal carious lesions, and your colleague wants to know your opinion. Because you believe that evidence is necessary to inform this important decision for your practice, you tell your colleague that you need to conduct a literature search and a critical appraisal to inform your opinion.
Clinical Questions of Diagnosis
Dentists face diagnosis questions every day. With most patients, dentists need to use diagnostic tests before they can establish a course of action to follow. In the context of everyday practice, a diagnostic test can refer to any test performed in a laboratory or any information obtained from a medical history or clinical examination that is used to confirm or rule out a specific diagnosis.5 Dental radiographs are a common diagnostic test used by clinicians in many dental specialties, but other procedures such as performing a vitality test and measuring probing depths can be considered diagnostic tests as well.
When facing diagnosis questions, clinicians need to modify the classic Population, Intervention, Comparison, Outcome (PICO) framework for stating questions. The population is the patients of interest (that is, those to whom we will apply the diagnostic test when there is suspicion of a condition or disease). The intervention is the diagnostic test (that is, the test about which we are interested in learning). The comparison is the test we use as a reference against which to compare the diagnostic test; this is called the reference standard (or gold standard).6 Finally, the outcomes are either the health consequences after using the diagnostic test or the measures that describe the performance of a diagnostic test. Both of these cases are described later in this chapter. Table 5.1 shows examples of diagnostic test questions and their PICO components.
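As an illustration only, the modified PICO framework can be written down as a small data structure. The following Python sketch is not part of the original framework; the class name, field names, and the wording of the example question are assumptions made for this illustration, loosely based on the clinical scenario in this chapter.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DiagnosticQuestion:
    """PICO components adapted to a question of diagnosis (hypothetical helper)."""
    population: str               # patients in whom the condition is suspected
    diagnostic_test: str          # the "intervention": the test we want to learn about
    reference_standard: str       # the "comparison": the reference (gold) standard
    outcomes: Tuple[str, ...]     # test performance measures or health consequences

    def as_sentence(self) -> str:
        """Combine the four components into a single answerable question."""
        return (
            f"In {self.population}, how does {self.diagnostic_test} compare with "
            f"{self.reference_standard} in terms of {', '.join(self.outcomes)}?"
        )


# Illustrative wording only; adapt each component to the question at hand.
question = DiagnosticQuestion(
    population="patients with suspected noncavitated occlusal carious lesions",
    diagnostic_test="a laser fluorescence device",
    reference_standard="enameloplasty with direct observation of the lesion",
    outcomes=("sensitivity", "specificity"),
)
print(question.as_sentence())
```

Writing the question this way makes it easy to see whether any of the four components is still missing before starting a literature search.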
What Study Design Best Addresses Questions About Diagnosis?
Clinicians can answer diagnostic questions by conducting studies using one of the following two types of designs: randomized clinical trials and cross-sectional studies. Ideally, a study’s investigators would treat a diagnostic test as an intervention. Researchers would randomize patients to receive one of two diagnostic strategies, which for the purposes of this chapter we will call strategy A and strategy B. Clinicians would manage patients according to the results of the test, including providing whatever interventions they think might be appropriate on the basis of test results. Ultimately, they would measure patient-important outcomes in the group whose participants received test strategy A and the group whose participants received test strategy B.
For example, when assessing how useful laser fluorescence is for detecting early interproximal carious lesions, researchers would randomly assign patients to undergo diagnosis with either laser fluorescence or bite-wing radiographs. Clinicians would then treat patients according to the results of whichever test the patients received. The investigators would follow up with all patients to determine, for example, how many participants in each group have carious lesions extending into the dentin and what each participant’s need for restorations is (outcomes). To date, we have not been able to identify any studies of this design in the dental literature; therefore, we do not cover them further in this chapter.
In studies whose investigators address the accuracy of a diagnostic test, a group of patients undergo both tests (that is, the diagnostic test and the reference standard). The reference standard is considered to be the way to know whether the disease or condition is truly present or absent. The investigators compare the results of the diagnostic test with the reference standard as a way to determine the diagnostic properties of the diagnostic test.
For example, with the aim of assessing the accuracy of thermal and electrical dental pulp tests to diagnose pulp vitality, Villa-Chavez and colleagues7 conducted the cold, hot, or electrical pulpal tests (three different diagnostic tests) in 110 patients. They used as the reference standard the direct observation of the pulp after opening the pulp chamber, and then they estimated the sensitivity, specificity, predictive values, and accuracy of each of the diagnostic tests by comparing the results with those of the reference standard.
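To make these measures concrete, the sketch below computes sensitivity, specificity, predictive values, and accuracy from the four cells obtained when a diagnostic test is compared with the reference standard. It is a minimal illustration: the class name is an assumption, and the counts are hypothetical and are not taken from the study by Villa-Chavez and colleagues or from any other cited study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwoTable:
    """Counts from comparing a diagnostic test with the reference standard."""
    true_positives: int    # test positive, reference standard positive
    false_positives: int   # test positive, reference standard negative
    false_negatives: int   # test negative, reference standard positive
    true_negatives: int    # test negative, reference standard negative

    @property
    def sensitivity(self) -> float:
        # Proportion of patients with the condition whom the test detects.
        return self.true_positives / (self.true_positives + self.false_negatives)

    @property
    def specificity(self) -> float:
        # Proportion of patients without the condition whom the test clears.
        return self.true_negatives / (self.true_negatives + self.false_positives)

    @property
    def positive_predictive_value(self) -> float:
        # Proportion of positive test results that are correct.
        return self.true_positives / (self.true_positives + self.false_positives)

    @property
    def negative_predictive_value(self) -> float:
        # Proportion of negative test results that are correct.
        return self.true_negatives / (self.true_negatives + self.false_negatives)

    @property
    def accuracy(self) -> float:
        # Proportion of all test results that agree with the reference standard.
        total = (self.true_positives + self.false_positives
                 + self.false_negatives + self.true_negatives)
        return (self.true_positives + self.true_negatives) / total


# Hypothetical counts for 100 patients, chosen only for illustration.
table = TwoByTwoTable(true_positives=40, false_positives=5,
                      false_negatives=10, true_negatives=45)
for name in ("sensitivity", "specificity", "positive_predictive_value",
             "negative_predictive_value", "accuracy"):
    print(f"{name}: {getattr(table, name):.2f}")
```

Note that the predictive values, unlike sensitivity and specificity, depend on how common the condition is among the patients studied.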
Ideally, clinicians would have available the results of systematic reviews of primary studies addressing test properties. We found that few such systematic reviews have been published in the dental literature. In the absence of reviews, clinicians look to the results of the best single primary diagnostic studies to inform their practice. In this chapter, we address such studies; in subsequent chapters, we will describe how to use systematic reviews.
During your search, you found a primary study8 that seems to answer the question at hand. The investigators of the study8 addressed whether using Diagnodent, a laser fluorescence device, was accurate for diagnosing noncavitated occlusal carious lesions extending to the dentin. You read the abstract of this study in which the researchers compared the diagnoses they obtained by using the laser fluorescence device with the diagnoses they obtained by doing an enameloplasty and by observing the carious lesions directly. The authors claimed that “the laser device had an acceptable performance, this equipment should be used as an adjunct method to visual inspection to avoid false positive results.”8 To find out whether the methods and results support the authors’ conclusion, you retrieve the article and begin a critical appraisal.
Critically Appraising a Study Assessing the Properties of a Diagnostic Test to Inform Clinical Decisions
The process of using an article from the dental literature involves three steps: an assessment of the risk of bias, an assessment of the results themselves, and an assessment of the applicability of the results.9
How Serious Is the Risk of Bias?
The extent to which a study’s results are likely to be correct for the sample of patients enrolled depends on how well the study was designed and conducted.10 Factors to consider in judging the risk of bias of diagnostic test studies include whether the participating patients presented a diagnostic dilemma, whether the reference standard was appropriate and independent from the diagnostic test, whether the investigators who interpreted each test were blinded to the results of the other test, and whether all patients underwent both the diagnostic test and the reference standard irrespective of the results of the diagnostic test.11 Table 5.2 (references 8 and 12–14) lists questions that address the risk of bias of diagnostic test studies and presents examples from the dental literature.
Question | Examples | Explanation
Did participating patients present a diagnostic dilemma? | “135 patients (161 impacted teeth) . . . who underwent additional examination by cone-beam CT* because of panoramic features suggesting a close relationship of the tooth root to the mandibular canal were included.”12 “Each surface had to meet one of the three listed criteria to be included in the study: macroscopically intact occlusal fissure that exhibited absolutely no signs of caries; occlusal fissure with a discolored, brown or black area at the clinical examination without cavitation; grey discoloration from the underlying dentin without enamel breakdown.”8 | In both examples, the authors indicated that patients included in the study had characteristics that presented a diagnostic dilemma, such as features suggesting proximity of the third-molar root to the alveolar canal, or occlusal appearances representing a spectrum of patients, ranging from those who seemed healthy to those who seemed diseased. Therefore, in both of these examples, there is a low risk of bias on the basis of this criterion.
Did investigators compare the results of a test with an appropriate, independent reference standard? | “Panoramic and cone-beam CT features were correlated with the intraoperative findings, that is, the presence or absence of the inferior alveolar neurovascular bundle exposure at the time of extraction.”12 “All teeth included in the study underwent endodontic treatment, and the presence or absence of bleeding pulp in the pulp chamber on access was used as a true positive or true negative.”13 | In the first example, the reference standard was the direct observation of the inferior alveolar neurovascular bundle during surgery, which most clinicians would agree is the best method to diagnose proximity of the third-molar roots with the alveolar canal. In the second example, the clinician must judge how appropriate it is that only bleeding was considered as a sign of pulp vitality and whether this may have affected the results of the study. In both cases, the reference standard was independent from the diagnostic test, which was the cone-beam CT scan and thermal tests, respectively.
Were the investigators who interpreted the test and reference standard blinded to the other results? | “LFE† scores were analyzed using the manufacturer’s cutoff points, taking into account the absence of a histological examination and the in vivo nature of the study. VE‡ and RE§ data were analyzed using the ICDAS II¶ method (Table 5.3), and modified criteria were validated in vivo by the method of . . .”14 | When the authors described the way in which the tests were interpreted, they did not mention whether this was done by different clinicians or by the same clinician in a blinded fashion. Therefore, clinicians should consider the potential for bias owing to this factor.
Did investigators perform the same reference standard with all patients regardless of the results of the test under investigation? | “The validation method for diagnosis (gold standard) was determined by fissure eradication or enameloplasty using an invasive fissure sealing kit. . . . However, not all fissures could be validated as this is an invasive method. Thus, for ethical reasons, opening of the cavities occurred only in cases when both examiners agreed to the presence of dentin caries.”8 | The authors mentioned not applying the reference standard to some patients, which is not appropriate (that is, it seems likely that the authors assumed that those patients were healthy). There is a risk that the investigators misclassified those patients for whom a substandard reference was used, which could have biased the results. Because this misclassification was done owing to ethical reasons, the clinician should judge how likely it is that bias could have occurred and what the magnitude of this bias could have been.
* CT = computed tomography.
† LFE = laser fluorescence examination.
‡ VE = visual examination.
§ RE = radiographic examination.
¶ ICDAS = International Caries Detection and Assessment System.
 | Calculation | Interpretation
Likelihood Ratio for Positive Test (LR+) | | When the result of the diagnostic test is positive, the probability of having the disease is high. The exact probability of having the disease depends on the pretest probability and the specific LR+ result.†
Likelihood Ratio for Negative Test (LR-) | | When the result of the diagnostic test is negative, the probability of having the disease is low. The exact probability of having the disease depends on the pretest probability and the specific LR- result.†
* Described on the basis of numbers from Figure 5.2.
† The pretest probability is specific to individual patients and is estimated by the physician on the basis of the patient’s medical history and clinical examination results. The shift from pretest to posttest probabilities can be estimated easily using an LR nomogram.11,17
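As an alternative to a nomogram, the shift from pretest to posttest probability can be worked out by converting the pretest probability to odds, multiplying by the likelihood ratio, and converting back. The sketch below uses the standard definitions LR+ = sensitivity / (1 − specificity) and LR- = (1 − sensitivity) / specificity. The function names and every number in it are hypothetical; they do not come from Figure 5.2 or from any cited study.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1)")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability into a posttest probability via odds."""
    if not 0.0 <= pretest_probability < 1.0:
        raise ValueError("pretest_probability must be in [0, 1)")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical test properties and pretest probability, for illustration only.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.80, specificity=0.90)
after_positive = posttest_probability(0.30, lr_pos)
after_negative = posttest_probability(0.30, lr_neg)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"Posttest probability after a positive result: {after_positive:.2f}")
print(f"Posttest probability after a negative result: {after_negative:.2f}")
```

With these illustrative numbers, a positive result raises a 30% pretest probability to roughly 77%, and a negative result lowers it to roughly 9%. The same test would produce different posttest probabilities in a patient with a different pretest probability.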
Did participating patients present a diagnostic dilemma?
Researchers performing diagnostic test studies should select patients who are representative of those to whom the test would be applied in clinical practice. For the results of a diagnostic test to be useful, the results have to discriminate between patients who have and patients who do not have the target condition (for example, in our scenario, the target condition was a noncavitated occlusal carious lesion extending to the dentin) when there is a diagnostic dilemma. If patients clearly had the target condition, or clearly did not, there would not be a need to apply the diagnostic test, and in the setting of a study, the accuracy of the test would be overestimated. For example, if a patient had a cavitated mesio-occlusal carious lesion, the clinician would have no need to use a laser fluorescence device to confirm the presence of a carious lesion. In such a case, in which the carious lesion is obviously present, the device will yield a correct diagnosis most (if not all) of the time. Therefore, for a diagnostic test study to provide a trustworthy assessment of the value of the test, researchers must include patients with early manifestations of the disease, in whom there is doubt regarding the diagnosis, similar to the patients whom clinicians will see in their daily practices.11
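A short calculation shows why including obvious cases inflates the apparent accuracy of a test. The sketch below pools hypothetical counts of diseased patients from two groups, one presenting a diagnostic dilemma and one with obvious disease. The function name and all counts are assumptions chosen for illustration, not data from any study.

```python
from typing import List, Tuple


def pooled_sensitivity(groups: List[Tuple[int, int]]) -> float:
    """Sensitivity across groups given (diseased_patients, detected_by_test) pairs."""
    diseased = sum(n for n, _ in groups)
    detected = sum(d for _, d in groups)
    return detected / diseased


# Hypothetical: 100 patients with early lesions, of whom the test detects 70.
dilemma_patients = (100, 70)
# Hypothetical: 100 patients with obvious cavitated lesions, of whom the test detects 99.
obvious_patients = (100, 99)

dilemma_only = pooled_sensitivity([dilemma_patients])
mixed_sample = pooled_sensitivity([dilemma_patients, obvious_patients])
print(f"Sensitivity in patients with a diagnostic dilemma: {dilemma_only:.3f}")
print(f"Sensitivity when obvious cases are also enrolled:  {mixed_sample:.3f}")
```

In this illustration, the study that also enrolls obvious cases reports a sensitivity of about 0.85, even though the test detects only 70% of the lesions in the patients for whom a clinician would actually use it.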
Did investigators compare the test with an appropriate, independent reference standard?
An appropriate reference standard is also a key aspect in assessing the risk of bias of diagnostic test studies. As described previously, clinicians assess the properties of the diagnostic test by comparing the results with the reference standard, which is considered to be the truth.5,11 Clinicians should consider two aspects when assessing whether the reference standard is appropriate. First, the reference standard should be the test that is accepted most widely as the definitive test to establish a diagnosis.5 For example, to diagnose oral squamous cell carcinoma, the reference standard is a biopsy and histologic confirmation. Sometimes, however, the reference standard used is only the best available method for diagnosing a target condition rather than the ideal method. For example, to identify a carious lesion extending into the dentin, the reference standard would be to use a combination of clinical signs and radiograph images, as opposed to extracting the tooth and performing a histologic confirmation. In cases such as this one, the clinician should judge whether the reference standard used in the study is an acceptable way to arrive at a definitive diagnosis.
Second, the reference standard should be independent from the diagnostic test. This means that the diagnostic test should not be part of the reference standard. For example, if the diagnostic test used to diagnose pulpal vital status was a cold test, and the reference standard was a combination of responses from thermal and electrical tests, the diagnostic test would not be independent from the reference standard. Including the test as part of the reference standard leads to overestimation of test accuracy.11
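The effect of including the diagnostic test in the reference standard can be illustrated with a small hypothetical sample. In the sketch below, a cold test is compared first with the true pulp status and then with a composite reference that declares a tooth vital if either the cold test or the electrical test is positive. The data, the names, and the "either test positive" combination rule are all assumptions made for this illustration.

```python
from typing import Callable, List, NamedTuple, Tuple


class Tooth(NamedTuple):
    truly_vital: bool        # true pulp status (unknown in practice)
    cold_positive: bool      # response to the cold test (the diagnostic test)
    electric_positive: bool  # response to the electrical test


def build_hypothetical_sample() -> List[Tooth]:
    """100 invented teeth; counts chosen only to make the arithmetic easy."""
    return (
        [Tooth(True, True, True)] * 40
        + [Tooth(True, True, False)] * 5
        + [Tooth(True, False, True)] * 5
        + [Tooth(False, True, False)] * 5    # cold test false positives
        + [Tooth(False, False, False)] * 45
    )


def cold_test_accuracy(teeth: List[Tooth],
                       reference: Callable[[Tooth], bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the cold test against a reference."""
    tp = sum(t.cold_positive and reference(t) for t in teeth)
    fn = sum((not t.cold_positive) and reference(t) for t in teeth)
    tn = sum((not t.cold_positive) and not reference(t) for t in teeth)
    fp = sum(t.cold_positive and not reference(t) for t in teeth)
    return tp / (tp + fn), tn / (tn + fp)


teeth = build_hypothetical_sample()
# Independent reference: the true pulp status.
independent = cold_test_accuracy(teeth, lambda t: t.truly_vital)
# Non-independent reference: a composite that contains the cold test itself.
composite = cold_test_accuracy(teeth, lambda t: t.cold_positive or t.electric_positive)
print(f"Against an independent reference: sens={independent[0]:.2f}, spec={independent[1]:.2f}")
print(f"Against the composite reference:  sens={composite[0]:.2f}, spec={composite[1]:.2f}")
```

Because every positive cold test automatically makes the composite reference positive, the cold test can no longer produce a false positive against it, and its specificity appears perfect in this illustration even though 5 of the 50 nonvital teeth responded to cold.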
Were the investigators who interpreted the test and reference standard blinded to the other results?
Another factor to consider when appraising the risk of bias of a diagnostic test study is whether the investigators who interpreted the results of the diagnostic test and the reference standard were blinded to the results of the other test.11 Many tests, such as radiographs or histologic confirmation, require interpretation by specialists. If the investigators who interpreted the diagnostic test or reference standard were aware of the results of the other test, they may have been influenced subconsciously when they interpreted the results of the diagnostic test or reference standard. Again, the result would be an overestimate of the accuracy of the diagnostic test.
Did investigators apply the same reference standard to all patients regardless of the results of the test under investigation?
Finally, it is important that researchers apply the same reference standard to all patients, irrespective of the results obtained with the diagnostic test.11 Researchers could overestimate the accuracy of the diagnostic test if only those patients diagnosed as “target positive” by the results of the diagnostic test undergo confirmation with the reference standard, because they would not detect patients wrongly classified as “target negative” by the results of the diagnostic test. For instance, in the example in Table 5.2,8 some patients who tested negative in the diagnostic test did not undergo the reference standard, and therefore, it is possible that some of these patients had lesions but nevertheless were classified as not having lesions. Again, such misclassification will make the test look more accurate than it really is.
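The following sketch works through this kind of misclassification with hypothetical numbers. It compares the sensitivity and specificity obtained when every patient undergoes the reference standard with those obtained when patients who test negative are not verified and are simply assumed to be free of the condition. The function names and all counts are assumptions for illustration and are not taken from the study in Table 5.2.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from the four cells of a 2 x 2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def assume_unverified_negatives_healthy(tp: int, fp: int, fn: int,
                                        tn: int) -> Tuple[int, int, int, int]:
    """Recount the table as if no test-negative patient had been verified.

    Every patient with a negative test result is assumed not to have the
    condition, so true false negatives are miscounted as true negatives.
    """
    return tp, fp, 0, tn + fn


# Hypothetical counts that would be observed if all 100 patients were verified.
full = dict(tp=40, fp=10, fn=10, tn=40)
partial = dict(zip(("tp", "fp", "fn", "tn"),
                   assume_unverified_negatives_healthy(**full)))

full_sens, full_spec = sensitivity_specificity(**full)
part_sens, part_spec = sensitivity_specificity(**partial)
print(f"All patients verified:        sens={full_sens:.2f}, spec={full_spec:.2f}")
print(f"Only test positives verified: sens={part_sens:.2f}, spec={part_spec:.2f}")
```

In this illustration, the 10 missed lesions disappear from the analysis, so sensitivity rises from 0.80 to 1.00 and specificity from 0.80 to about 0.83, although the test itself has not changed.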
Box 5.3. Your Assessment of the Risk of Bias of the Study You Identified
With respect to whether patients presented a diagnostic dilemma in the study you reviewed,8 it seems likely that only patients whose teeth showed signs of possible carious lesions were included in the study. With respect to applying the reference standard, even though the investigators used the best method (that is, enameloplasty and direct observation of the carious lesion), they did not apply this method to all patients, but rather only to those patients in whom they strongly suspected the diagnosis. It is also not clear whether the investigators who performed the test were blinded to the results of the reference standard. Even though you determined that, for ethical reasons, it could have been appropriate not to apply the reference standard to all patients, this omission may have led to an overestimation of the performance of the diagnostic test (see the Supplemental Table8 at the end of this chapter for more details).