Diagnostic Test: Validity Appraisal


Department of Neurology, Neurosciences Centre, and Clinical Epidemiology Unit, All India Institute of Medical Sciences, New Delhi, Delhi, India
 
Abstract
Validity is the extent to which the data are free from bias. Bias can occur in the selection of the sample or in measurement; appraising validity therefore means asking whether selection bias and measurement bias were avoided.

Validity Assessment (Is the Information Valid?)

Validity is the extent to which the data are free from bias. The bias can occur in the selection of sample or measurement; therefore, you must ask whether:

1. Selection bias was avoided (Was the sample selection appropriate?)
   (a) Appropriate spectrum of patients in whom there is a need for a new diagnostic test (or new diagnostic approach)?
   (b) Selected in an unbiased way (e.g. consecutive cases)?

2. Measurement bias was avoided:
   (a) Was there a comparison with an appropriate gold standard?
   (b) Blinded measurement: were those doing or reporting the ‘gold standard’ unaware of the test result and vice versa?
   (c) No missing data: did everyone who got the test also have the gold standard [no verification bias]?
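To see why item 2(c) matters, the following minimal sketch compares accuracy estimates from a fully verified cohort with estimates from a cohort in which only some test-negative patients received the gold standard. All counts, the one-in-ten verification rate, and the function name `sens_spec` are hypothetical, chosen only for illustration.

```python
def sens_spec(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 table counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort in which EVERY patient received the gold standard.
full = {"tp": 80, "fn": 20, "tn": 810, "fp": 90}

# Assumed partial verification: every test-positive patient gets the gold
# standard, but only one in ten test-negative patients does.
verified = {
    "tp": full["tp"],
    "fp": full["fp"],
    "fn": full["fn"] // 10,
    "tn": full["tn"] // 10,
}

true_sens, true_spec = sens_spec(**full)
seen_sens, seen_spec = sens_spec(**verified)

print(f"fully verified:     sensitivity {true_sens:.2f}, specificity {true_spec:.2f}")
print(f"partially verified: sensitivity {seen_sens:.2f}, specificity {seen_spec:.2f}")
```

In this made-up cohort, analysing only the verified patients makes sensitivity look higher and specificity look lower than in the full cohort, because most of the unverified patients are test negatives.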
 
 

Q.1. Was the Sample Selection Appropriate?

(a) Appropriate spectrum of patients in whom there is a need for a new diagnostic test (or approach)? You need a new diagnostic test to distinguish a disease (early as well as late) from other diseases with similar symptoms.

(b) Selected in an unbiased way: consecutive patients fulfilling the entry criteria, with symptoms and signs common to both cases and non-cases, ought to be included in the study.
 
1A. Why do we ask this question?
Many studies include only florid cases and asymptomatic volunteers. Such studies cannot tell you whether the test is useful, because you probably do not need a test to distinguish florid cases from normal people. Do you need a test to distinguish morbid obesity from normal weight? Probably not; the difference will be obvious to your eyes. What you need to know is how the test performs in patients with diseases commonly confused with the disease you want to diagnose (also called the ‘target disorder’). Studies of florid cases and normal controls can, however, tell you whether the test is useless: if the test cannot perform as well as your eyeballs, then it obviously is. Investigators often carry out such studies at some stage of the test's development, but such studies cannot tell you how the test performs in clinical practice.
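The point can be illustrated with a small sketch that computes accuracy for the same test twice: once in a study of florid cases versus healthy volunteers, and once in a clinically relevant spectrum. Every count below is hypothetical and chosen only to show how the estimates can shift; the group names are assumptions, not data from any study.

```python
# Hypothetical results of one test in four groups: (count with the
# "correct" result, group size). For cases the correct result is a
# positive test; for non-cases it is a negative test.
florid_cases = (95, 100)         # obvious, advanced disease
early_cases = (60, 100)          # early or mild disease
healthy_volunteers = (98, 100)   # asymptomatic normal controls
confusable_disease = (75, 100)   # other diseases with similar symptoms


def proportion(*groups):
    """Pool (correct, total) pairs and return the overall proportion correct."""
    correct = sum(c for c, _ in groups)
    total = sum(n for _, n in groups)
    return correct / total


# Study A: florid cases versus healthy volunteers.
sens_a = proportion(florid_cases)
spec_a = proportion(healthy_volunteers)

# Study B: the spectrum in which the test would actually be used.
sens_b = proportion(florid_cases, early_cases)
spec_b = proportion(confusable_disease)

print(f"Study A: sensitivity {sens_a:.2f}, specificity {spec_a:.2f}")
print(f"Study B: sensitivity {sens_b:.2f}, specificity {spec_b:.2f}")
```

With these invented numbers, Study A reports a far better test than Study B, although the test itself is unchanged; only the patients differ.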
1B. How do we answer the question?
Read the methods section of the paper to find out what criteria were used to include patients in the study. Determine whether the patients so included represent (a) the spectrum of disease for which a new test is needed and (b) the diseases commonly confused with the disease to be diagnosed (the target disorder).
1C. How do we interpret the answer?
If there was only one set of eligibility (entry) criteria and it covered both cases and non-cases, then the patients are likely to have commonly confused diseases. Sometimes authors do not mention entry criteria but present the criteria for the final diagnoses of cases and non-cases. In such papers, you have to decide whether, in clinical practice, cases and non-cases are confused with one another and whether a test is needed to distinguish them.

Q.2. Was There a Comparison with an Appropriate Gold Standard?

An appropriate gold standard is one that is ‘error-free’ and independent (distinct or separate) from the test under evaluation.
2A. Why do we ask this question?
The only way to evaluate the correctness of the results of the test under evaluation is to compare them with something that is never wrong. This something is called the reference standard or gold standard. A gold standard is never false positive or false negative: it is 100% sensitive and 100% specific. However, such an ideal ‘gold standard’ is hardly ever available, and you may have to accept something less than ideal as a reasonable gold standard. The purpose of the gold standard is to tell you the truth: did the patients have the disease or not when the test was performed? Sometimes authors use more than one ‘gold standard’ to establish whether the patients had the disease at the time of testing. For example, if you are evaluating exercise electrocardiography, you may use coronary angiography for exercise test positives and only ‘follow-up’ for exercise test negatives. If, over 5 years of follow-up, they do not develop any symptom suggestive of coronary artery disease, you may conclude that they did not have the disease at the time of the exercise test and hence that the test result was a true negative.
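Sensitivity and specificity are calculated by tabulating the test result against the gold standard result for each patient. The sketch below shows the arithmetic; the class name `TwoByTwo`, its field names, and the counts are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of test results cross-tabulated against the gold standard."""
    tp: int  # test positive, gold standard positive
    fp: int  # test positive, gold standard negative
    fn: int  # test negative, gold standard positive
    tn: int  # test negative, gold standard negative


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of gold-standard positives that the test calls positive."""
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Proportion of gold-standard negatives that the test calls negative."""
    return t.tn / (t.tn + t.fp)


# Hypothetical counts for a test under evaluation.
table = TwoByTwo(tp=80, fp=10, fn=20, tn=90)

print(f"sensitivity: {sensitivity(table):.2f}")  # 0.80
print(f"specificity: {specificity(table):.2f}")  # 0.90
```

An ideal gold standard compared with the truth would have `fp = 0` and `fn = 0`, so both functions would return 1.0.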
One way the gold standard may go wrong is if the test is a part of it. For example, if you are evaluating cardiac enzymes for diagnosing myocardial infarction and you use the WHO criteria as the gold standard, which include cardiac enzymes as a component, then there is a problem. Even when the cardiac enzymes are wrong, you may take them as right because the WHO criteria include them. Such a gold standard is not ‘independent’ of, or distinct (separate) from, the test. Whenever this happens, you get overoptimistic results for the test. Therefore, you need to ask whether the gold standard is independent of the test. An example of an independent gold standard would be a technetium scan for acute myocardial infarction, because it does not depend on cardiac enzymes for the diagnosis of MI. Similarly, in a comparison of CT and MR for brain lesions, MR is independent of CT.
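A minimal sketch of this effect follows. It assumes a deliberately simplified composite reference that counts a patient as diseased if either the test or an independent reference is positive; this rule stands in for any gold standard that incorporates the test and is not the actual WHO criteria. The patient counts are hypothetical.

```python
def accuracy(pairs):
    """Return (sensitivity, specificity) for (test, reference) boolean pairs."""
    tp = sum(1 for t, r in pairs if t and r)
    fn = sum(1 for t, r in pairs if not t and r)
    tn = sum(1 for t, r in pairs if not t and not r)
    fp = sum(1 for t, r in pairs if t and not r)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patients as (test positive?, independent reference positive?).
patients = (
    [(True, True)] * 40      # test right: disease present
    + [(True, False)] * 10   # test wrong: false positive
    + [(False, True)] * 10   # test wrong: false negative
    + [(False, False)] * 40  # test right: disease absent
)

# Test judged against the independent reference.
sens_indep, spec_indep = accuracy(patients)

# Test judged against a composite reference that includes the test itself
# (assumed rule: composite is positive if test OR independent reference is).
composite = [(t, t or r) for t, r in patients]
sens_comp, spec_comp = accuracy(composite)

print(f"independent reference: sensitivity {sens_indep:.2f}, specificity {spec_indep:.2f}")
print(f"composite reference:   sensitivity {sens_comp:.2f}, specificity {spec_comp:.2f}")
```

Under this assumed rule the ten false positives are reclassified as true positives, so the test appears perfectly specific and more sensitive than it is against the independent reference.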
2B. How do we answer the question?

Posted Oct 18, 2015, in General Dentistry