What Is Certainty of the Evidence, and Why Is It Important to Dental Practitioners?

Chapter 14. What Is Certainty of the Evidence, and Why Is It Important to Dental Practitioners?

Alonso Carrasco-Labra, D.D.S., M.Sc., Ph.D.; Olivia Urquhart, M.P.H.; Malavika P. Tampi, M.P.H.; Lauren Pilcher, M.S.P.H.; Jeff Huber, M.B.A.; Anita Aminoshariae, D.D.S., M.S.; Douglas Young, D.D.S., Ed.D., M.S., M.B.A.; Satish S. Kumar, D.M.D., M.D.Sc., M.S.; Carlos Flores-Mir, D.D.S., M.Sc., D.Sc.; and Gordon H. Guyatt, M.D., M.Sc.


Dental practitioners often encounter clinical recommendations addressing the potential effects of a number of interventions. It is important for clinicians to be aware of not only the best estimates of anticipated net benefits, but how much trust they can place in those estimates.1 The need for an explicit assessment of the trustworthiness of the evidence supporting recommendations became apparent in the early 2000s. In fact, by 2002, more than 100 systems to evaluate the quality of clinical research were available.2 This large number of frameworks created confusion, inconsistencies, and frustration for practicing clinicians and patients making health care decisions.3,4

To address this issue, in the early 2000s a group of methodologists and guideline developers created a common, transparent, and sensible process to assess the certainty of the evidence and grade strength of recommendations: the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach.5 GRADE is currently the most accepted approach to assess the certainty of the evidence and formulate recommendations. It has been adopted by more than 100 organizations around the world, including the World Health Organization, Cochrane, the American Academy of Pediatric Dentistry (AAPD), and the American Dental Association (ADA).

When selecting the most appropriate treatments, patients and clinicians have a variety of interventions from which they can choose. Consider, for example, choosing among different nonrestorative treatments for arresting carious lesions.6,7 The dentist’s task is to determine whether the incremental benefit expected from implementing a nonrestorative approach to manage carious lesions is worthy of the additional costs (for example, additional visits, possible progression of carious lesions to a more severe stage requiring more invasive treatment, and additional testing including dental radiographs). In doing so, a dentist considers issues such as the severity of the disease, the patient’s specific characteristics, the most important outcomes informing the decision, the underlying body of evidence informing each outcome, the balance between the desirable and undesirable consequences among all interventions, the certainty of the evidence, and the management of limited resources. Clinical practice guidelines and systematic reviews, presented in a succinct and convenient format, are frequently available to support patients and clinicians in addressing the issues presented above (see Chapters 6 and 7).

In this chapter, we summarize how GRADE can help clinicians to assess the certainty of the evidence in systematic reviews and clinical practice guidelines and inform health care decision-making.

Certainty of the Evidence

In 2016, the ADA and the AAPD published a clinical practice guideline addressing pit-and-fissure sealants on the occlusal surfaces of primary and permanent molars in children and adolescents.8 The guideline panel recommended “the use of sealants compared with nonuse in permanent molars with both sound occlusal surfaces and noncavitated occlusal carious lesions in children and adolescents.” This recommendation was graded as a strong recommendation based on moderate certainty of the evidence. What did the ADA-AAPD guideline panel mean by moderate certainty of the evidence?

In the context of recommendations from clinical practice guidelines (for example, sealants for preventing and arresting carious lesions), the certainty of the evidence reflects “the extent of [the guideline panel’s] confidence that the estimates of an effect are adequate to support a particular decision or recommendation.”9 This means that overall, considering all patient-important outcomes, the panel has moderate certainty that pit-and-fissure sealants applied to occlusal surfaces reduce the incidence of carious lesions sufficiently to outweigh potential downsides, including lack of sealant retention, costs, and other burdens.

When guideline panels need to formulate recommendations, they usually focus on various aspects of care. Questions of therapy are quite common in most guidelines; however, it is not unusual to see questions related to diagnosis,10 screening,11 and prognosis. To inform these decisions, panels identify key patient-important outcomes. The evidence supporting the effect of interventions on these outcomes comes from different types of study designs.

In GRADE, evidence from randomized controlled trials (RCTs) starts as high certainty of the evidence; however, serious or very serious limitations in the body of evidence from this type of study design can reduce certainty to moderate, low, or very low. Observational studies, on the other hand, although starting as low certainty, can, under specific circumstances, be rated up to moderate or high certainty.
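The starting points and adjustments described above can be sketched in a few lines of Python. This is a minimal illustration, not an official GRADE tool: the function name `grade_certainty`, its parameters, and the additive counting of levels are assumptions made for the example. In practice, the rating is a structured judgment rather than arithmetic.

```python
# The four GRADE certainty levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(study_design, levels_down=0, levels_up=0):
    """Return a certainty label for a body of evidence.

    study_design: "rct" (starts at high) or "observational" (starts at low).
    levels_down, levels_up: hypothetical counts of levels by which the
    body of evidence is rated down or up.
    """
    start = {"rct": 3, "observational": 1}[study_design]
    index = start - levels_down + levels_up
    # Keep the result within the range very low .. high.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]
```

For instance, `grade_certainty("rct", levels_down=1)` returns `"moderate"`, and `grade_certainty("observational", levels_up=1)` also returns `"moderate"`.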

Criteria to Rate Down the Certainty of the Evidence

Risk of Bias

Guideline panels need to evaluate the extent to which the studies included in the guideline or systematic review may be affected by methodological issues to the point that those flaws may seriously increase the chance of misleading results.12 Different study designs may be associated with different types of bias. Some of the most common limitations in RCTs are 1) an inappropriate randomization strategy, 2) a lack or poor implementation of allocation concealment, 3) a lack of blinding, 4) the presence of important missing participant data, and 5) selective outcome reporting (see Chapters 3 and 13). For observational studies, issues related to eligibility criteria, flawed measurement of exposure and outcomes, inappropriate control for confounding, and incomplete follow-up are among the most common limitations (see Chapters 4, 5, and 13). When using GRADE, panels and reviewers first assess the risk of bias at an individual study level across outcomes and then consolidate the assessment into one judgment that reflects to what extent the methodological flaws across different studies (that is, the body of evidence) informing an outcome were serious enough to merit rating down the certainty of the evidence. In the case of the 2016 ADA-AAPD sealants guideline, the certainty of the evidence for the outcome “caries incidence,” although informed by nine RCTs, was rated down from high to moderate certainty because of the risk of bias stemming from serious concerns about the implementation of allocation concealment and blinding.8,13


Inconsistency

When examining a group of studies informing a particular outcome, guideline panels or reviewers are more certain of evidence when results are similar across studies (that is, most studies tell a “similar story”) than when they differ. Differences that cannot be explained are referred to as unexplained heterogeneity in the magnitude of effect from different studies (that is, studies telling “different stories” without finding a plausible reason for those differences).14 When assessing inconsistency, there are four criteria to consider for rating down:

1. Point estimates vary widely across studies, showing dissimilar treatment effects.

2. Confidence intervals (CIs) show minimal to no overlap.

3. Heterogeneity testing proves significant, which means a low P value (<0.1) that rejects the null hypothesis that all studies in the meta-analysis have a similar magnitude of effect.

4. The I² statistic is large, which means a large proportion of variability in point estimates due to between-study differences. Usually, an “I² of less than 40% is low, 30–60% may be moderate, 50–90% may be substantial, and 75–100% is considerable.”14
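The statistic in criterion 4 is commonly derived from Cochran's Q, the test statistic behind criterion 3, as I² = max(0, (Q − df) / Q) × 100%, where df is the number of studies minus one. The sketch below is illustrative only: the function name `i_squared` and the input values are hypothetical and are not taken from any study cited in this chapter.

```python
def i_squared(q, num_studies):
    """Return I² (as a percentage) from Cochran's Q and the number of studies."""
    df = num_studies - 1  # degrees of freedom of the heterogeneity test
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical meta-analysis: Q = 10 across 5 studies (df = 4).
example = i_squared(10.0, 5)
print(f"I² = {example:.0f}%")  # prints "I² = 60%"
```

When Q is no larger than its degrees of freedom, the function returns 0, meaning the observed variability is no more than chance alone would be expected to produce.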

Figure 14.1. Meta-Analysis Comparing Antibiotic Prophylaxis versus Placebo and the Effect on Pain (Presence or Absence) on the Sixth- and Seventh-Day Post-Dental Extraction, Including Subgroup Analysis for Pre-, Post-, and Pre- and Post-Operative Administration


Source: Lodi G, Figini L, Sardella A, et al. “Antibiotics to prevent complications following tooth extractions.” Cochrane Database of Systematic Reviews 2012, Issue 11. Art. No.: CD003811. DOI: 10.1002/14651858.CD003811.pub2.

Consider Figure 14.1. This forest plot compares the effect of antibiotic prophylaxis versus placebo on pain (presence or absence) on the sixth and seventh days after dental extraction. The authors defined a priori hypotheses to explain any heterogeneity that they might find and conducted a subgroup analysis for pre-, post-, and the combination of pre- and post-operative administration of antibiotics.15 When applying the four criteria described above, we see that the point estimates vary across studies (relative risks of 0.97, 0.88, 0.36, and 0.25); there is only a small overlap of CIs; the test for heterogeneity results in a P value of 0.07, which means that chance is unlikely to explain the difference among studies; and the I² statistic is 57%, suggesting substantial heterogeneity. The subgroup analysis explored whether the antibiotic prophylaxis regimen provided in the different studies (pre-, post-, and the combination of pre- and post-operative administration) can explain the inconsistency. Unfortunately, chance easily explains the apparent differences among the subgroups in Figure 14.1 (P = 0.22), and differences among subgroups are only moderate in magnitude (I² = 34%), leaving the heterogeneity unexplained and therefore requiring rating down certainty of the evidence for inconsistency.


Imprecision

When assessing imprecision in the context of clinical practice guidelines, the primary criterion is to determine to what extent the true effects represented by the 95% CIs across a set of outcomes lie in a particular range or on one side of a threshold that may incline a guideline panel to conclude that it is worthwhile to recommend in favor of a specific intervention.9 Outcomes can be separated into those that reflect benefit of the intervention relative to a comparator and those that reflect harm. In isolation, each outcome and its 95% CI can inform the effect of a specific intervention, but only for that particular outcome. Guideline panels, clinicians, and patients interested in defining the best course of action for a given condition need to consider all outcomes simultaneously to then determine whether they can establish the net benefit (overall, after looking at both subsets of outcomes, benefits and harms).

For example, the ADA-AAPD clinical practice guideline addressing the use of pit-and-fissure sealants versus nonuse of sealants evaluated the impact of this intervention in reducing caries incidence (desirable outcome) while also addressing sealants’ lack of retention and other adverse effects (undesirable outcomes).8 The systematic review and meta-analysis informing this guideline reported a pooled estimate of the absolute reduction in caries incidence (after three years for a moderate-risk population: 30% baseline risk) of 21% in patients receiving sealants, with a 95% CI of 23–19%, when compared with patients not receiving sealants. The review also suggests that approximately one in three sealants (30%) will lose retention after 3.5 years of application.13 Sealants are known for presenting no serious adverse effects, minimal burden for patients beyond the expected regular visits for a dental check-up, and costs that seem to be worth the benefit, especially in children and adolescents.8,16–18 When we consider imprecision, we ask ourselves how small the benefit could be while the use of sealants would still outweigh the cost and burden of monitoring.

Let’s assume that a 10% reduction in caries incidence is the minimum benefit we are willing to accept in order to recommend the use of sealants (that is, if the absolute benefit is less than a reduction of 10%, we would not use sealants). The entire 95% CI (23–19%) represents benefits greater than the 10% threshold and thus excludes a benefit smaller than the threshold. We can therefore be confident that the true effect lies above the threshold: the precision of the estimate is sufficient to support a recommendation in favor of using sealants compared with not using them, and rating down because of imprecision is not warranted (Figure 14.2, Scenario 1).
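The threshold reasoning above can be expressed as a simple check of whether the 95% CI straddles the decision threshold. This is a minimal sketch: the function name `crosses_threshold` is hypothetical, the 10% threshold is the assumed value from the example, and the check looks only at the CI relative to the threshold, not at the other considerations a panel may weigh.

```python
def crosses_threshold(ci_bound_a, ci_bound_b, threshold):
    """Return True if the CI includes values on both sides of the threshold.

    All arguments are absolute risk reductions in percentage points.
    A CI that crosses the threshold suggests rating down for imprecision;
    a CI lying entirely on one side does not.
    """
    low, high = sorted((ci_bound_a, ci_bound_b))
    return low < threshold < high


# Sealants example from the text: 95% CI of 23-19%, assumed threshold of 10%.
print(crosses_threshold(23, 19, 10))  # prints False: no rating down

# Hypothetical wider CI that includes benefits smaller than the threshold.
print(crosses_threshold(5, 23, 10))   # prints True: consider rating down
```

Sorting the two bounds lets the function accept the interval in either order, since the source reports it as 23–19%.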



Figure 14.2. Use of a Threshold to Determine Precision of an Estimate to Inform a Recommendation


Posted Aug 4, 2021, in General Dentistry