A meta-analysis, a statistical combination of data from selected studies, is implemented by choosing a priori between 2 popular statistical models: the fixed-effect model and the random-effects model. The choice of the appropriate model is critical to the credibility of the results and depends on both the goals of the analysis and the assumptions of the models. In this section, we introduce these 2 models, describe how to perform a meta-analysis under each, and apply them to a real-data example from a published systematic review.
Fixed-effect model
Clinical trials aim at assessing the potential benefits and harms of the treatments of interest. We use a measure that represents the size of the benefit (or harm), if one exists, of 1 treatment over the comparator. This measure is called the effect size, and it quantifies the degree to which 1 therapy differs from the other in a particular outcome. Various absolute and relative measures of effect are used, such as the mean difference, the risk difference, the risk ratio (RR), and the odds ratio. Please refer to previous articles discussing effect size in more detail.
Under the fixed-effect model, all studies are assumed to share a common effect size, and the only reason that the observed effect size varies across studies is the within-study variation (only 1 source of variance). Therefore, under this model, all factors that might affect the effect size are assumed to be “fixed” (ie, the same) in all studies. For example, suppose we are interested in the clinical effectiveness, measured as bond failures, of plasma vs halogen curing lights. We randomly select a number of patients from a specific practice and divide them into 4 groups of various sizes. If we assume that the assignment of the patients to any of the 4 groups has no impact on the difference in bond failures, then all these groups share a common effect size (difference in the risk or odds of bond failure between the 2 curing lights), and any observed differences in the effect sizes across the groups are due to sampling variation.
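As a minimal sketch of the binary effect measures named above, the following Python function computes the risk difference, the RR, and the odds ratio from the event counts of 2 groups. The function name, argument names, and counts are hypothetical and chosen only for illustration; the sketch assumes no group has zero events or zero nonevents.

```python
def effect_measures(events_trt, n_trt, events_ctl, n_ctl):
    """Return risk difference, risk ratio, and odds ratio for a 2x2 table.

    Assumes every cell of the table is nonzero (no continuity correction).
    """
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    odds_trt = events_trt / (n_trt - events_trt)
    odds_ctl = events_ctl / (n_ctl - events_ctl)
    return {
        "risk_difference": risk_trt - risk_ctl,  # absolute measure
        "risk_ratio": risk_trt / risk_ctl,       # relative measure
        "odds_ratio": odds_trt / odds_ctl,       # relative measure
    }


# Hypothetical counts: 10 failures in 100 brackets vs 20 failures in 100.
measures = effect_measures(10, 100, 20, 100)
print(measures)
```

A ratio below 1 (or a negative risk difference) here means fewer events in the first group than in the comparator.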
Figure 1 presents a forest plot of 4 fictional studies under the fixed-effect model. The summary effect size is denoted by a diamond shape at the bottom of the forest plot. The study-specific observed effect sizes are denoted by squares, and the corresponding true effect sizes are denoted by circles. In each study, the difference between the true and the observed effect sizes is the sampling error (also known as the within-study error). Sampling error reflects differences owing to chance alone among samples drawn from the population of interest. It occurs because, mostly for practical and financial reasons, we draw only a small sample from that population. Using the entire population would mean 0 sampling error; however, this is usually impossible. To estimate the common effect size, this model mostly uses information from large studies; hence, relatively greater weight is assigned to those studies, whereas relatively less weight is assigned to small studies, largely “ignoring” the information provided by the latter. Therefore, the weighting scheme of this model is based only on the within-study variance of each study: the smaller the variance, the greater the weight. Detailed steps to perform a fixed-effect meta-analysis are provided in Supplementary Appendices 1 and 2.
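The weighting described above is commonly written as inverse-variance weighting. The notation below is an assumption introduced here for illustration, not taken from the article or its appendices: $y_i$ is the observed effect size of study $i$ (on the log scale for ratio measures such as the RR), $v_i$ is its within-study variance, and $k$ is the number of studies.

```latex
w_i = \frac{1}{v_i}, \qquad
\hat{\theta} = \frac{\sum_{i=1}^{k} w_i \, y_i}{\sum_{i=1}^{k} w_i}, \qquad
\operatorname{Var}(\hat{\theta}) = \frac{1}{\sum_{i=1}^{k} w_i}, \qquad
95\%\ \mathrm{CI} = \hat{\theta} \pm 1.96 \sqrt{\operatorname{Var}(\hat{\theta})}
```

Because large studies have small $v_i$, they receive large $w_i$ and dominate the summary estimate $\hat{\theta}$.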
Application to real data—binary outcome
Table I presents the summary binary data of the 5 studies included in a meta-analysis from a published systematic review comparing bond failures between plasma (experimental intervention) and halogen (control intervention) curing lights. The measure of interest is the RR: the risk of bond failure with plasma curing lights divided by the risk of bond failure with halogen curing lights. The pooled summary effect size is an RR of 0.90 (95% confidence interval [CI], 0.69-1.19), which is close to 1, the value of no difference. Although plasma appears to be slightly better than halogen, this difference does not reach statistical significance because the 95% CI includes 1. We conclude that the effect of the plasma curing light on bond failures does not differ statistically from the effect of the halogen curing light. The results of this worked example are illustrated in Figure 2. The individual studies (name of first author and year of publication) are presented on the left side of the forest plot. There is a horizontal line next to each study: the rectangle on the line shows the study’s estimate in relation to the solid vertical line, which represents the line of no difference (RR = 1). The size of the rectangle varies according to the sample size of each study and reflects the weight that the study contributes to the summary effect size: the larger the study, the larger its weight, and hence, the larger the rectangle. Here, the weights are expressed as percentages of the total weight. A rectangle falling on the solid vertical line of no difference indicates that the corresponding study did not favor either type of curing light. The whiskers extending from the rectangle indicate the 95% CI of the estimate of that study. Wider whiskers indicate lower precision for the estimate and vice versa. The dotted vertical line indicates the pooled estimate after combining data from all studies.
At the bottom of the forest plot, a diamond shape represents the pooled estimate and its CI. On the right side of the forest plot, the numeric estimates and 95% CIs are shown per study and overall. When the 95% CI includes 1, the result is not significant at conventional levels (P > 0.05). To simplify this example, clustering because of multiple teeth bonded within patients was not considered in this analysis. Accounting for clustering would have increased the width of the 95% CIs, and hence the uncertainty of the estimates, without changing the conclusions presented here.
Study | Plasma events | Plasma nonevents | Plasma sample size | Halogen events | Halogen nonevents | Halogen sample size |
---|---|---|---|---|---|---|
Cacciafesta, 2004 | 21 | 279 | 300 | 12 | 288 | 300 |
Manzo, 2004 | 12 | 292 | 304 | 12 | 292 | 304 |
Pettemerides, 2004 | 12 | 164 | 176 | 13 | 163 | 176 |
Russell, 2008 | 22 | 305 | 327 | 31 | 296 | 327 |
Sfondrini, 2004 | 31 | 686 | 717 | 39 | 678 | 717 |
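
The pooled estimate can be approximated from the counts in Table I with the following Python sketch. It assumes inverse-variance weighting of the log RRs with the usual large-sample variance of a log RR; the article's own computational steps are in the supplementary appendices, so this is an illustration rather than a reproduction of the authors' method. The names `STUDIES` and `fixed_effect_rr` are hypothetical, and clustering is ignored, as in the text.

```python
import math

# Table I counts: (study, plasma events, plasma n, halogen events, halogen n)
STUDIES = [
    ("Cacciafesta, 2004", 21, 300, 12, 300),
    ("Manzo, 2004", 12, 304, 12, 304),
    ("Pettemerides, 2004", 12, 176, 13, 176),
    ("Russell, 2008", 22, 327, 31, 327),
    ("Sfondrini, 2004", 31, 717, 39, 717),
]


def fixed_effect_rr(studies, z=1.96):
    """Pool risk ratios on the log scale with inverse-variance weights.

    Assumption: var(log RR) = 1/a - 1/n1 + 1/c - 1/n2, with no zero cells.
    """
    log_rrs, weights = [], []
    for _, a, n1, c, n2 in studies:
        log_rrs.append(math.log((a / n1) / (c / n2)))
        variance = 1 / a - 1 / n1 + 1 / c - 1 / n2
        weights.append(1 / variance)  # more precise studies weigh more
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_rrs)) / total
    se = math.sqrt(1 / total)
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - z * se), math.exp(pooled + z * se)),
        "weights_pct": [100 * w / total for w in weights],
    }


result = fixed_effect_rr(STUDIES)
print(f"Pooled RR: {result['rr']:.2f}")
print(f"95% CI: {result['ci'][0]:.2f} to {result['ci'][1]:.2f}")
for (name, *_), pct in zip(STUDIES, result["weights_pct"]):
    print(f"{name}: {pct:.1f}% of total weight")
```

Under these assumptions the result should land close to the reported estimate; small discrepancies in the second decimal place can arise from rounding or from a different pooling method. The percentage weights correspond to the rectangle sizes described for the forest plot.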