In the article discussing the chi-square test, I used a clinical trial scenario with the objective of assessing the clinical alignment efficiency of 2 types of wires. These wires (A and B) were used for 6 months in 2 patient groups, and the outcome recorded was binary: reaching complete alignment (success) or not reaching complete alignment (failure).
Table I shows the tabulation of alignment successes and failures for each wire after 6 months of treatment and the calculation of risks and odds of success, and risk and odds ratios of alignment success vs failure.
| Alignment | Wire type A | Wire type B | Total |
|---|---|---|---|
| Yes | a = 23 | b = 19 | 42 |
| No | c = 8 | d = 11 | 19 |
| Total | 31 | 30 | 61 |

How many aligned with wire A?

- Risk = 23/31 = 0.74
- Odds = 23/8 = 2.88

How many aligned with wire B?

- Risk = 19/30 = 0.63
- Odds = 19/11 = 1.73

Risk ratio = 0.74/0.63 = 1.17

Odds ratio (OR) = 2.88/1.73 = 1.66
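The risks, odds, and their ratios in Table I can be reproduced with a few lines of arithmetic. The sketch below (plain Python, using the cell counts a, b, c, d from the table) shows the calculations:

```python
# 2x2 table cell counts from Table I
a, b = 23, 19   # alignment: yes (wire A, wire B)
c, d = 8, 11    # alignment: no  (wire A, wire B)

risk_A = a / (a + c)   # 23/31 = 0.74
risk_B = b / (b + d)   # 19/30 = 0.63
odds_A = a / c         # 23/8  = 2.88
odds_B = b / d         # 19/11 = 1.73

risk_ratio = risk_A / risk_B   # 0.74/0.63 = 1.17
odds_ratio = odds_A / odds_B   # 2.88/1.73 = 1.66 (equivalently a*d / (b*c))

print(round(risk_ratio, 2), round(odds_ratio, 2))
```

Note that the odds ratio can also be computed directly from the cross-product of the cell counts, a*d / (b*c), which gives the same value.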
The chi-square test showed no evidence of a difference in the success of alignment after 6 months between the 2 wire groups; the P value was 0.36.
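As a rough check on that P value, the Pearson chi-square statistic for a 2 × 2 table can be computed directly. The sketch below is plain Python; it uses the shortcut that with 1 degree of freedom the square root of the statistic is standard normal under the null hypothesis, so the two-sided P value is 2[1 − Φ(√χ²)], computable via `math.erf`. No continuity correction is applied:

```python
import math

# Observed counts from Table I: rows = alignment yes/no, columns = wire A/B
obs = [[23, 19], [8, 11]]

n = sum(sum(row) for row in obs)
row_totals = [sum(row) for row in obs]
col_totals = [sum(col) for col in zip(*obs)]

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs[i][j] - expected) ** 2 / expected

# With 1 df, sqrt(chi2) ~ N(0, 1) under H0, so the two-sided P value is
# 2 * (1 - Phi(sqrt(chi2))), where Phi is the standard normal CDF
phi = 0.5 * (1 + math.erf(math.sqrt(chi2) / math.sqrt(2)))
p_value = 2 * (1 - phi)

print(round(chi2, 2), round(p_value, 2))  # P ≈ 0.36, as reported
```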
The same result can be obtained using a special type of regression analysis called logistic regression, which is used when the outcome is binary (alignment: yes/no). Remember that linear regression is used when the outcome is continuous (eg, millimeters of crowding alleviation). In logistic regression, we can get effect estimates, P values, and confidence intervals directly from the regression output. In logistic regression, the effect estimates are handled as log odds ratios (log OR, log = natural logarithm) because they have appropriate mathematical properties (they can range from −∞ to +∞). We can convert the log ORs to odds ratios (ORs), which are more interpretable, by exponentiating them (OR = exp[log OR]). Logistic regression has a similar form to the linear regression model in the sense that the components (y = a + bx) are linearly related on the logarithmic scale (when using log ORs). However, in logistic regression, the response or dependent variable y is the log odds, log(p/(1 − p)), which is called the logit:

log(p/(1 − p)) = a + b*x (equation 1)
where a is the intercept (constant), b is the regression coefficient of x, and x is the categorical predictor, with 2 levels in our example (wire A or wire B).
Specifically, in the above equation that pertains to the logistic regression model, a is the log odds of reaching alignment in patients in the control group, which we assume here is the group with wire B (reference). In the above equation, b is the log OR of reaching alignment in patients fitted with wire A vs patients fitted with wire B.
In a bit more detail, we have groups A and B; the risk (proportion) of the event is pA for the wire A group and pB for the wire B group. The odds of the event are pA/(1 − pA) for the wire A group and pB/(1 − pB) for the wire B group, and their natural logarithms are log(pA/(1 − pA)) = logit(pA) and log(pB/(1 − pB)) = logit(pB), respectively. Then the OR of the event in group A compared with group B is

OR = [pA/(1 − pA)] / [pB/(1 − pB)] (equation 2)

and the logarithm of the OR is

log([pA/(1 − pA)] / [pB/(1 − pB)]) = log(pA/(1 − pA)) − log(pB/(1 − pB)) (equation 3)
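The identity that the log OR equals the difference of the two logits can be checked numerically with the observed proportions from Table I (a plain-Python sketch; pA and pB are the observed risks of alignment for wires A and B):

```python
import math

p_A = 23 / 31   # observed risk of alignment with wire A
p_B = 19 / 30   # observed risk of alignment with wire B

# Left-hand side: log of the odds ratio
log_or = math.log((p_A / (1 - p_A)) / (p_B / (1 - p_B)))

# Right-hand side: difference of the logits
logit_diff = math.log(p_A / (1 - p_A)) - math.log(p_B / (1 - p_B))

print(abs(log_or - logit_diff) < 1e-9)   # the two sides agree
print(round(math.exp(log_or), 2))        # exponentiating recovers the OR, 1.66
```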
If we use the values 0 and 1 for wires B and A, respectively, then after appropriate substitutions in equation 1 and using equations 2 and 3, we arrive at the following:
For wire B (x = 0): log(p/(1 − p)) = a + b*x = a + b*0 = a, so a = log(pB/(1 − pB)).

For wire A (x = 1): log(p/(1 − p)) = a + b*x = a + b*1 = a + b, so b = log(p/(1 − p)) − a = log(pA/(1 − pA)) − log(pB/(1 − pB)) = log([pA/(1 − pA)] / [pB/(1 − pB)]), which is the log OR.
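To see that the logistic regression output really reduces to these quantities, the sketch below fits the two-parameter model to the 61 individual observations by Newton-Raphson maximization of the likelihood (plain Python; x = 1 for wire A, x = 0 for wire B, y = 1 for alignment, all assumptions consistent with the coding above). With a single binary predictor, the fitted intercept equals logit(pB) = log(19/11) and the fitted coefficient equals the log OR, so exponentiating it recovers the OR of 1.66:

```python
import math

# Reconstruct the 61 observations from Table I
# x: 1 = wire A, 0 = wire B; y: 1 = aligned, 0 = not aligned
data = [(1, 1)] * 23 + [(1, 0)] * 8 + [(0, 1)] * 19 + [(0, 0)] * 11

a, b = 0.0, 0.0  # intercept and coefficient (log OR), starting at zero
for _ in range(25):  # Newton-Raphson iterations
    # Gradient of the log-likelihood and (negated) 2x2 Hessian
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in data:
        p = 1 / (1 + math.exp(-(a + b * x)))  # fitted probability
        g0 += y - p
        g1 += (y - p) * x
        w = p * (1 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
    # Newton step: (a, b) += H^-1 * gradient, via the 2x2 inverse
    det = h00 * h11 - h01 * h01
    a += (h11 * g0 - h01 * g1) / det
    b += (h00 * g1 - h01 * g0) / det

print(round(a, 4))            # = log(19/11), the logit for wire B
print(round(b, 4))            # = the log OR
print(round(math.exp(b), 2))  # = the OR, 1.66
```

In a statistics package the same coefficients would come from the regression output directly; the hand-rolled fit is only meant to show that they are nothing more than the log odds and log OR of Table I.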