In the article discussing the chi-square test, I used a clinical trial scenario with the objective of assessing the clinical alignment efficiency of 2 types of wires. These wires (A and B) were used for 6 months in 2 patient groups, and the outcome recorded was binary: reaching complete alignment (success) or not reaching complete alignment (failure).
Table I shows the tabulation of alignment successes and failures for each wire after 6 months of treatment and the calculation of risks and odds of success, and risk and odds ratios of alignment success vs failure.
| Aligned? | Wire A | Wire B | Total |
| --- | --- | --- | --- |
| Yes | a = 23 | b = 19 | 42 |
| No | c = 8 | d = 11 | 19 |
| Total | 31 | 30 | 61 |

Wire A: Risk = 23/31 = 0.74; Odds = 23/8 = 2.88

Wire B: Risk = 19/30 = 0.63; Odds = 19/11 = 1.73

Risk ratio (RR) = 0.74/0.63 = 1.17

Odds ratio (OR) = 2.88/1.73 = 1.66
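As a quick sketch, the risks, odds, and ratios above can be reproduced directly from the cell counts in Table I (the variable names below are my own, matching the a, b, c, d labels in the table):

```python
# Cell counts from Table I
a, b = 23, 19   # successes (aligned): wire A, wire B
c, d = 8, 11    # failures (not aligned): wire A, wire B

risk_A = a / (a + c)   # 23/31 ≈ 0.74
risk_B = b / (b + d)   # 19/30 ≈ 0.63
odds_A = a / c         # 23/8 ≈ 2.88
odds_B = b / d         # 19/11 ≈ 1.73

rr = risk_A / risk_B   # risk ratio ≈ 1.17
or_ = odds_A / odds_B  # odds ratio ≈ 1.66

print(f"RR = {rr:.2f}, OR = {or_:.2f}")
```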
The chi-square test showed no evidence of a difference in the success of alignment after 6 months between the 2 wire groups; the P value was 0.36.
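For reference, that chi-square result can be reproduced with SciPy (a minimal sketch; `correction=False` disables the Yates continuity correction so that the P value matches the uncorrected test reported above):

```python
from scipy.stats import chi2_contingency

# 2 x 2 table from Table I: rows = aligned yes/no, columns = wire A/B
table = [[23, 19],
         [8, 11]]

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi-square = {chi2:.2f}, df = {dof}, P = {p:.2f}")  # P ≈ 0.36
```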
The same result can be obtained using a special type of regression analysis called logistic regression, which is used when the outcome is binary (alignment: yes/no). Remember that linear regression is used when the outcome is continuous (eg, millimeters of crowding alleviation). In logistic regression, we can get effect estimates, P values, and confidence intervals directly from the regression output. In logistic regression, the effect estimates are handled as log odds ratios (log OR, log = natural logarithm) because these have appropriate mathematical properties (they can range from −∞ to +∞). We can convert the log ORs to odds ratios (ORs), which are more interpretable, by exponentiating them (OR = exp[log OR]). Logistic regression takes a form similar to the linear regression model in the sense that its components (y = a + bx) are linearly related on the logarithmic scale (when using log ORs). However, in logistic regression, the response or dependent variable y is the log odds, log(p/(1−p)), which is called the logit:

$$\log\left(\frac{p}{1-p}\right) = a + bx$$

where a is the intercept (constant), b is the regression coefficient of x, and x is the categorical predictor, with 2 levels in our example (wire A or wire B).
Specifically, in the above equation that pertains to the logistic regression model, a is the log odds of reaching alignment in patients in the control group, which we assume here is the group with wire B (reference). In the above equation, b is the log OR of reaching alignment in patients fitted with wire A vs patients fitted with wire B.
In a bit more detail, we have groups A and B; the risk (proportion) of the event is pA for wire A and pB for wire B. The odds of the event are pA/(1−pA) for the wire A group and pB/(1−pB) for the wire B group, and their natural logarithms are log(pA/(1−pA)) = logit(pA) and log(pB/(1−pB)) = logit(pB), respectively. Then the OR of the event in group A compared with group B would be

$$OR = \frac{p_A/(1-p_A)}{p_B/(1-p_B)}$$

and the logarithm of the OR would be
$$\log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right) = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right)$$
For wire B (the reference, x = 0):

$$\log\left(\frac{p}{1-p}\right) = a + b \cdot 0 = a, \quad \text{so} \quad a = \log\left(\frac{p_B}{1-p_B}\right)$$

and for wire A (x = 1):

$$\log\left(\frac{p}{1-p}\right) = a + b \cdot 1 = a + b, \quad \text{so} \quad b = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right) = \log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right)$$
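Plugging the observed proportions from Table I into these equations confirms the algebra numerically (a quick check; here pA and pB are taken as the observed risks):

```python
from math import log, exp

p_A = 23 / 31   # observed risk of alignment, wire A
p_B = 19 / 30   # observed risk of alignment, wire B

logit_A = log(p_A / (1 - p_A))
logit_B = log(p_B / (1 - p_B))

a = logit_B            # intercept: log odds of alignment with wire B (x = 0)
b = logit_A - logit_B  # coefficient: log OR of wire A vs wire B (x = 1)

print(f"a = {a:.3f}, b = {b:.3f}, exp(b) = {exp(b):.2f}")  # exp(b) ≈ 1.66
```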