Understanding output from logistic regression

1. Understanding output from logistic regression

In the previous sections, you learned about understanding the results from a Poisson regression. In this section, you will learn about understanding the results from a logistic regression.

2. Communicating results from logistic regression?

Here's a review of how to examine the results from the two other models that we have covered. For a linear model, we simply multiply slopes by predictors and add the intercept to get expected values. For a Poisson regression, we do the same on the log scale and then take the exponential to convert the results, so the effects multiply rather than add. Despite this extra step, the resulting expected values can be understood in much the same way as those from a linear regression. But what do we do with the results from a logistic regression?
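As a quick sketch of that review, the snippet below computes expected values for a linear model and for a Poisson model with a log link. The coefficient values and the predictor value are hypothetical, chosen only for illustration.

```r
# Hypothetical coefficients and predictor value (not from any fitted model)
intercept <- 1.0
slope <- 0.5
x <- 2

# Linear model: slope times predictor, plus the intercept
linear_expected <- intercept + slope * x

# Poisson model (log link): exponentiate the linear predictor
poisson_expected <- exp(intercept + slope * x)

# Equivalent view: on the response scale the effects multiply
poisson_multiplicative <- exp(intercept) * exp(slope)^x

print(linear_expected)
print(poisson_expected)
print(poisson_multiplicative)
```

The last two values agree, which is why exponentiated Poisson coefficients can be read as multiplicative effects.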

3. Odds-ratios

Unlike the Poisson regression, we cannot transform the results from a logistic regression to an easy-to-understand linear term. However, we can convert the coefficients to something called odds-ratios. These are used to compare the relative odds of two events occurring.

4. Example odds-ratios

For example, if we had an unfair coin, we might compare the odds of heads to the odds of tails. If we expected to get 3 heads for every 1 tails, we would have 3-to-1 odds. This would be an odds-ratio of 3.0 relative to a fair coin, whose odds are 1-to-1. Odds are often used in sports and gambling, where the odds against winning are often reported. For example, the underdog team might have 10-to-1 odds against winning. Odds-ratios are also commonly used in the medical literature, as you will see in an example in a few slides.
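The coin example can be worked through in a few lines of R. This is only a sketch: the fair coin used as the comparison is an assumption added here for illustration.

```r
# Unfair coin: 3 heads expected for every 1 tails
p_heads <- 3 / (3 + 1)

# Odds are the probability of the event divided by the probability of no event
odds_unfair <- p_heads / (1 - p_heads)

# Assumed comparison: a fair coin, with a 0.5 probability of heads
p_fair <- 0.5
odds_fair <- p_fair / (1 - p_fair)

# The odds-ratio compares the two odds
odds_ratio <- odds_unfair / odds_fair

print(odds_unfair)
print(odds_ratio)
```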

5. Logistic derivation of odds-ratio

Mathematically, we may derive odds-ratios. We start with the "log-odds", which is abbreviated as logit and is the link function. The log odds are the natural log of the probability of an event occurring divided by 1 minus the same probability. This is set equal to our linear regression terms. To get the odds, we take the exponential. Next, we can calculate the odds-ratios using the odds.
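Written out, these steps look roughly as follows. The symbols are assumptions for illustration: p is the probability of the event, x is a single predictor, and beta-0 and beta-1 are the intercept and slope.

```latex
% Log-odds (logit link) set equal to the linear terms
\log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x

% Taking the exponential of both sides gives the odds
\frac{p}{1 - p} = e^{\beta_0 + \beta_1 x}
```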

6. Odds-ratio for a continuous variable

For example, with a continuous variable, we take the odds at x-plus-1 divided by the odds at x, which simplifies to the exponential of beta-1. A similar derivation may be done for discrete predictors, which are modeled as intercepts.
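The simplification can be sketched as below, assuming a model with one continuous predictor x, intercept beta-0, and slope beta-1 (notation chosen here for illustration).

```latex
% Odds at x + 1 divided by odds at x
\text{OR}
  = \frac{e^{\beta_0 + \beta_1 (x + 1)}}{e^{\beta_0 + \beta_1 x}}
  = e^{\beta_0 + \beta_1 x + \beta_1 - \beta_0 - \beta_1 x}
  = e^{\beta_1}
```

The intercept and the value of x cancel, so the odds-ratio for a one-unit increase depends only on the slope.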

7. Interpretation

Understanding odds-ratios is relatively straightforward. If the odds-ratio is 1, the predictor has no effect. If the odds-ratio is less than 1, the predictor is associated with decreased odds of an event occurring. Conversely, if the odds-ratio is greater than 1, the predictor is associated with increased odds of an event occurring.
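To make these three cases concrete, here is a small sketch that exponentiates a few coefficients and labels the result. The coefficient values are hypothetical and do not come from a fitted model.

```r
# Hypothetical logistic regression coefficients on the log-odds scale
betas <- c(negative = -0.7, zero = 0, positive = 0.7)

# Exponentiate to get odds-ratios
odds_ratios <- exp(betas)

# Label each odds-ratio according to the rules above
interpretation <- ifelse(
  odds_ratios < 1, "decreased odds",
  ifelse(odds_ratios > 1, "increased odds", "no effect")
)

print(odds_ratios)
print(interpretation)
```

A coefficient of zero on the log-odds scale gives an odds-ratio of exactly 1, which is why 1 is the "no effect" value.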

8. Cancer example

For example, a study by Pesch and others looked at the odds-ratios of getting cancer. In smoking versus non-smoking males, they found the odds-ratio of getting cancer to be ~103 for smokers. Thus, the odds of getting cancer were roughly 100 times higher for smokers than for non-smokers! Also, notice how the medical literature often reports confidence intervals rather than p-values. This is part of a broader trend away from p-values in some parts of science.

9. Extract from GLM

We can extract odds-ratios and their confidence intervals from GLMs using base R, much like for the Poisson GLM. First, we extract the coefficients using the coef() function and then we take the exponential. Likewise, we can do the same thing for confidence intervals using the confint() function.
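A minimal sketch of this workflow follows. The data are simulated and the names sim_data, x, and y are hypothetical, used only so the example runs on its own.

```r
# Simulated data for illustration only (not from the course)
set.seed(42)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * x))
sim_data <- data.frame(x = x, y = y)

# Fit a logistic regression
model <- glm(y ~ x, family = "binomial", data = sim_data)

# Coefficients are on the log-odds scale; exponentiate to get odds-ratios
odds_ratios <- exp(coef(model))

# Do the same for the confidence intervals
odds_ratio_ci <- exp(confint(model))

print(odds_ratios)
print(odds_ratio_ci)
```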

10. Tidyverse

Using the tidy() function in the broom package, we can quickly and easily get both the exponentiated coefficients and the confidence intervals. We set both exponentiate and conf.int to TRUE.
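The same result with broom might look like the sketch below. It assumes the broom package is installed, and it again uses simulated data with hypothetical names (sim_data, x, y).

```r
library(broom)

# Simulated data for illustration only (not from the course)
set.seed(42)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * x))
sim_data <- data.frame(x = x, y = y)

# Fit a logistic regression
model <- glm(y ~ x, family = "binomial", data = sim_data)

# One call returns odds-ratios (estimate) and their
# confidence limits (conf.low, conf.high) in a data frame
or_table <- tidy(model, exponentiate = TRUE, conf.int = TRUE)

print(or_table)
```

The estimate column holds the same values as exp(coef(model)), so the two approaches can be checked against each other.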

11. Let's practice!

Now, you will get to look at odds-ratios in R!