1. Interpreting coefficients
Now that we have a fair amount of theoretical and computational knowledge, we dive into the model results, starting with the interpretation of model coefficients.
2. Model coefficients
Recall our horseshoe crab model, in which we predict the probability of a satellite given weight. The estimated coefficient for each model variable, including the intercept, is given in the coef column. What do these values represent?
3. Coefficient beta
The coefficient beta determines the rate of increase or decrease of the sigmoid curve. In the wells example, the model with arsenic has a positive estimated coefficient and an ascending sigmoid curve. In contrast, the model with the distance variable has a negative coefficient and a descending sigmoid curve.
4. Linear vs logistic
Generally, logistic regression coefficients are harder to interpret than linear ones because the model is nonlinear in the probability. The interpretation has the same form as for linear models, except that in logistic regression the coefficients are expressed in terms of the log odds. For example, using the crab data we could fit both a linear model and a logit model and write down each model formula. In the linear model, for every one-unit increase in weight the estimated probability increases by 0.32. In the logit model, for every one-unit increase in weight the log odds increase by 1.8. Clearly, the two numbers are on different scales. But what does an increase in log odds actually mean?
5. Log odds interpretation
Starting from the logistic model, assume a one-unit increase in x, shown in blue.
6. Log odds interpretation
Expanding the parentheses, we obtain the following expression. To obtain the odds, we take the exponential and rearrange the terms. Finally, we see that the odds are multiplied by the exponential of the coefficient. Let's see this on our crab example.
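For reference, the steps just described can be written out as follows. The symbols \beta_0 (intercept) and \beta_1 (coefficient of x) are notation assumed here for illustration; the slides may label them differently.

```latex
\begin{aligned}
\log\bigl(\mathrm{odds}(x)\bigr)   &= \beta_0 + \beta_1 x \\
\log\bigl(\mathrm{odds}(x+1)\bigr) &= \beta_0 + \beta_1 (x + 1) = \beta_0 + \beta_1 x + \beta_1 \\
\mathrm{odds}(x+1) &= e^{\beta_0 + \beta_1 x}\, e^{\beta_1} = \mathrm{odds}(x)\cdot e^{\beta_1}
\end{aligned}
```

So a one-unit increase in x multiplies the odds by e^{\beta_1}, whatever the starting value of x.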
7. Log odds interpretation
Given the weight coefficient of 1.815, the odds of a satellite are multiplied by exp(1.815), about 6.14, for every one-unit increase in weight.
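As a quick check of that figure, the multiplier can be computed directly. This is a minimal sketch that hard-codes the coefficient quoted above rather than reading it from a fitted model; the variable names are illustrative.

```python
import math

# Weight coefficient quoted in the narration (hard-coded, not read from a fitted model).
beta_weight = 1.815

# A one-unit increase in weight multiplies the odds by exp(beta).
odds_multiplier = math.exp(beta_weight)

print(round(odds_multiplier, 2))  # prints 6.14
```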
8. Log odds interpretation
The intercept value of -3.6947 gives the baseline log odds, that is, the log odds when weight equals zero. Sometimes it is more natural to think in terms of probabilities than log odds.
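To make the baseline concrete, the intercept can be converted from log odds to odds and then to a probability. This is a small sketch using the intercept quoted above; the variable names are assumptions made for illustration.

```python
import math

# Intercept quoted in the narration: the log odds of a satellite at weight = 0.
intercept = -3.6947

# Log odds -> odds -> probability.
baseline_odds = math.exp(intercept)
baseline_prob = baseline_odds / (1.0 + baseline_odds)

print(f"baseline odds: {baseline_odds:.4f}")
print(f"baseline probability: {baseline_prob:.4f}")
```

Both values come out small, roughly 0.025 for the odds and 0.024 for the probability. A weight of zero is outside any realistic range, so the baseline is mainly a reference point.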
9. Probability vs logistic fit
We will briefly revisit the study example from the last video to illustrate how the estimated probability changes as hours of study change. The curve is nonlinear, so the rate of change in probability per one-unit increase in hours of study depends on the value of hours.
10. Probability vs logistic fit
Starting from 1 or 2 hours, or from 6 or 7 hours, and increasing by one unit does not change the estimated probability by much. However,
11. Probability vs logistic fit
starting from 4 or 4.5 hours, a one-unit increase raises the estimated probability substantially. To compute this rate of change at a particular value of x, we compute the slope of the tangent line to the curve at that value. For a model with coefficient beta, this slope equals beta times the probability times one minus the probability.
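Written as a formula, with \mu(x) denoting the estimated probability and \beta_0 an assumed symbol for the intercept, the tangent slope follows from differentiating the sigmoid:

```latex
\mu(x) = \frac{1}{1 + e^{-(\beta_0 + \beta x)}}
\qquad\Longrightarrow\qquad
\frac{d\mu}{dx} = \beta\,\mu(x)\,\bigl(1 - \mu(x)\bigr)
```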
12. Probability vs logistic fit
Generally, the steepest slope occurs at the point where the probability mu equals 0.5.
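The sketch below illustrates this numerically. The coefficients are hypothetical values chosen so that the probability is exactly 0.5 at x = 4; they are not the fitted values from the study example, and the function names are made up for this illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def slope(beta0, beta1, x):
    """Tangent slope of the fitted curve at x: beta1 * p * (1 - p)."""
    p = sigmoid(beta0 + beta1 * x)
    return beta1 * p * (1.0 - p)

# Hypothetical coefficients (assumed for illustration only).
beta0, beta1 = -4.0, 1.0

# Evaluate the slope on a grid of x values from 0 to 8 in steps of 0.5.
xs = [0.5 * i for i in range(17)]
slopes = [slope(beta0, beta1, x) for x in xs]

# Locate the steepest point and the probability there.
steepest_x = xs[slopes.index(max(slopes))]
prob_at_steepest = sigmoid(beta0 + beta1 * steepest_x)

print(steepest_x, prob_at_steepest, max(slopes))
```

With these assumed coefficients the steepest point is x = 4, where the probability is 0.5 and the slope equals beta1 / 4, the largest value that beta1 * p * (1 - p) can take.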
13. Compute change in estimated probability
Considering the horseshoe crab model, we compute the rate of change in the estimated probability for a one-unit change in weight. First, we choose the value of weight at which to make the computation and extract the model coefficients. Then, using the probability formula from our logit model, we compute the estimated probability. Finally, we compute the incremental rate of change of the estimated probability. Hence, at a weight of 1.5 kg, the estimated probability of a satellite is increasing at a rate of about 0.36 per additional kg, that is, roughly 36 percentage points per unit of weight as a tangent-line approximation.
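These steps can be sketched as follows. The intercept and weight coefficient are hard-coded from the values quoted earlier instead of being extracted from a fitted model object, and the variable names are assumptions made for illustration.

```python
import math

# Step 1: choose the weight and set the coefficients
# (hard-coded from the values quoted earlier, not extracted from a model object).
weight = 1.5
intercept, beta_weight = -3.6947, 1.815

# Step 2: estimated probability from the logit model.
log_odds = intercept + beta_weight * weight
prob = 1.0 / (1.0 + math.exp(-log_odds))

# Step 3: rate of change of the probability at this weight: beta * p * (1 - p).
rate = beta_weight * prob * (1.0 - prob)

print(f"estimated probability: {prob:.2f}")
print(f"rate of change: {rate:.2f}")
```

At weight 1.5 this gives an estimated probability of about 0.27 and a rate of change of about 0.36.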
14. Rate of change in probability for every x
Continuing the computation for other values of x, we obtain a figure like this, from which we can see that the largest increase in the estimated probability occurs around the median value of x.
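A rough way to reproduce the numbers behind such a figure is to repeat the computation over a grid of weights. The grid from 1 to 5 in steps of 0.25 is an assumption made for illustration, and the coefficients are hard-coded from the values quoted earlier.

```python
import math

# Coefficients quoted earlier (hard-coded for this sketch).
intercept, beta_weight = -3.6947, 1.815

def rate_of_change(weight):
    """Tangent slope of the estimated probability at the given weight."""
    prob = 1.0 / (1.0 + math.exp(-(intercept + beta_weight * weight)))
    return beta_weight * prob * (1.0 - prob)

# Assumed grid of weights: 1.0, 1.25, ..., 5.0.
weights = [1.0 + 0.25 * i for i in range(17)]
rates = [rate_of_change(w) for w in weights]

# Grid point with the largest rate of change.
best_weight = weights[rates.index(max(rates))]

for w, r in zip(weights, rates):
    print(f"weight {w:.2f}: rate {r:.3f}")
print("largest rate at weight", best_weight)
```

On this grid the largest rate falls at weight 2.0, the grid point closest to where the fitted probability is 0.5 (a weight of about 2.04 under the quoted coefficients).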
15. Let's practice!
Now let's practice!