Identify and interpret churn drivers

1. Identify and interpret churn drivers

Great job! Let's now use these models to extract insights into the main drivers of churn.

2. Plotting decision tree rules

Decision trees are neat because they are just a list of nested if-else rules that can be plotted. Here's an example of a 3-level decision tree. To plot it, we import the tree module from sklearn and the graphviz module, which needs to be installed separately if you want to use it on your own machine. Then we call export_graphviz, passing the fitted decision tree object, the column names, the precision level, the class names, and whether to fill the nodes with color. Finally, we pass the exported object to graphviz's Source class and call the display function on the result.
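The plotting steps can be sketched roughly as follows. This is a minimal sketch, not the lesson's exact code: the data is synthetic, and the feature names and class labels are placeholders chosen for illustration. The graphviz import is wrapped in a try block because the package is optional.

```python
from sklearn import tree
from sklearn.datasets import make_classification

# Synthetic stand-in for the churn data (assumption: the real data is not shown here).
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
feature_names = ["tenure", "MonthlyCharges", "feature_3", "feature_4"]  # placeholder names

# Fit a 3-level decision tree.
clf = tree.DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X, y)

# Export the fitted tree as a DOT-format string.
dot_data = tree.export_graphviz(
    clf,
    out_file=None,                        # return the DOT source instead of writing a file
    feature_names=feature_names,
    precision=1,                          # decimals shown for thresholds and values
    class_names=["Not churn", "Churn"],   # placeholder class labels
    filled=True,                          # color the nodes by majority class
)

# Wrap the DOT source in a graphviz object if the package is available.
try:
    import graphviz
    graph = graphviz.Source(dot_data)
    # In a notebook, display(graph) renders the chart.
except ImportError:
    graph = None
```

Rendering the chart also requires the Graphviz system binaries, not only the Python package.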

3. Interpreting decision tree chart

The result is a good-looking decision tree visualization. You can interpret it as a set of if-else rules starting from the top. The first row in each node is the rule, and the tree then branches on whether or not that rule is met. We can see the True and False labels on the arrows flowing from the parent node to its child nodes. We can see that customer tenure is the most important variable. If the tenure is lower than 11.5, and the customer has no fiber optic Internet service, then it is very likely that the customer will churn. The tree can be built with more layers, which will give more insight into other variables driving churn.
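To see the if-else structure without a chart, the same kind of tree can be printed as text. This is a small sketch using sklearn's export_text, which the lesson itself does not use. The data and feature names are again synthetic placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data with placeholder feature names (assumptions).
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
feature_names = ["tenure", "MonthlyCharges", "feature_3", "feature_4"]

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Each indented line is a rule; lines starting with "class:" are the leaf predictions.
rules = export_text(clf, feature_names=feature_names)
print(rules)
```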

4. Logistic regression coefficients

Now, with logistic regression we get coefficients. Each coefficient can be interpreted as the change in the log-odds of churn associated with a 1-unit increase in the input feature value. For example, if the input feature is tenure in years, then increasing the tenure by one year changes the log-odds by an amount equal to the coefficient. Here's the formula outlining the model equation. What's the main challenge? Log-odds are very hard to interpret.
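The slide with the formula is not reproduced in this transcript. The standard form of the logistic regression model is shown below, where p is assumed to denote the probability of churn, x_1 to x_n the input features, and beta_0 to beta_n the fitted coefficients:

```latex
\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n
```

The left-hand side is the log-odds, so increasing x_i by one unit while holding the other features fixed adds beta_i to it.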

5. Extracting logistic regression coefficients

Before we begin with interpretation, let's see how to extract the coefficients. We can use the coef_ attribute of the fitted logistic regression instance. With that, we get an array of beta coefficients, one for each input variable.
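A minimal sketch of that extraction, assuming synthetic data in place of the churn dataset and default model settings rather than the ones used in the course:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)

logreg = LogisticRegression(max_iter=1000)
logreg.fit(X, y)

# coef_ is an attribute, not a method. For a binary target it has shape
# (1, n_features), so coef_[0] is the flat array of beta coefficients.
print(logreg.coef_)
print(logreg.coef_[0])
print(logreg.intercept_)  # the intercept (beta_0) is stored separately
```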

6. Transforming logistic regression coefficients

The challenge with the previous list is two-fold: first, the coefficient values come without names, and second, the coefficients are on the log-odds scale, which is difficult to interpret. The solution is to calculate the exponential of the coefficients, which gives us the change in the actual odds associated with a 1-unit increase in the feature value. In Python, we first build a pandas DataFrame with the column names and the coefficients, and name its columns accordingly. Once that is done, we calculate the exponential of the coefficients and store the result in a separate column. Finally, we extract the non-zero coefficients and print them sorted with the largest Coefficient values first.
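These steps might look like the sketch below. The data, the feature names, and the DataFrame column names are illustrative assumptions. The model uses default settings, so the non-zero filter is shown for completeness: with L1 regularization some coefficients can be exactly zero, whereas here all are likely to remain.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data with placeholder feature names (assumptions).
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
feature_names = ["tenure", "MonthlyCharges", "feature_3", "feature_4"]

logreg = LogisticRegression(max_iter=1000).fit(X, y)

# Pair each feature name with its coefficient.
coefficients = pd.DataFrame({
    "Feature": feature_names,
    "Coefficient": logreg.coef_[0],
})

# Exponentiate to move from the log-odds scale to the odds scale.
coefficients["Exp_Coefficient"] = np.exp(coefficients["Coefficient"])

# Keep only the non-zero coefficients.
coefficients = coefficients[coefficients["Coefficient"] != 0]

# Sort with the largest coefficients first.
sorted_coefficients = coefficients.sort_values(by="Coefficient", ascending=False)
print(sorted_coefficients)
```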

7. Meaning of transformed coefficients

Then we get this view. It does not account for statistical significance, which we will assess in the next chapter. We can see that the feature with the largest effect on the odds of churning is tenure, which is consistent with the findings from the decision tree. The exponentiated coefficients are interpreted as follows: values less than 1 decrease the odds, and values greater than 1 increase the odds. The effect is calculated by multiplying the odds by the exponential of the coefficient. So one additional year of tenure multiplies the odds of churn by 0.403, which is a decrease of 1 minus 0.403. This translates to a roughly 60% decrease in the churn odds.
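The arithmetic can be checked in a few lines. The value 0.403 is the exponentiated tenure coefficient quoted above. The starting odds of 0.5 are a hypothetical number chosen only to show the multiplication.

```python
import math

exp_coef = 0.403            # exponentiated tenure coefficient from the lesson
coef = math.log(exp_coef)   # the implied coefficient on the log-odds scale (negative)

# Percentage change in the odds for one additional unit of tenure.
pct_change = (exp_coef - 1) * 100
print(f"{pct_change:.1f}%")  # -59.7%, i.e. roughly a 60% decrease

# Hypothetical example: odds of 0.5 before the extra year of tenure.
odds_before = 0.5
odds_after = odds_before * exp_coef
print(odds_after)
```

Note that the 60% decrease applies to the odds, not to the churn probability itself.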

8. Let's practice!

Perfect, we have tackled a complex topic. Let's check our knowledge of it!