Introduction to Regression with statsmodels in Python

Exercise

Exploring the explanatory variables

When the response variable is logical (it takes only the values 0 and 1), all the points lie on the \(y=0\) and \(y=1\) lines, making it difficult to see what is happening. In the video, it wasn't clear how the explanatory variable was distributed on each line until you saw the trend line. Plotting a histogram of the explanatory variable, grouped by the response, solves this.

You will use these histograms to get to know the financial services churn dataset seen in the video.

churn is available as a pandas DataFrame.

Instructions 1/2

  • 1

    In a sns.displot() call on the churn data, plot time_since_last_purchase as two histograms, split for each has_churned value.

  • 2

    Redraw the histograms using the time_since_first_purchase column, split for each has_churned value.