LoslegenKostenlos loslegen

Timeline violation

To illustrate the importance of the timeline, consider an example where you violate the timeline and use information from the target period to construct the predictive variables.

There are two columns in the pandas dataframe basetable: "amount_2017" is the total amount of donations in 2017, and "target" is 1 if this amount is larger than 30 and 0 else.

Construct a logistic regression model that uses "amount_2017" as single predictive variable to predict the target, and calculate the AUC.

Diese Übung ist Teil des Kurses

Intermediate Predictive Analytics in Python

Kurs anzeigen

Anleitung zur Übung

  • Create a dataframe X that contains the predictive variable and a dataframe y that contains the target.
  • Fit the logistic regression model such that y is predicted from X. Construct a logistic regression model that uses amount_2017 as single predictive variable and predicts target.
  • Make predictions for the objects in X.
  • Calculate and print the AUC of this model using the function roc_auc_score.

Interaktive Übung

Versuche dich an dieser Übung, indem du diesen Beispielcode vervollständigst.

# Select the relevant predictors and the target
X = basetable[["____"]]
y = basetable[["____"]]

# Build the logistic regression model
logreg = linear_model.LogisticRegression()
logreg.____(____, ____)

# Make predictions for X
predictions = logreg.____(____)[:,1]

# Calculate and print the AUC value
auc = ____(____, ____)
print(round(auc, 2))
Code bearbeiten und ausführen