Using different sets of variables
Adding more variables, and therefore more complexity, to your logistic regression model does not automatically make it more accurate. In this exercise you verify whether adding 3 variables to a model leads to a higher AUC.
variables_1 and variables_2 are available in your environment: you can print them to the console to explore what they look like.
This exercise is part of the course
Introduction to Predictive Analytics in Python
Exercise instructions
- Fit the logreg model using variables_2, which contains 3 additional variables compared to variables_1.
- Make predictions for this model.
- Calculate the AUC of this model.
Hands-on interactive exercise
Try this exercise by completing this sample code.
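# Imports (assumed to be preloaded in the exercise environment)
from sklearn import linear_model
from sklearn.metrics import roc_auc_score
# Create appropriate DataFrames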
X_1 = basetable[variables_1]
X_2 = basetable[variables_2]
y = basetable[["target"]]
# Create the logistic regression model
logreg = linear_model.LogisticRegression()
# Make predictions using the first set of variables and assign the AUC to auc_1
logreg.fit(X_1, y)
predictions_1 = logreg.predict_proba(X_1)[:,1]
auc_1 = roc_auc_score(y, predictions_1)
# Make predictions using the second set of variables and assign the AUC to auc_2
logreg.____(____, ____)
predictions_2 = ____.____(____)[____,____]
auc_2 = ____(____, ____)
# Print auc_1 and auc_2
print(round(auc_1,2))
print(round(auc_2,2))
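The comparison above can also be tried outside the exercise environment. The following is a minimal sketch, not the course solution: it builds a synthetic stand-in for basetable in which the column names, the data, and the helper train_auc are all hypothetical, and the three extra variables are pure noise by construction. As in the exercise, the AUC is computed on the same data the model was fitted on.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for basetable (hypothetical columns, not the course data)
rng = np.random.default_rng(0)
n = 1000
basetable = pd.DataFrame({
    "age": rng.normal(50, 10, n),
    "gifts_last_year": rng.poisson(2, n),
    "noise_1": rng.normal(size=n),
    "noise_2": rng.normal(size=n),
    "noise_3": rng.normal(size=n),
})

# The target depends only on the first two columns
logit = 0.05 * (basetable["age"] - 50) + 0.4 * (basetable["gifts_last_year"] - 2)
basetable["target"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

variables_1 = ["age", "gifts_last_year"]
variables_2 = variables_1 + ["noise_1", "noise_2", "noise_3"]


def train_auc(variables):
    """Fit a logistic regression on the given columns; return its AUC on the same data."""
    X = basetable[variables]
    y = basetable["target"]
    logreg = linear_model.LogisticRegression(max_iter=1000)
    logreg.fit(X, y)
    predictions = logreg.predict_proba(X)[:, 1]
    return roc_auc_score(y, predictions)


auc_1 = train_auc(variables_1)
auc_2 = train_auc(variables_2)

# Print auc_1 and auc_2
print(round(auc_1, 2))
print(round(auc_2, 2))
```

Because the added columns carry no signal in this sketch, the two AUC values should come out close to each other, which illustrates the point that extra variables do not automatically yield a better model.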