Exercise

Building the AUC curves

The forward stepwise variable selection procedure yields an order in which variables are best added to the predictor set. To decide where to cut off the list, you can plot the train and test AUC curves. These curves show the train and test AUC of the models that use the first, the first two, the first three, … variables.
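As a rough illustration of how such curves can guide the cut-off, the sketch below picks the number of variables at which the test AUC peaks. The AUC values are made up for the example, and "take the maximum of the test curve" is only one possible rule, assumed here rather than prescribed by the exercise.

```python
# Made-up AUC values for models using the first 1, 2, ..., 5 variables.
# They are illustrative only and do not come from any real basetable.
example_auc_train = [0.62, 0.66, 0.69, 0.70, 0.71]
example_auc_test = [0.61, 0.65, 0.67, 0.66, 0.65]

# Index of the model with the highest test AUC (first one in case of ties).
best_index = max(range(len(example_auc_test)), key=example_auc_test.__getitem__)

# Models are built with 1, 2, 3, ... variables, so add one to the index.
n_variables = best_index + 1
print(n_variables)
```

In this made-up example the train AUC keeps rising while the test AUC drops after the third variable, which is the typical sign of overfitting that the curves are meant to reveal.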

In this exercise you will learn to plot these AUC curves. The function auc_train_test, which calculates the AUC values, has been implemented for you and can be used as follows:

auc_train, auc_test = auc_train_test(variables, target, train, test)

where variables is the set of variables used in the logistic regression model, target is a list with the target name, and train and test are the train and test basetables, respectively.
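The exercise does not show the body of auc_train_test. A minimal sketch of what such a function might look like is given below, assuming scikit-learn and pandas are used; the synthetic basetable and its column names (age, noise, target) are hypothetical and exist only to make the example runnable.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def auc_train_test(variables, target, train, test):
    """Assumed implementation: fit on train, return (train AUC, test AUC)."""
    X_train, y_train = train[variables], train[target].values.ravel()
    X_test, y_test = test[variables], test[target].values.ravel()

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)

    # AUC is computed from the predicted probability of the positive class.
    p_train = model.predict_proba(X_train)[:, 1]
    p_test = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_train, p_train), roc_auc_score(y_test, p_test)


# Synthetic data (made up for illustration): age is predictive, noise is not.
rng = np.random.default_rng(0)
n = 400
age = rng.normal(50, 10, n)
noise = rng.normal(0, 1, n)
target_values = (age + rng.normal(0, 10, n) > 50).astype(int)
basetable = pd.DataFrame({"age": age, "noise": noise, "target": target_values})
train, test = basetable.iloc[:300], basetable.iloc[300:]

auc_train, auc_test = auc_train_test(["age", "noise"], ["target"], train, test)
print(round(auc_train, 2), round(auc_test, 2))
```

Note that target is passed as a one-element list, matching the description above, which is why the sketch flattens the selected column with ravel() before fitting.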

The variables ordered according to the forward stepwise procedure are given in the list variables. You can explore it in the console. Additionally, three empty lists have been defined for you:

  • auc_values_train, which will contain the train AUC values of the model at each iteration
  • auc_values_test, which will contain the test AUC values of the model at each iteration
  • variables_evaluate, which will contain the variables evaluated at each iteration
Instructions
  • Iterate over the variables.
  • In each iteration, add the next variable in variables to variables_evaluate.
  • In each iteration, calculate the train and test AUC using the auc_train_test function. The dataframes train and test contain the train and test data respectively.
  • In each iteration, add the calculated values to auc_values_train and auc_values_test.
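
The steps above can be sketched as a single loop. To keep the example self-contained, auc_train_test is replaced here by a hypothetical stand-in that returns made-up numbers, and the variable names and the target name "target" are placeholders; in the exercise you would use the provided function, the provided variables list, and the real train and test dataframes.

```python
def auc_train_test(variables, target, train, test):
    # HYPOTHETICAL stand-in for the provided function: it ignores the data
    # and returns made-up AUC values that depend only on the variable count.
    k = len(variables)
    auc_train = 0.60 + 0.02 * k
    auc_test = 0.60 + 0.02 * min(k, 3) - 0.01 * max(k - 3, 0)
    return auc_train, auc_test


# Placeholder inputs (assumptions, not the exercise data).
variables = ["var_a", "var_b", "var_c", "var_d", "var_e"]
train, test = None, None

# The three empty lists described in the exercise.
auc_values_train = []
auc_values_test = []
variables_evaluate = []

# Iterate over the variables in forward stepwise order.
for v in variables:
    # Add the next variable to the set being evaluated.
    variables_evaluate.append(v)
    # Calculate the train and test AUC for the current set of variables.
    auc_train, auc_test = auc_train_test(variables_evaluate, ["target"], train, test)
    # Store the results so they can be plotted afterwards.
    auc_values_train.append(auc_train)
    auc_values_test.append(auc_test)
```

After the loop, auc_values_train and auc_values_test each hold one value per variable, so plotting them against the number of variables gives the two AUC curves.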