Develop and test the best model

In Chapter 3, you found that the following parameters give you a better model:

  • max_depth = 8,
  • min_samples_leaf = 150,
  • class_weight = "balanced"

In this chapter, you discovered that some of the features have a negligible impact. You realized that you could get accurate predictions using just a small number of selected, impactful features, so you updated your training and test sets accordingly, creating the variables features_train_selected and features_test_selected.
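
If you want a concrete picture of that selection step, a minimal sketch might look like the following; the features_train and features_test DataFrames, the fitted model used to rank features, and the 0.01 importance cutoff are all illustrative assumptions rather than the course's exact code.

# Hypothetical sketch: keep only the columns whose importance clears a cutoff
# (the fitted `model`, the DataFrames, and the 0.01 threshold are assumptions)
selected_columns = features_train.columns[model.feature_importances_ > 0.01]
features_train_selected = features_train[selected_columns]
features_test_selected = features_test[selected_columns]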

With all this information at your disposal, you're now going to develop the best model for predicting employee turnover and evaluate it using the appropriate metrics.

The features_train_selected and features_test_selected variables are available in your workspace, and the recall_score and roc_auc_score functions have been imported for you.

This exercise is part of the course HR Analytics: Predicting Employee Churn in Python.


Exercise instructions

  • Initialize the best model using the parameters provided in the description.
  • Fit the model using only the selected features from the training set.
  • Make a prediction based on the selected features from the test set.
  • Print the accuracy, recall, and ROC/AUC scores of the model.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Initialize the best model using the parameters provided in the description
model_best = DecisionTreeClassifier(____=____, ____=____, ____=____, random_state=42)

# Fit the model using only the selected features from the training set
model_best.fit(____, target_train)

# Make a prediction based on the selected features from the test set
prediction_best = model_best.____(____)

# Print the general accuracy of model_best
print(____.score(features_test_selected, target_test) * 100)

# Print the recall score of the model predictions
print(____(target_test, prediction_best) * 100)

# Print the ROC/AUC score of the model predictions
print(roc_auc_score(target_test, ____) * 100)
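
For reference, here is one possible completion of the scaffold, using the parameters from the description. It assumes DecisionTreeClassifier still needs to be imported from sklearn.tree (the workspace may already provide it) and that target_train and target_test are available alongside the selected feature sets.

# Import the classifier (the workspace may already provide this)
from sklearn.tree import DecisionTreeClassifier

# Initialize the best model using the parameters from Chapter 3
model_best = DecisionTreeClassifier(max_depth=8,
                                    min_samples_leaf=150,
                                    class_weight="balanced",
                                    random_state=42)

# Fit the model using only the selected features from the training set
model_best.fit(features_train_selected, target_train)

# Make a prediction based on the selected features from the test set
prediction_best = model_best.predict(features_test_selected)

# Print the general accuracy of model_best
print(model_best.score(features_test_selected, target_test) * 100)

# Print the recall score of the model predictions
print(recall_score(target_test, prediction_best) * 100)

# Print the ROC/AUC score of the model predictions
print(roc_auc_score(target_test, prediction_best) * 100)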