Balancing classes
Class imbalance can significantly affect prediction results, as shown by the difference between the recall and accuracy scores. To address the imbalance, the classes are usually weighted so that each one contributes equally to training. Using the class_weight argument in sklearn's DecisionTreeClassifier, one can make the classes "balanced".
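To make the "balanced" setting concrete, here is a minimal sketch of the weights sklearn derives from the class frequencies. The label array y is a hypothetical toy example (8 employees who stayed, 2 who left), not the course data.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical toy labels: 8 employees stayed (0), 2 left (1)
y = np.array([0] * 8 + [1] * 2)
classes = np.unique(y)

# Weights sklearn assigns when class_weight="balanced"
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)

# Same values by hand: n_samples / (n_classes * count_per_class)
manual = len(y) / (len(classes) * np.bincount(y))

print(dict(zip(classes.tolist(), weights.tolist())))  # {0: 0.625, 1: 2.5}
```

The rarer class receives the larger weight, so errors on it cost more during training.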
Let’s correct our model by solving its imbalance problem:
- first, you’re going to set up a model with balanced classes
- then, you will fit it to the training data
- finally, you will check its accuracy on the test set
The variables features_train, target_train, features_test and target_test are already available in your workspace.
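Outside the course workspace, the same three steps can be tried on synthetic data. The sketch below is only an illustration under stated assumptions: the dataset comes from make_classification with a roughly 80/20 class split, and the names X_train, y_train, X_test, y_test and results are hypothetical stand-ins, not the course variables. It reports both accuracy and recall so the effect of balancing can be compared.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn data: roughly 80% class 0, 20% class 1
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.8, 0.2], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

results = {}
for label, cw in [("unbalanced", None), ("balanced", "balanced")]:
    # Same pruned tree each time; only the class weighting changes
    model = DecisionTreeClassifier(max_depth=5, class_weight=cw, random_state=42)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[label] = {
        "accuracy": accuracy_score(y_test, pred) * 100,
        "recall": recall_score(y_test, pred) * 100,
    }
    print(label, results[label])
```

The exact numbers depend on the data, so none are quoted here; the point is to look at recall alongside accuracy when judging a model trained on imbalanced classes.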
This exercise is part of the course
HR Analytics: Predicting Employee Churn in Python
Exercise instructions
- Initialize the Decision Tree Classifier, prune your tree by limiting its maximum depth to 5, and balance the class weights.
- Fit the new model.
- Print the accuracy score of the prediction (in percentage points) for the test set.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Import the classifier (it may already be loaded in the course workspace)
from sklearn.tree import DecisionTreeClassifier

# Initialize the DecisionTreeClassifier
model_depth_5_b = DecisionTreeClassifier(____=5, class_weight="____", random_state=42)
# Fit the model
model_depth_5_b.____(features_train, target_train)
# Print the accuracy of the prediction (in percentage points) for the test set
print(model_depth_5_b.____(features_test, ____) * 100)