Predict churn with decision tree
Now you will build on the skills you acquired in the earlier exercise and train a more complex decision tree, with additional parameters, to predict customer churn. You will dive deep into the churn prediction problem in the next chapter. Here you will run the decision tree classifier again on your training data, predict churn on unseen (test) data, and assess model accuracy on both datasets.
The tree module from the sklearn library has been loaded for you, as well as the accuracy_score function from sklearn.metrics. The features and target variables have also been imported as train_X, train_Y for training data, and test_X, test_Y for test data.
This exercise is part of the course
Machine Learning for Marketing in Python
Exercise instructions
- Initialize a decision tree with maximum depth set to 7, using the gini criterion.
- Fit the model to the training data.
- Predict the values on the test dataset.
- Print the accuracy values for both training and test datasets.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Initialize the Decision Tree
clf = tree.DecisionTreeClassifier(max_depth = ___,
                                  criterion = 'gini',
                                  splitter = 'best')
# Fit the model to the training data
clf = clf.___(train_X, train_Y)
# Predict the values on test dataset
pred_Y = clf.___(test_X)
# Print accuracy values
print("Training accuracy: ", np.round(clf.score(train_X, train_Y), 3))
print("Test accuracy: ", np.round(___(test_Y, pred_Y), 3))
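For readers working outside the course environment, the four steps in the instructions can be sketched end to end as below. This is only an illustration under stated assumptions: the churn data is not available here, so a synthetic dataset from make_classification stands in for the preloaded train_X, train_Y, test_X and test_Y, and the sample size, feature count and random_state values are arbitrary choices rather than anything taken from the course.

```python
import numpy as np
from sklearn import tree
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic stand-in for the course's preloaded churn data.
X, Y = make_classification(n_samples=1000, n_features=10, random_state=0)
train_X, test_X, train_Y, test_Y = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# Initialize the decision tree: depth capped at 7, gini impurity criterion
clf = tree.DecisionTreeClassifier(max_depth=7,
                                  criterion='gini',
                                  splitter='best')

# Fit the model to the training data (fit returns the estimator itself)
clf = clf.fit(train_X, train_Y)

# Predict class labels on the held-out test data
pred_Y = clf.predict(test_X)

# Accuracy on both datasets: clf.score computes mean accuracy directly,
# while accuracy_score compares true labels with the predictions
train_acc = clf.score(train_X, train_Y)
test_acc = accuracy_score(test_Y, pred_Y)
print("Training accuracy: ", np.round(train_acc, 3))
print("Test accuracy: ", np.round(test_acc, 3))
```

A training accuracy noticeably higher than the test accuracy suggests the tree is overfitting, which is why the exercise reports both numbers.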