Identify optimal tree depth

Now you will tune the max_depth parameter of the decision tree to find the value that reduces overfitting while still maintaining good model performance. You will loop over several max_depth candidates, fit a decision tree for each one, and calculate its recall score on the test data.

The list depth_list, which holds the parameter candidates, has been loaded for you. The depth_tuning array has been built for you with 2 columns: the first is filled with the depth candidates, and the second is a placeholder for the recall score. The features and target variables have also been loaded as train_X and train_Y for the training data, and test_X and test_Y for the test data. The numpy and pandas libraries are loaded as np and pd respectively.

Run a for loop over the range from 0 to the length of the list depth_list.
For each depth candidate, initialize and fit a decision tree classifier and predict churn on test data.
For each depth candidate, calculate the recall score by using the recall_score() function and store it in the second column of depth_tuning.
Create a pandas DataFrame out of depth_tuning with the appropriate column names.
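
The steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: the synthetic data from make_classification stands in for the preloaded churn data, the depth candidates 2 through 14 are hypothetical, and the column names Max_Depth and Recall are assumed, since the exercise does not state them here.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the churn data the exercise preloads.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
train_X, test_X, train_Y, test_Y = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumption: hypothetical candidates; the exercise supplies its own depth_list.
depth_list = list(range(2, 15))

# Two columns: depth candidate, then a placeholder for the recall score.
depth_tuning = np.zeros((len(depth_list), 2))
depth_tuning[:, 0] = depth_list

for index in range(0, len(depth_list)):
    # Fit a tree limited to the current candidate depth.
    mytree = DecisionTreeClassifier(max_depth=depth_list[index], random_state=0)
    mytree.fit(train_X, train_Y)

    # Predict on the held-out test data and record recall.
    pred_test_Y = mytree.predict(test_X)
    depth_tuning[index, 1] = recall_score(test_Y, pred_test_Y)

# Assumption: illustrative column names.
col_names = ['Max_Depth', 'Recall']
results = pd.DataFrame(depth_tuning, columns=col_names)
print(results)
```

Reading the resulting table, a reasonable choice is the shallowest depth beyond which test recall stops improving, since deeper trees are more prone to overfitting.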
