Decision tree

In the last three chapters, you've learned a range of techniques that help you tackle many aspects of the machine learning interview. In this chapter, you'll be introduced to various ways to make sure any model you're asked to create or discuss in a machine learning interview is generalizable, evaluated correctly, and properly selected from among other possible models.

In this exercise, you will delve into hyperparameter tuning for a decision tree on the loan_data dataset. Here you'll tune min_samples_split, which is the minimum number of samples a node must contain before it can be split further, and max_depth, which is the maximum depth to which the tree is allowed to grow. The deeper a tree, the more splits it has and the more information about the training data it captures, which also makes it more prone to overfitting.

The feature matrix X and the target label y have been imported for you.

Note that you're once again performing all of the steps in the machine learning pipeline!

[Figure: Machine learning pipeline]

Step 1
- Import the correct function for a decision tree classifier and split the data into train and test sets.
- Instantiate a decision tree classifier, fit, predict, and print accuracy.
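A minimal sketch of the first step might look like the following. It assumes scikit-learn, and because the preloaded loan_data X and y are not available outside the exercise, it generates a synthetic stand-in with make_classification. The 30% test size and the random_state value are illustrative assumptions, not values given in the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): the exercise preloads X and y from loan_data.
# A synthetic binary classification problem is used here instead.
X, y = make_classification(n_samples=500, n_features=10, random_state=123)

# Hold out 30% of the rows for testing (split size and seed are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=123
)

# Instantiate with default hyperparameters, fit on the training set,
# and predict on the held-out test set.
clf = DecisionTreeClassifier(random_state=123)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Accuracy is the fraction of test rows that were classified correctly.
test_accuracy = accuracy_score(y_test, y_pred)
print("Decision tree test accuracy:", test_accuracy)
```

With the real X and y in place of the synthetic data, this untuned accuracy serves as the baseline to compare against after tuning.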

Step 2
- Import the correct function to perform cross-validated grid search.
- Instantiate a decision tree classifier and use it with the parameter grid to perform a cross-validated grid-search.
- Fit and print model evaluation metrics.
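The second step could be sketched as below, again assuming scikit-learn and a synthetic stand-in for the preloaded loan_data X and y. The candidate values in param_grid, the 5-fold setting, and the accuracy scoring are assumptions chosen for illustration; the exercise itself may use a different grid.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): replace with the preloaded loan_data X and y.
X, y = make_classification(n_samples=500, n_features=10, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=123
)

# Candidate values for the two hyperparameters being tuned (assumed grid).
param_grid = {
    "max_depth": [2, 4, 8, 12, 16],
    "min_samples_split": [2, 4, 8, 12, 16],
}

# Every combination in the grid is scored with 5-fold cross-validation
# on the training set only; the test set stays untouched until the end.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=123),
    param_grid,
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)

# Model evaluation metrics: the winning combination, its mean
# cross-validated accuracy, and the refit model's test accuracy.
best_params = grid.best_params_
best_cv_accuracy = grid.best_score_
tuned_test_accuracy = grid.score(X_test, y_test)

print("Best parameters:", best_params)
print("Best cross-validated accuracy:", best_cv_accuracy)
print("Test accuracy of the tuned tree:", tuned_test_accuracy)
```

Because GridSearchCV refits the best combination on the full training set by default, grid can be used directly for prediction afterwards.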
