A forest of decision trees
For this exercise, you'll practice using an ensemble of bootstrapped decision trees, better known as a Random Forest. As you did in the previous exercise, you'll then compare its accuracy to that of a model whose hyperparameters you've tuned with cross-validation.
This time, you'll tune an additional hyperparameter, max_features, which controls how many features the model considers at each split. When it is not set explicitly, it defaults to "auto" (in newer scikit-learn releases this option has been removed in favor of "sqrt"). Something to keep in mind for an interview: a single Decision Tree considers all features at each split by default, whereas a Random Forest typically considers only the square root of the number of features.
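This default behavior can be checked directly in scikit-learn. The following sketch (not part of the exercise; it uses a synthetic dataset from make_classification) fits a lone decision tree and a forest, then inspects the inferred max_features_ attribute on each:

```python
# Sketch: comparing the default feature behaviour of a single tree
# vs. a random forest in scikit-learn (synthetic data, 16 features).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X_demo, y_demo = make_classification(n_samples=100, n_features=16,
                                     random_state=123)

# A lone decision tree considers every feature at each split
tree = DecisionTreeClassifier(random_state=123).fit(X_demo, y_demo)
print(tree.max_features_)               # 16: all features

# A random forest samples sqrt(n_features) candidates per split
rf = RandomForestClassifier(max_features="sqrt",
                            random_state=123).fit(X_demo, y_demo)
print(rf.estimators_[0].max_features_)  # 4 == sqrt(16)
```

Restricting each split to a random feature subset is what decorrelates the trees in the forest; it is the key difference from plain bagging.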
The feature matrix X, the target label y, and train_test_split from sklearn.model_selection have been imported for you.
This exercise is part of the course Practicing Machine Learning Interview Questions in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import modules
from sklearn.ensemble import ____
from sklearn.metrics import accuracy_score
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(____, ____, test_size=0.30, random_state=123)
# Instantiate, Fit, Predict
loans_rf = ____()
loans_rf.____(____, ____)
y_pred = loans_rf.____(____)
# Evaluation metric
print("Random Forest Accuracy: {}".format(____(____,____)))