A forest of decision trees
For this exercise, you'll practice using an ensemble of bootstrapped decision trees, better known as a Random Forest. As you did in the previous exercise, you'll then compare its accuracy to that of a model whose hyperparameters you've tuned with cross-validation.
This time, you'll tune an additional hyperparameter, max_features, which controls how many features the model considers at each split. When it is not set explicitly, it defaults to "auto" (in newer scikit-learn releases this option has been removed in favor of "sqrt"). Something to keep in mind for an interview: a single Decision Tree considers all features at each split by default, whereas a Random Forest typically considers only the square root of the number of features.
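This default behavior can be checked directly in scikit-learn. The following sketch (not part of the exercise; it uses a synthetic dataset from make_classification) fits a lone decision tree and a forest, then inspects the inferred max_features_ attribute on each:

```python
# Sketch: comparing the default feature behaviour of a single tree
# vs. a random forest in scikit-learn (synthetic data, 16 features).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X_demo, y_demo = make_classification(n_samples=100, n_features=16,
                                     random_state=123)

# A lone decision tree considers every feature at each split
tree = DecisionTreeClassifier(random_state=123).fit(X_demo, y_demo)
print(tree.max_features_)               # 16: all features

# A random forest samples sqrt(n_features) candidates per split
rf = RandomForestClassifier(max_features="sqrt",
                            random_state=123).fit(X_demo, y_demo)
print(rf.estimators_[0].max_features_)  # 4 == sqrt(16)
```

Restricting each split to a random feature subset is what decorrelates the trees in the forest; it is the key difference from plain bagging.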
The feature matrix X, the target label y, and train_test_split from sklearn.model_selection have been imported for you.
This exercise is part of the course Practicing Machine Learning Interview Questions in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import modules
from sklearn.ensemble import ____
from sklearn.metrics import accuracy_score
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(____, ____, test_size=0.30, random_state=123)
# Instantiate, Fit, Predict
loans_rf = ____()
loans_rf.____(____, ____)
y_pred = loans_rf.____(____)
# Evaluation metric
print("Random Forest Accuracy: {}".format(____(____,____)))