Train your first classification tree
In this exercise you'll work with the Wisconsin Breast Cancer Dataset from the UCI machine learning repository. You'll predict whether a tumor is malignant or benign based on two features: the mean radius of the tumor (radius_mean) and its mean number of concave points (concave points_mean).
The dataset is already loaded in your workspace and is split into 80% train and 20% test. The feature matrices are assigned to X_train and X_test, and the label arrays are assigned to y_train and y_test, where class 1 corresponds to a malignant tumor and class 0 to a benign tumor. To make the results reproducible, a variable called SEED is also defined and set to 1.
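Outside the exercise workspace these variables are not predefined. The following is a minimal sketch of how a comparable setup might be built, assuming scikit-learn's bundled copy of the dataset (load_breast_cancer) rather than the course's own file. Several details are assumptions: in the bundled copy the two columns are named "mean radius" and "mean concave points", its labels use 0 for malignant and 1 for benign (so they are flipped here to match the exercise), and the stratified split is a guess at how the workspace was prepared.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

SEED = 1  # same seed value the exercise describes

# Assumption: use scikit-learn's bundled copy of the Wisconsin data
data = load_breast_cancer()
names = list(data.feature_names)

# Keep only the two features used in the exercise
cols = [names.index("mean radius"), names.index("mean concave points")]
X = data.data[:, cols]

# The bundled copy encodes malignant as 0 and benign as 1;
# flip so that 1 = malignant and 0 = benign, as in the exercise
y = 1 - data.target

# 80% train / 20% test; stratify=y is an assumption, not stated in the exercise
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

print(X_train.shape, X_test.shape)
```

Because the split here may differ from the one in the workspace, results obtained with this setup need not match the exercise's output exactly.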
This exercise is part of the course Machine Learning with Tree-Based Models in Python.
Exercise instructions
Import DecisionTreeClassifier from sklearn.tree.
Instantiate a DecisionTreeClassifier dt of maximum depth equal to 6.
Fit dt to the training set.
Predict the test set labels and assign the result to y_pred.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import DecisionTreeClassifier from sklearn.tree
from ____.____ import ____
# Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6
dt = ____(____=____, random_state=SEED)
# Fit dt to the training set
____.____(____, ____)
# Predict test set labels
y_pred = ____.____(____)
print(y_pred[0:5])
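For reference, the four steps in the instructions can be sketched end to end as below. This is one possible completion, not necessarily the course's official solution. Since the workspace variables are not available outside the exercise, the sketch rebuilds them from scikit-learn's bundled copy of the dataset; that data source, the label flip, and the stratified split are all assumptions. max_depth and random_state are standard DecisionTreeClassifier parameters.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1

# Assumed stand-in for the preloaded workspace data
data = load_breast_cancer()
names = list(data.feature_names)
cols = [names.index("mean radius"), names.index("mean concave points")]
X = data.data[:, cols]
y = 1 - data.target  # flip labels so that 1 = malignant, 0 = benign

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

# Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6
dt = DecisionTreeClassifier(max_depth=6, random_state=SEED)

# Fit dt to the training set
dt.fit(X_train, y_train)

# Predict test set labels
y_pred = dt.predict(X_test)
print(y_pred[0:5])
```

Setting max_depth caps how many successive splits the tree may make, which limits how closely it can fit the training set; random_state fixes the tie-breaking between equally good splits so that repeated runs give the same tree.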