
Train your first classification tree

In this exercise, you'll work with the Wisconsin Breast Cancer Dataset from the UCI Machine Learning Repository. You'll predict whether a tumor is malignant or benign based on two features: the mean radius of the tumor (radius_mean) and its mean number of concave points (concave points_mean).

The dataset is already loaded in your workspace and is split into 80% train and 20% test. The feature matrices are assigned to X_train and X_test, and the label arrays are assigned to y_train and y_test, where class 1 corresponds to a malignant tumor and class 0 to a benign tumor. To make the results reproducible, a variable called SEED is also defined and set to 1.
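
Outside the course workspace these variables are not predefined. The following is a minimal sketch of one way to approximate them, assuming scikit-learn's bundled copy of the dataset (load_breast_cancer) as a stand-in for the course's file. That copy names the two features "mean radius" and "mean concave points" and encodes malignant as 0, so the labels are flipped here to match the convention above. The stratified split is also an assumption; the exercise only states the 80/20 proportions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

SEED = 1  # same seed value the exercise describes

# Assumption: scikit-learn's bundled dataset stands in for the course's data.
data = load_breast_cancer()
names = list(data.feature_names)

# Keep only the two features used in the exercise.
cols = [names.index("mean radius"), names.index("mean concave points")]
X = data.data[:, cols]

# scikit-learn encodes 0 = malignant, 1 = benign; flip so that 1 = malignant.
y = 1 - data.target

# 80% train / 20% test; stratify=y is an assumption, not stated in the exercise.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

print(X_train.shape, X_test.shape)
```

The exact rows in each split may differ from the course workspace, so predictions obtained this way need not match the course's output.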

This exercise is part of the course

Machine Learning with Tree-Based Models in Python


Exercise instructions

  • Import DecisionTreeClassifier from sklearn.tree.

  • Instantiate a DecisionTreeClassifier named dt with a maximum depth of 6.

  • Fit dt to the training set.

  • Predict the test set labels and assign the result to y_pred.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import DecisionTreeClassifier from sklearn.tree
from ____.____ import ____

# Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6
dt = ____(____=____, random_state=SEED)

# Fit dt to the training set
____.____(____, ____)

# Predict test set labels
y_pred = ____.____(____)
print(y_pred[0:5])
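
For reference, here is one way the four steps could look when run end to end outside the course workspace. This is a sketch under assumptions, not the course's reference solution: the data setup uses scikit-learn's load_breast_cancer as a stand-in, with labels flipped so that 1 means malignant, and a stratified 80/20 split that the exercise does not specify. Only the DecisionTreeClassifier calls (max_depth, random_state, fit, predict) correspond to the instructions themselves.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1

# Assumed stand-in for the preloaded workspace (see the lead-in above).
data = load_breast_cancer()
names = list(data.feature_names)
cols = [names.index("mean radius"), names.index("mean concave points")]
X = data.data[:, cols]
y = 1 - data.target  # flip so that 1 = malignant, 0 = benign
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

# Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6
dt = DecisionTreeClassifier(max_depth=6, random_state=SEED)

# Fit dt to the training set
dt.fit(X_train, y_train)

# Predict test set labels
y_pred = dt.predict(X_test)
print(y_pred[0:5])
```

Setting max_depth caps how many successive splits the tree may make, which limits how closely it can fit the training data; random_state fixes the tie-breaking between equally good splits so repeated runs give the same tree.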