Aan de slagGa gratis aan de slag

Train your first classification tree

In this exercise you'll work with the Wisconsin Breast Cancer Dataset from the UCI machine learning repository. You'll predict whether a tumor is malignant or benign based on two features: the mean radius of the tumor (radius_mean) and its mean number of concave points (concave points_mean).

The dataset is already loaded in your workspace and is split into 80% train and 20% test. The feature matrices are assigned to X_train and X_test, while the arrays of labels are assigned to y_train and y_test where class 1 corresponds to a malignant tumor and class 0 corresponds to a benign tumor. To obtain reproducible results, we also defined a variable called SEED which is set to 1.

Deze oefening maakt deel uit van de cursus

Machine Learning with Tree-Based Models in Python

Cursus bekijken

Oefeninstructies

  • Import DecisionTreeClassifier from sklearn.tree.

  • Instantiate a DecisionTreeClassifier dt of maximum depth equal to 6.

  • Fit dt to the training set.

  • Predict the test set labels and assign the result to y_pred.

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Import DecisionTreeClassifier from sklearn.tree
from ____.____ import ____

# Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6
dt = ____(____=____, random_state=SEED)

# Fit dt to the training set
____.____(____, ____)

# Predict test set labels
y_pred = ____.____(____)
print(y_pred[0:5])
Code bewerken en uitvoeren