1. Learn
  2. /
  3. Courses
  4. /
  5. Machine Learning with Tree-Based Models in Python

Exercise

Train your first classification tree

In this exercise you'll work with the Wisconsin Breast Cancer Dataset from the UCI machine learning repository. You'll predict whether a tumor is malignant or benign based on two features: the mean radius of the tumor (radius_mean) and its mean number of concave points (concave points_mean).

The dataset is already loaded in your workspace and is split into 80% train and 20% test. The feature matrices are assigned to X_train and X_test, while the arrays of labels are assigned to y_train and y_test where class 1 corresponds to a malignant tumor and class 0 corresponds to a benign tumor. To obtain reproducible results, we also defined a variable called SEED which is set to 1.

Instructions

100 XP
  • Import DecisionTreeClassifier from sklearn.tree.

  • Instantiate a DecisionTreeClassifier dt of maximum depth equal to 6.

  • Fit dt to the training set.

  • Predict the test set labels and assign the result to y_pred.