1. Learn
  2. /
  3. Courses
  4. /
  5. Machine Learning with Tree-Based Models in Python

Connected

Exercise

Train your first classification tree

In this exercise you'll work with the Wisconsin Breast Cancer Dataset from the UCI machine learning repository. You'll predict whether a tumor is malignant or benign based on two features: the mean radius of the tumor (radius_mean) and its mean number of concave points (concave points_mean).

The dataset is already loaded in your workspace and is split into 80% train and 20% test. The feature matrices are assigned to X_train and X_test, while the arrays of labels are assigned to y_train and y_test where class 1 corresponds to a malignant tumor and class 0 corresponds to a benign tumor. To obtain reproducible results, we also defined a variable called SEED which is set to 1.

Instructions

100 XP
  • Import DecisionTreeClassifier from sklearn.tree.

  • Instantiate a DecisionTreeClassifier dt of maximum depth equal to 6.

  • Fit dt to the training set.

  • Predict the test set labels and assign the result to y_pred.