
Fit a decision tree

Random forests are a go-to model for predictions because they work well out of the box. But first we'll learn the building block of random forests: decision trees.

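Decision trees split the data into groups based on the values of the features. A tree starts at a root node and keeps splitting the data at each node until it reaches the leaf nodes. In a regression tree, each leaf predicts the mean target of the training samples that end up in it.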
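To make the root/leaf vocabulary concrete, here is a minimal sketch on made-up toy data (the arrays and the name `stump` are illustrative assumptions, not part of the course). It limits the tree to a single split so the printed structure shows one root node and two leaves.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Toy data (made up for illustration): one feature, two clearly separated groups
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([5.0, 5.0, 5.0, 20.0, 20.0, 20.0])

# max_depth=1 allows a single split: one root node and two leaf nodes
stump = DecisionTreeRegressor(max_depth=1, random_state=0)
stump.fit(X, y)

# Text rendering of the tree: the root's split rule, then each leaf's value
print(export_text(stump, feature_names=["x"]))

n_leaves = stump.get_n_leaves()
depth = stump.get_depth()
```

Each leaf's value is the mean target of the training samples routed to it, which is what the tree returns from `.predict()` for any new sample landing in that leaf.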

[Figure: decision tree]

We can use sklearn to fit a decision tree with DecisionTreeRegressor and .fit(features, targets).

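Without a limit on the tree's depth (or height), it will keep splitting the data until each leaf holds a single sample (or only samples with identical targets). This is the epitome of overfitting: the tree memorizes the training data instead of learning patterns that generalize. We'll learn more about overfitting in the coming chapters.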
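The effect of unlimited depth can be sketched on synthetic data. Everything below (the random data, the 150/50 split, and `max_depth=3`) is an assumption chosen for illustration, not the course's dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: a weak linear signal buried in noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + rng.normal(scale=1.0, size=200)

# Simple ordered split: first 150 rows for training, last 50 for testing
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Default arguments: no depth limit, so the tree grows until leaves are pure
deep = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
# Depth-limited tree for comparison (max_depth=3 is an arbitrary choice)
shallow = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, y_train)

deep_train = deep.score(X_train, y_train)
deep_test = deep.score(X_test, y_test)
shallow_train = shallow.score(X_train, y_train)
shallow_test = shallow.score(X_test, y_test)

print("leaves in unlimited tree:", deep.get_n_leaves())
print("unlimited: train", deep_train, "test", deep_test)
print("max_depth=3: train", shallow_train, "test", shallow_test)
```

With continuous targets like these, the unlimited tree ends up with one leaf per training sample and a perfect training score, while its test score is much lower. That gap between train and test is the signature of overfitting.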

This exercise is part of the course

<Kurs>Machine Learning for Finance in Python</Kurs>

Exercise instructions

  • Use the imported class DecisionTreeRegressor with default arguments (i.e. no arguments) to create a decision tree model called decision_tree.
  • Fit the model using train_features and train_targets, which we created earlier (and which now contain day-of-week and volume features).
  • Print the score (R² for a regressor) on train_features and train_targets, as well as on test_features and test_targets.

Interactive hands-on exercise

Try this exercise by completing this sample code.

from sklearn.tree import DecisionTreeRegressor

# Create a decision tree regression model with default arguments
decision_tree = ____

# Fit the model to the training features and targets
decision_tree.fit(____)

# Check the score on train and test
print(decision_tree.score(train_features, train_targets))
print(decision_tree.score(____))
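
One possible way to fill in the blanks is sketched below. The course creates train_features, train_targets, test_features, and test_targets in earlier exercises, so the random arrays here are only hypothetical stand-ins that make the snippet run on its own.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Stand-in data (assumption): the real arrays come from earlier course exercises
rng = np.random.default_rng(42)
features = rng.normal(size=(300, 5))
targets = 0.5 * features[:, 0] + rng.normal(scale=0.5, size=300)
train_features, test_features = features[:225], features[225:]
train_targets, test_targets = targets[:225], targets[225:]

# Create a decision tree regression model with default arguments
decision_tree = DecisionTreeRegressor()

# Fit the model to the training features and targets
decision_tree.fit(train_features, train_targets)

# Check the score (R^2) on train and test
train_score = decision_tree.score(train_features, train_targets)
test_score = decision_tree.score(test_features, test_targets)
print(train_score)
print(test_score)
```

On data like this, expect a training score at or near 1.0 and a far lower test score, since the default tree has no depth limit.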