Build a Decision Tree
Now that you've split the flights data into training and testing sets, you can use the training set to fit a Decision Tree model.
The data are available as flights_train and flights_test.
NOTE: It will take a few seconds for the model to train… please be patient!
This exercise is part of the course
Machine Learning with PySpark
Exercise instructions
- Import the class for creating a Decision Tree classifier.
- Create a classifier object and fit it to the training data.
- Make predictions for the testing data and take a look at the predictions.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import the Decision Tree Classifier class
from pyspark.ml.____ import ____
# Create a classifier object and fit to the training data
tree = ____()
tree_model = tree.____(____)
# Create predictions for the testing data and take a look at the predictions
prediction = tree_model.____(____)
prediction.select('label', 'prediction', 'probability').show(5, False)