Delayed flights with Gradient-Boosted Trees

You've previously built a classifier for flights likely to be delayed using a Decision Tree. In this exercise you'll compare a Decision Tree model to a Gradient-Boosted Trees model.

The flights data have been randomly split into flights_train and flights_test.
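As a rough intuition for why a boosted ensemble can beat a single tree, the sketch below fits one-split trees (stumps) one after another, each to the errors left by the trees before it. It is plain Python on a made-up one-feature regression problem with squared error, not the PySpark implementation (which optimises a classification loss for this task); the names `fit_stump`, `fit_boosted`, `n_trees`, and `step_size` are hypothetical.

```python
def fit_stump(xs, targets):
    """Fit a one-split tree on a single feature by minimising squared error."""
    best = None
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All feature values are identical, so predict the overall mean.
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_trees=20, step_size=0.5):
    """Add stumps sequentially, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + step_size * stump(x)
                       for p, x in zip(predictions, xs)]
    return lambda x: base + step_size * sum(s(x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up toy data, not the flights data.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 2.8, 3.1, 5.0, 5.2, 7.1, 6.9]

single = fit_stump(xs, ys)
boosted = fit_boosted(xs, ys)

single_error = mean_squared_error(single, xs, ys)
boosted_error = mean_squared_error(boosted, xs, ys)
print(single_error, boosted_error)
```

The comparison here is on training data only, so it shows the added flexibility of the ensemble rather than better generalisation; the exercise itself compares the two models on held-out test data.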

This exercise is part of the course

Machine Learning with PySpark

Exercise instructions

  • Import the classes required to create Decision Tree and Gradient-Boosted Tree classifiers.
  • Create Decision Tree and Gradient-Boosted Tree classifiers. Train on the training data.
  • Create an evaluator and calculate AUC on testing data for both classifiers. Which model performs better?
  • For the Gradient-Boosted Tree classifier, print the number of trees and the relative importance of features.
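To make the AUC comparison in the third step concrete: the area under the ROC curve can be read as the probability that a randomly chosen positive example (a delayed flight) receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way in plain Python. It illustrates the metric only and is not how PySpark computes it internally; the function name, labels, and scores are all made up.

```python
def area_under_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = delayed) and scores from two imaginary models.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]   # one delayed flight scored too low
scores_b = [0.9, 0.8, 0.75, 0.7, 0.3, 0.2]  # every delayed flight ranked on top

auc_a = area_under_roc(labels, scores_a)
auc_b = area_under_roc(labels, scores_b)
print(auc_a, auc_b)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so the model with the higher AUC on the test data is the better ranker of delayed flights.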

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Import the classes required
from pyspark.ml.____ import ____, ____
from pyspark.ml.evaluation import BinaryClassificationEvaluator

# Create model objects and train on training data
tree = ____().____(____)
gbt = ____().____(____)

# Compare AUC on testing data
evaluator = ____()
print(evaluator.____(tree.____(____)))
print(evaluator.____(gbt.____(____)))

# Find the number of trees and the relative importance of features
print(gbt.____)
print(gbt.____)