Evaluating Random Forest

In this final exercise you'll evaluate the results of cross-validation on a Random Forest model.

The following have already been created:

cv - a cross-validator which has already been fit to the training data,
evaluator - a BinaryClassificationEvaluator object, and
flights_test - the testing data.

This exercise is part of the course

Machine Learning with PySpark

Exercise instructions

Print a list of average AUC metrics across all models in the parameter grid.
Display the average AUC for the best model. This will be the largest AUC in the list.
Print an explanation of the maxDepth and featureSubsetStrategy parameters for the best model.
Display the AUC for the best model predictions on the testing data.
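
The first two instructions rely on the average metrics being listed in the same order as the parameter grid, so the position of the largest value identifies the winning combination. Below is a minimal plain-Python sketch of that idea. The numbers and the grid are made up for illustration, and `best_combination` is a hypothetical helper, not part of PySpark; with a real fitted cross-validator the two lists would come from `cv.avgMetrics` and `cv.getEstimatorParamMaps()`.

```python
# Hypothetical stand-ins for what a fitted cross-validator exposes.
# The AUC values and the grid below are invented for illustration only.
avg_metrics = [0.61, 0.64, 0.66, 0.65]
param_grid = [
    {"maxDepth": 2, "featureSubsetStrategy": "all"},
    {"maxDepth": 2, "featureSubsetStrategy": "sqrt"},
    {"maxDepth": 5, "featureSubsetStrategy": "all"},
    {"maxDepth": 5, "featureSubsetStrategy": "sqrt"},
]


def best_combination(metrics, grid):
    """Return (best metric, matching parameters), assuming larger is better."""
    # Find the position of the largest metric, then look up the grid entry
    # at the same position.
    best_index = max(range(len(metrics)), key=metrics.__getitem__)
    return metrics[best_index], grid[best_index]


best_auc, best_params = best_combination(avg_metrics, param_grid)
print(best_auc)     # the largest average AUC in the list
print(best_params)  # the parameter combination that produced it
```

Larger is better here because the metric is AUC; for an error metric the same lookup would use the smallest value instead.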

Interactive hands-on exercise

Try to solve this exercise by completing the sample code.

# Average AUC for each parameter combination in grid
print(cv.____)

# Average AUC for the best model
print(____(____))

# What's the optimal parameter value for maxDepth?
print(cv.____.explainParam('____'))
# What's the optimal parameter value for featureSubsetStrategy?
print(cv.____.____(____))

# AUC for best model on testing data
print(evaluator.____(____.____(____)))
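
For intuition about the number the evaluator reports: the area under the ROC curve can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison in plain Python. It illustrates the metric only and is not how Spark computes it internally; the `auc` function and the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = positive, 0 = negative) and predicted scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(auc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the best model is the one with the largest AUC.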