Evaluating Random Forest

In this final exercise, you'll evaluate the results of cross-validation on a Random Forest model.

The following have already been created:

  • cv - a cross-validator that has already been fit to the training data,
  • evaluator - a BinaryClassificationEvaluator object, and
  • flights_test - the testing data.

This exercise is part of the course

Machine Learning with PySpark

Exercise instructions

  • Print a list of average AUC metrics across all models in the parameter grid.
  • Display the average AUC for the best model. This will be the largest AUC in the list.
  • Print an explanation of the maxDepth and featureSubsetStrategy parameters for the best model.
  • Display the AUC for the best model predictions on the testing data.
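The second instruction relies on the list of average AUC values having one entry per parameter combination, in the same order as the grid. The plain-Python sketch below illustrates that relationship without Spark. The grid values and the AUC numbers are hypothetical placeholders chosen for illustration, not results from this exercise, and the names `grid`, `avg_metrics` and `best_params` are made up for the sketch.

```python
from itertools import product

# Hypothetical grid values, chosen only for illustration.
max_depths = [2, 5, 10]
subset_strategies = ["all", "onethird", "sqrt", "log2"]

# One dictionary per parameter combination, in grid order.
grid = [
    {"maxDepth": depth, "featureSubsetStrategy": strategy}
    for depth, strategy in product(max_depths, subset_strategies)
]

# Placeholder average AUC values (NOT real results), one per combination.
avg_metrics = [0.61, 0.63, 0.64, 0.62,
               0.66, 0.68, 0.69, 0.67,
               0.65, 0.67, 0.68, 0.66]

# The best model corresponds to the largest average AUC in the list.
best_auc = max(avg_metrics)

# Its position in the list identifies the winning parameter combination.
best_index = avg_metrics.index(best_auc)
best_params = grid[best_index]

print(best_auc)
print(best_params)
```

In the exercise itself, the fitted cross-validator already keeps track of the best model, so the index lookup is only shown here to make the correspondence between metrics and parameter combinations explicit.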

Hands-on interactive exercise

Try to solve this exercise by completing the sample code.

# Average AUC for each parameter combination in grid
print(cv.____)

# Average AUC for the best model
print(____(____))

# What's the optimal parameter value for maxDepth?
print(cv.____.explainParam('____'))
# What's the optimal parameter value for featureSubsetStrategy?
print(cv.____.____(____))

# AUC for best model on testing data
print(evaluator.____(____.____(____)))