Evaluating Random Forest
In this final exercise you'll be evaluating the results of cross-validation on a Random Forest model.
The following have already been created:
- cv: a cross-validator which has already been fit to the training data
- evaluator: a BinaryClassificationEvaluator object
- flights_test: the testing data
This exercise is part of the course
Machine Learning with PySpark
Exercise instructions
- Print a list of average AUC metrics across all models in the parameter grid.
- Display the average AUC for the best model. This will be the largest AUC in the list.
- Print an explanation of the maxDepth and featureSubsetStrategy parameters for the best model.
- Display the AUC for the best model predictions on the testing data.
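The first two instructions rely on the cross-validator reporting one average metric per parameter combination, in the same order as the grid. The following is a minimal sketch in plain Python of how the largest average AUC maps back to a combination. The grid values and AUC numbers are made up for illustration and are not results from this exercise; `avg_metrics` merely stands in for the list the fitted cross-validator would provide.

```python
from itertools import product

# Hypothetical grid: every combination of two Random Forest settings.
max_depths = [2, 5, 10]
subset_strategies = ["onethird", "sqrt", "log2"]
param_grid = [
    {"maxDepth": depth, "featureSubsetStrategy": strategy}
    for depth, strategy in product(max_depths, subset_strategies)
]

# Made-up average AUC values, one per grid entry (illustration only).
avg_metrics = [0.61, 0.62, 0.60, 0.66, 0.68, 0.65, 0.67, 0.69, 0.66]

# The best combination is the one with the largest average AUC.
best_index = max(range(len(avg_metrics)), key=avg_metrics.__getitem__)
best_auc = avg_metrics[best_index]
best_params = param_grid[best_index]

print(best_auc)
print(best_params)
```

Because a larger AUC is better, taking the maximum of the list is enough to find the best model's average score; the index is only needed to recover which parameters produced it.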
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Average AUC for each parameter combination in grid
print(cv.____)
# Average AUC for the best model
print(____(____))
# What's the optimal parameter value for maxDepth?
print(cv.____.explainParam('____'))
# What's the optimal parameter value for featureSubsetStrategy?
print(cv.____.____(____))
# AUC for best model on testing data
print(evaluator.____(____.____(____)))
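One way the quantities above might be gathered is sketched below. This is not the course's reference solution. It assumes `cv` is a fitted PySpark `CrossValidatorModel` whose estimator is the Random Forest classifier itself (not a pipeline), so that `bestModel` exposes `maxDepth` and `featureSubsetStrategy` directly, and that `evaluator` reports AUC. The function name `summarize_cross_validation` is hypothetical.

```python
def summarize_cross_validation(cv, evaluator, test_data):
    """Collect the quantities the exercise asks for.

    Assumption: cv is a fitted CrossValidatorModel and evaluator is a
    BinaryClassificationEvaluator, so the metric is an AUC.
    """
    best_model = cv.bestModel
    return {
        # One average metric per parameter combination in the grid
        "avg_auc_per_combination": list(cv.avgMetrics),
        # With AUC, larger is better, so the best model has the maximum
        "best_avg_auc": max(cv.avgMetrics),
        # explainParam returns a text description of the parameter and its value
        "max_depth": best_model.explainParam("maxDepth"),
        "feature_subset_strategy": best_model.explainParam("featureSubsetStrategy"),
        # Score the best model's predictions on the held-out data
        "test_auc": evaluator.evaluate(best_model.transform(test_data)),
    }
```

In the exercise environment this could be called as `summarize_cross_validation(cv, evaluator, flights_test)` and the resulting dictionary printed entry by entry.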