Create the evaluator
The first thing you need when doing cross-validation for model selection is a way to compare different models. Luckily, the pyspark.ml.evaluation submodule has classes for evaluating different kinds of models. Your model is a binary classification model, so you'll use the BinaryClassificationEvaluator from that submodule.
This evaluator calculates the area under the ROC curve. This metric combines the two kinds of errors a binary classifier can make (false positives and false negatives) into a single number. You'll learn more about this towards the end of the chapter!
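To give a sense of how the evaluator is used once you have predictions, here is a minimal, self-contained sketch. The tiny predictions DataFrame (rawPrediction vectors plus labels) is made up for illustration and stands in for the output of a fitted classification model.

# Sketch: evaluating toy predictions with a BinaryClassificationEvaluator
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.getOrCreate()

# Hypothetical model output: raw prediction scores and true labels
preds = spark.createDataFrame(
    [(Vectors.dense([2.0, -2.0]), 0.0),
     (Vectors.dense([-1.0, 1.0]), 1.0),
     (Vectors.dense([0.5, -0.5]), 1.0)],
    ["rawPrediction", "label"],
)

evaluator = BinaryClassificationEvaluator(metricName="areaUnderROC")
print(evaluator.evaluate(preds))  # area under the ROC curve, between 0 and 1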
This exercise is part of the course Foundations of PySpark.
Exercise instructions
- Import the submodule `pyspark.ml.evaluation` as `evals`.
- Create `evaluator` by calling `evals.BinaryClassificationEvaluator()` with the argument `metricName="areaUnderROC"`.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import the evaluation submodule
import ____ as evals
# Create a BinaryClassificationEvaluator
evaluator = ____
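For reference, here is one possible completion of the blanks above (a sketch, not the only valid way to write it):

# Import the evaluation submodule under the alias evals
import pyspark.ml.evaluation as evals

# Create a BinaryClassificationEvaluator that reports area under the ROC curve
evaluator = evals.BinaryClassificationEvaluator(metricName="areaUnderROC")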