Build RMSE evaluator
Now that you know how to fit a model to training data and generate test predictions, you need a way to evaluate how well your model performs. For this we'll build an evaluator. Evaluators in Spark can be configured in various ways. For our purposes, we want a RegressionEvaluator that calculates the root mean squared error (RMSE). After we build our RegressionEvaluator, we can fit the model to our data and generate predictions.
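To make the metric concrete, here is a minimal plain-Python sketch of what RMSE measures: the square root of the average squared difference between the actual ratings and the predicted ones. It is an illustration only, not how Spark computes the metric internally. The function name rmse and the sample values are hypothetical and do not come from the course data.

```python
import math


def rmse(labels, predictions):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one label/prediction pair is required")
    # Square each error so that over- and under-predictions both count as positive
    squared_errors = [(y - p) ** 2 for y, p in zip(labels, predictions)]
    # Average the squared errors, then take the square root to return to rating units
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings and predictions, for illustration only
actual = [4.0, 3.0, 5.0, 1.0]
predicted = [3.5, 3.0, 4.0, 2.0]
print(rmse(actual, predicted))  # prints 0.75
```

Because the result is in the same units as the ratings, a value of 0.75 here means the predictions are off by roughly three quarters of a rating point on average, with larger misses weighted more heavily.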
This exercise is part of the course Building Recommendation Engines with PySpark.
Exercise instructions
- Import the required RegressionEvaluator class from the pyspark.ml.evaluation module.
- Complete the evaluator code, specifying the metricName to be "rmse". Set the labelCol to the name of the column in our ratings data that contains our ratings (use the ratings.columns attribute to see the column names) and set the predictionCol to "prediction".
- Confirm that the evaluator was properly created by extracting each of the three parameters from it. Do this by running the following 3 lines of code, each within a print statement:
  evaluator.getMetricName()
  evaluator.getLabelCol()
  evaluator.getPredictionCol()
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import RegressionEvaluator
from pyspark.ml.evaluation import ____
# Complete the evaluator code
evaluator = RegressionEvaluator(metricName="____", labelCol="____", predictionCol="____")
# Extract the 3 parameters
print(evaluator.get____())
print(evaluator.get____())
print(evaluator.get____())