
Generate predictions and calculate RMSE

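Now that we have a model that is trained on our data and tuned through cross-validation, we can see how it performs on the test dataframe. To do this, we'll calculate the root mean squared error (RMSE), which summarizes how far the predicted values fall from the actual ones. Lower is better.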
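For reference, RMSE is conventionally defined as below. The symbols are assumptions chosen for illustration and do not come from the course: r_i is the actual value for the i-th test row, \hat{r}_i is the model's prediction for that row, and N is the number of test rows.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( r_i - \hat{r}_i \right)^2}
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.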

As a side note, generating the test predictions takes more than a few minutes with this dataset. For this reason, they have already been generated and are provided here as a dataframe called test_predictions. For your reference, they were generated using this code: test_predictions = best_model.transform(test).

This exercise is part of the course

Building Recommendation Engines with PySpark


Exercise instructions

  • The dataframe test_predictions contains the predictions that our cross-validated ALS model generated from the test set that we created previously. Use the .show() method to take a look at it and see if the predictions seem close to the actual values.
  • Use the evaluator that you built previously to calculate the RMSE by calling its .evaluate() method on test_predictions. Name the result RMSE.
  • Print the RMSE.
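To make the quantity in these steps concrete, here is a minimal plain-Python sketch of an RMSE calculation over (actual, predicted) pairs. It does not use PySpark or the course's evaluator. The rows list and the rmse helper are hypothetical, and the values are made up for illustration.

```python
import math

# Hypothetical (actual, predicted) pairs standing in for the rows of
# test_predictions; these numbers are invented for illustration only.
rows = [
    (4.0, 3.5),
    (3.0, 3.0),
    (5.0, 4.0),
    (2.0, 3.0),
]


def rmse(pairs):
    """Return the root mean squared error over (actual, predicted) pairs."""
    squared_errors = [(actual - predicted) ** 2 for actual, predicted in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


RMSE = rmse(rows)
print(RMSE)  # prints 0.75
```

An RMSE of 0.75 here means the predictions are off by roughly three quarters of a unit, with larger misses weighted more heavily.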

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# View the predictions 
test_predictions.____()

# Calculate and print the RMSE of test_predictions
RMSE = evaluator.____(____)
print(____)