Generate predictions and calculate RMSE
Now that we have a model that is trained on our data and tuned through cross-validation, we can see how it performs on the test dataframe. To do this, we'll calculate the RMSE (root mean squared error) of its predictions.
As a side note, the generation of test predictions takes more than a few minutes with this dataset. For this reason, the test predictions have been generated already and are provided here as a dataframe called test_predictions. For your reference, they are generated using this code: test_predictions = best_model.transform(test).
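For intuition about what the evaluator reports, here is a minimal plain-Python sketch of the RMSE calculation itself. It is not the course's code and does not use Spark: the rmse helper and the sample ratings below are hypothetical, made up purely for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one rating")
    # Square each prediction error, average the squares, then take the root
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical ratings and predictions (illustrative values only)
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 2.5]

example_rmse = rmse(actual_ratings, predicted_ratings)
print(example_rmse)
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than several small ones. The result is in the same units as the ratings, so it can be read roughly as a typical prediction error.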
This exercise is part of the course
Building Recommendation Engines with PySpark
Instructions
- The dataframe test_predictions contains predictions that our cross-validated ALS model generated using the test set that we created previously. Use the .show() method to take a look at it and see if the predictions seem close.
- Use the evaluator that you built previously to calculate the RMSE by calling the .evaluate() method on the generated test_predictions. Call this RMSE.
- Print the RMSE.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# View the predictions
test_predictions.____()
# Calculate and print the RMSE of test_predictions
RMSE = evaluator.____(____)
print(____)