Best Model and Best Model Parameters
Now that we have our cross validator, cv, built out, we can tell Spark to take our data, fit the ALS algorithm to it, and try the different combinations of hyperparameter values from our param_grid so that it can identify which values provide the smallest RMSE. Unfortunately, this takes too long to complete here, but for your reference, this is how it is done:
# Fit cross validator to the 'train' dataset
model = cv.fit(train)
# Extract best model from the cv model above
best_model = model.bestModel
This code has been run separately, and the best_model has been identified and saved for you to use. Use the commands given to extract the parameters of the model.
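To make the selection step concrete, here is a minimal pure-Python sketch of the idea behind the search described above: try every combination in a grid and keep the one with the smallest error. It is not Spark's implementation. The grid values, the toy_rmse scoring function, and the grid_search helper are all hypothetical stand-ins invented for illustration; in the real workflow the score would come from fitting ALS and evaluating RMSE on held-out folds.

```python
from itertools import product

# Hypothetical grid standing in for param_grid (values are illustrative only).
param_grid = {
    "rank": [5, 10, 20],
    "maxIter": [5, 10],
    "regParam": [0.05, 0.1, 0.2],
}


def toy_rmse(rank, max_iter, reg_param):
    """Made-up stand-in for the cross-validated RMSE of one ALS fit."""
    return abs(rank - 10) / 10 + abs(max_iter - 10) / 10 + abs(reg_param - 0.1)


def grid_search(grid, score):
    """Return (best_params, best_score) for the combination with the smallest score."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    # product() yields every combination of one value per hyperparameter.
    for values in product(*(grid[name] for name in names)):
        current = score(*values)
        if current < best_score:
            best_params, best_score = dict(zip(names, values)), current
    return best_params, best_score


best_params, best_rmse = grid_search(param_grid, toy_rmse)
print("Best parameters:", best_params)
print("Smallest toy RMSE:", best_rmse)
```

With 3 x 2 x 3 values, the loop scores 18 combinations. This multiplication is why a real search, which fits a model for every combination and every fold, can take a long time.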
This exercise is part of the course
Building Recommendation Engines with PySpark
Exercise instructions
- Print type(best_model) to confirm that the model ALS built from our hyperparameter options is indeed complete. A print statement is needed here in order to work with subsequent print statements.
- Extract the rank from best_model by calling its .getRank() method.
- Extract the maxIter from best_model by calling its .getMaxIter() method.
- Extract the regParam from best_model by calling its .getRegParam() method.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print best_model
print(____)
# Complete the code below to extract the ALS model parameters
print("**Best Model**")
# Print "Rank"
print(" Rank:", best_model.get____())
# Print "MaxIter"
print(" MaxIter:", best_model.get____())
# Print "RegParam"
print(" RegParam:", best_model.get____())
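The exercise reads each hyperparameter through a get...() method on the saved best_model. Outside this environment, a fitted model may expose a value differently; for example, open-source PySpark's ALSModel documents a rank attribute. The sketch below shows one defensive way to read a parameter under that assumption. The read_param helper and both stand-in classes are hypothetical, written only for illustration, and are not part of PySpark or of this course.

```python
def read_param(model, name):
    """Return a parameter from model: try get<Name>() first, then the attribute."""
    getter = getattr(model, "get" + name[0].upper() + name[1:], None)
    if callable(getter):
        return getter()
    # Fall back to a plain attribute; raises AttributeError if it is missing too.
    return getattr(model, name)


class GetterModel:
    """Hypothetical stand-in that exposes getter methods, as in this exercise."""

    def getRank(self):
        return 10

    def getMaxIter(self):
        return 10

    def getRegParam(self):
        return 0.1


class AttributeModel:
    """Hypothetical stand-in that exposes a plain attribute instead of a getter."""

    rank = 10


print("Getter style rank:", read_param(GetterModel(), "rank"))
print("Attribute style rank:", read_param(AttributeModel(), "rank"))
```

If neither the getter nor the attribute exists, the helper raises AttributeError rather than guessing a value, so a missing parameter is surfaced immediately.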