Build out an ALS model

Let's specify your first ALS model by completing the code below.

Recall that you can use the .columns attribute of the ratings dataframe to see the names of the columns that contain the user, movie, and rating data. Spark needs to know these column names in order to perform ALS correctly.

This exercise is part of the course

Building Recommendation Engines with PySpark


Exercise instructions

  • Before building our ALS model, we need to split the data into training data and test data. Use the randomSplit() method to split the ratings dataframe into training_data and test_data using a 0.8/0.2 split respectively, with a seed of 42 for the random number generator.
  • Tell Spark which columns to use for userCol, itemCol, and ratingCol. Check the .columns attribute if needed. Then complete the hyperparameters: set rank to 10, maxIter to 15, regParam (lambda) to 0.1, coldStartStrategy to "drop", and nonnegative to True. Since our data contains explicit ratings, set implicitPrefs to False.
  • Now fit the als model to the training_data portion of the ratings data by calling the als.fit() method on the training_data provided. Call the fitted model model.
  • Generate predictions on the test_data portion of the ratings data by calling the model.transform() method on the test_data provided. Call the predictions test_predictions. Feel free to view the predictions by calling the .show() method on test_predictions.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Split the ratings dataframe into training and test data
(training_data, test_data) = ratings.____([____, ____], seed=42)

# Set the ALS hyperparameters
from pyspark.ml.recommendation import ALS
als = ALS(userCol="____", itemCol="____", ratingCol="____", rank=____, maxIter=____, regParam=____,
          coldStartStrategy="____", nonnegative=____, implicitPrefs=____)

# Fit the model to the training_data
____ = ____.fit(____)

# Generate predictions on the test_data
____ = ____.transform(____)
test_predictions.show()