
Create test/train splits and build your ALS model

You already know how to build an ALS model, having done it in the previous chapter. We will do that again, but we'll take some additional steps to fully build out a cross-validated model.

First, let's import the requisite functions and create our train and test datasets in preparation for the cross-validation step.

This exercise is part of the course

Building Recommendation Engines with PySpark


Exercise instructions

  • Import the RegressionEvaluator from ml.evaluation, the ALS algorithm from ml.recommendation, and the ParamGridBuilder and the CrossValidator from ml.tuning.
  • Create an .80/.20 train/test split on the ratings data using the randomSplit method. Name the datasets train and test, and set the random seed to 1234.
  • Build out the ALS model, telling Spark which columns in the ratings dataframe correspond to userCol, itemCol, and ratingCol. Set the nonnegative argument to True, set coldStartStrategy to "drop", and indicate that the ratings are explicit by setting the implicitPrefs argument to False. Call this model als.
  • Verify that the model was created by calling the type() function on als. The output should indicate what type of model it is.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Import the required functions
from pyspark.ml.evaluation import ____
from pyspark.ml.recommendation import ____
from pyspark.ml.tuning import ____, ____

# Create test and train set
(train, test) = ratings.___([0.____, 0.____], seed = ____)

# Create ALS model
als = ALS(userCol="____", itemCol="____", ratingCol="____", nonnegative = ____, coldStartStrategy = "____", implicitPrefs = ____)

# Confirm that a model called "als" was created
type(____)
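For reference, here is one way the completed code might look. This is a sketch, not the official solution: it assumes the ratings dataframe uses MovieLens-style column names (userId, movieId, rating), which are not given in the exercise text, so adjust them to your data.

# One possible completed version (for reference only).
# Assumption: the ratings dataframe has columns "userId", "movieId", "rating".
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator

# Create an 80/20 train/test split with a fixed seed for reproducibility
(train, test) = ratings.randomSplit([0.8, 0.2], seed=1234)

# Build the ALS estimator for explicit ratings; coldStartStrategy="drop"
# discards NaN predictions for users/items not seen during training
als = ALS(userCol="userId", itemCol="movieId", ratingCol="rating",
          nonnegative=True, implicitPrefs=False, coldStartStrategy="drop")

# Confirm that an ALS estimator was created
print(type(als))  # <class 'pyspark.ml.recommendation.ALS'>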