Get startedGet started for free

Make a grid

Next, you need to create a grid of values to search over when looking for the optimal hyperparameters. The submodule pyspark.ml.tuning includes a class called ParamGridBuilder that does just that (maybe you're starting to notice a pattern here; PySpark has a submodule for just about everything!).

You'll need to use the .addGrid() and .build() methods to create a grid that you can use for cross validation. The .addGrid() method takes a model parameter (an attribute of the model Estimator, lr, that you created a few exercises ago) and a list of values that you want to try. The .build() method takes no arguments, it just returns the grid that you'll use later.

This exercise is part of the course

Foundations of PySpark

View Course

Exercise instructions

  • Import the submodule pyspark.ml.tuning under the alias tune.
  • Call the class constructor ParamGridBuilder() with no arguments. Save this as grid.
  • Call the .addGrid() method on grid with lr.regParam as the first argument and np.arange(0, .1, .01) as the second argument. This second call is a function from the numpy module (imported as np) that creates a list of numbers from 0 to .1, incrementing by .01. Overwrite grid with the result.
  • Update grid again by calling the .addGrid() method a second time create a grid for lr.elasticNetParam that includes only the values [0, 1].
  • Call the .build() method on grid and overwrite it with the output.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import the tuning submodule
import ____ as ____

# Create the parameter grid
grid = tune.____

# Add the hyperparameter
grid = grid.addGrid(____, np.arange(0, .1, .01))
grid = grid.addGrid(____, ____)

# Build the grid
grid = grid.build()
Edit and Run Code