Optimizing flights linear regression
Up until now you've been using the default hyper-parameters when building your models. In this exercise you'll use cross-validation to choose an optimal (or close to optimal) set of model hyper-parameters.
The following have already been created:
- regression: a LinearRegression object
- pipeline: a pipeline with string indexer, one-hot encoder, vector assembler and linear regression
- evaluator: a RegressionEvaluator object
This exercise is part of the course
Machine Learning with PySpark
Exercise instructions
- Create a parameter grid builder.
- Add grids for regression.regParam (values 0.01, 0.1, 1.0, and 10.0) and regression.elasticNetParam (values 0.0, 0.5, and 1.0).
- Build the grid.
- Create a cross validator, specifying five folds.
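Before filling in the template, it can help to see how large the search will be. The sketch below uses only the Python standard library (no Spark) to enumerate the combinations of the two value lists given in the instructions. The names reg_params, elastic_net_params, grid, n_models and n_fits are illustrative and do not come from the exercise.

```python
from itertools import product

# Candidate values taken from the exercise instructions.
reg_params = [0.01, 0.1, 1.0, 10.0]
elastic_net_params = [0.0, 0.5, 1.0]
num_folds = 5

# A grid search tries every pairing of the two lists.
grid = [
    {"regParam": reg, "elasticNetParam": enet}
    for reg, enet in product(reg_params, elastic_net_params)
]

n_models = len(grid)           # 4 values x 3 values
n_fits = n_models * num_folds  # each combination is fitted once per fold

print("Parameter combinations:", n_models)
print("Fits during cross-validation:", n_fits)
```

With five folds, each of the 12 combinations is trained five times, so the search costs 60 fits, plus one final fit of the best combination on the full training data. Keep this in mind when you add more values to a grid.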
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create parameter grid
params = ____()
# Add grids for two parameters
params = params.____(____, ____) \
.____(____, ____)
# Build the parameter grid
params = params.____()
print('Number of models to be tested: ', len(params))
# Create cross-validator
cv = ____(estimator=____, estimatorParamMaps=____, evaluator=____, ____)