Get startedGet started for free

Centering and scaling for classification

Now you will bring together scaling and model building into a pipeline for cross-validation.

Your task is to build a pipeline to scale features in the music_df dataset and perform grid search cross-validation using a logistic regression model with different values for the hyperparameter C. The target variable here is "genre", which contains binary values for rock as 1 and any other genre as 0.

StandardScaler, LogisticRegression, and GridSearchCV have all been imported for you.

This exercise is part of the course

Supervised Learning with scikit-learn

View Course

Exercise instructions

  • Build the steps for the pipeline: a StandardScaler() object named "scaler", and a logistic regression model named "logreg".
  • Create the parameters, searching 20 equally spaced float values ranging from 0.001 to 1.0 for the logistic regression model's C hyperparameter within the pipeline.
  • Instantiate the grid search object.
  • Fit the grid search object to the training data.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Build the steps
steps = [("____", ____()),
         ("____", ____())]
pipeline = Pipeline(steps)

# Create the parameter space
parameters = {"____": np.____(____, ____, 20)}
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, 
                                                    random_state=21)

# Instantiate the grid search object
cv = ____(____, param_grid=____)

# Fit to the training data
cv.____(____, ____)
print(cv.best_score_, "\n", cv.best_params_)
Edit and Run Code