
Pipeline for predicting song popularity

For the final exercise, you will build a pipeline to impute missing values, scale features, and perform hyperparameter tuning of a logistic regression model. The aim is to find the best parameters and accuracy when predicting song genre!

All the models and objects required to build the pipeline have been preloaded for you.

This exercise is part of the course

Supervised Learning with scikit-learn


Exercise instructions

  • Create the steps for the pipeline by calling a simple imputer, a standard scaler, and a logistic regression model.
  • Create a pipeline object, and pass the steps variable.
  • Instantiate a grid search object to perform cross-validation using the pipeline and the parameters.
  • Print the best parameters found by the grid search, then compute and print its accuracy score on the test set.
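The steps above can be sketched end to end outside the exercise environment. This is a minimal sketch, not the course solution: the synthetic data from `make_classification`, the injected missing values, the `train_test_split` settings, and `max_iter=5000` are all assumptions standing in for the preloaded music dataset and objects. The step names and the parameter grid are taken from the sample code in this exercise.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary data replaces the preloaded music dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=42)

# Assumption: blank out roughly 5% of the values so the imputer has work to do.
rng = np.random.default_rng(42)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Each step is a (name, estimator) pair; the names are reused in the grid keys.
steps = [
    ("imp_mean", SimpleImputer()),  # default strategy fills with the column mean
    ("scaler", StandardScaler()),
    # Assumption: a larger max_iter to reduce convergence warnings.
    ("logreg", LogisticRegression(max_iter=5000)),
]
pipeline = Pipeline(steps)

params = {
    "logreg__solver": ["newton-cg", "saga", "lbfgs"],
    "logreg__C": np.linspace(0.001, 1.0, 10),
}

# Cross-validated search over every solver/C combination.
tuning = GridSearchCV(pipeline, param_grid=params)
tuning.fit(X_train, y_train)

best_params = tuning.best_params_
test_accuracy = tuning.score(X_test, y_test)  # accuracy for a classifier by default
print("Tuned Logistic Regression Parameters: {}, Accuracy: {}".format(
    best_params, test_accuracy))
```

Because the imputer and scaler sit inside the pipeline, they are refit on the training portion of each cross-validation fold, so no statistics from the held-out fold leak into preprocessing.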

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create steps
steps = [("imp_mean", ____()), 
         ("scaler", ____()), 
         ("logreg", ____())]

# Set up pipeline
pipeline = ____(____)
params = {"logreg__solver": ["newton-cg", "saga", "lbfgs"],
          "logreg__C": np.linspace(0.001, 1.0, 10)}

# Create the GridSearchCV object
tuning = ____(____, param_grid=____)
tuning.fit(X_train, y_train)
y_pred = tuning.predict(X_test)

# Compute and print performance
print("Tuned Logistic Regression Parameters: {}, Accuracy: {}".format(____.____, ____.____))
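The keys in `params` follow scikit-learn's naming rule for nested estimators: the step name, a double underscore, then the parameter name. The short sketch below checks that rule on a pipeline built with the same step names as the sample code. The values passed to `set_params` are arbitrary illustrations, not tuned results.

```python
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ("imp_mean", SimpleImputer()),
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression()),
])

# get_params() lists every nested parameter as "<step name>__<parameter name>".
available = pipeline.get_params()
has_solver = "logreg__solver" in available
has_C = "logreg__C" in available
print(has_solver, has_C)

# set_params() accepts the same keys and forwards the values to the named step.
pipeline.set_params(logreg__C=0.5, logreg__solver="saga")
logreg = pipeline.named_steps["logreg"]
print(logreg.C, logreg.solver)
```

A key whose prefix does not match a step name, such as `"lr__C"` here, would raise an error when the search tries to set it, so the grid keys and the step names have to agree.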