Pipeline for predicting song popularity
For the final exercise, you will build a pipeline to impute missing values, scale features, and perform hyperparameter tuning of a logistic regression model. The aim is to find the best parameters and accuracy when predicting song genre!
All the models and objects required to build the pipeline have been preloaded for you.
This exercise is part of the course
Supervised Learning with scikit-learn
Exercise instructions
- Create the steps for the pipeline by calling a simple imputer, a standard scaler, and a logistic regression model.
- Create a pipeline object, and pass the steps variable.
- Instantiate a grid search object to perform cross-validation using the pipeline and the parameters.
- Print the best parameters and compute and print the test set accuracy score for the grid search object.
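The parameter grid in this exercise uses keys such as logreg__C. A minimal sketch of where those names come from, assuming the step names imp_mean, scaler, and logreg used in the sample code: scikit-learn exposes each step's hyperparameters as "<step name>__<parameter>", and get_params() lists them.

```python
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Each step is a (name, estimator) tuple; the names are free-form labels
steps = [
    ("imp_mean", SimpleImputer()),
    ("scaler", StandardScaler()),
    ("logreg", LogisticRegression()),
]
pipeline = Pipeline(steps)

# Hyperparameters of a step are addressed as "<step name>__<parameter>"
param_names = list(pipeline.get_params().keys())
print([name for name in param_names if name.startswith("logreg__")])
```

A grid search over the pipeline accepts only keys that appear in this list, so a typo in the step name raises an error when fitting.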
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Create steps
steps = [("imp_mean", ____()),
         ("scaler", ____()),
         ("logreg", ____())]
# Set up pipeline
pipeline = ____(____)
params = {"logreg__solver": ["newton-cg", "saga", "lbfgs"],
          "logreg__C": np.linspace(0.001, 1.0, 10)}
# Create the GridSearchCV object
tuning = ____(____, param_grid=____)
tuning.fit(X_train, y_train)
y_pred = tuning.predict(X_test)
# Compute and print performance
print("Tuned Logistic Regression Parameters: {}, Accuracy: {}".format(____.____, ____.____))
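For reference, here is one way the completed template might look when run outside the course environment. This is a sketch under stated assumptions, not the official solution: the music dataset is not shown here, so the example substitutes synthetic data from make_classification with some values knocked out, and it raises max_iter (an addition not in the template) to help all three solvers converge.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for the preloaded music data
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
rng = np.random.default_rng(42)
X[rng.random(X.shape) < 0.05] = np.nan  # set roughly 5% of values to missing

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Create steps: impute (mean by default), scale, then classify
steps = [("imp_mean", SimpleImputer()),
         ("scaler", StandardScaler()),
         ("logreg", LogisticRegression(max_iter=5000))]  # max_iter is an assumption

# Set up pipeline
pipeline = Pipeline(steps)
params = {"logreg__solver": ["newton-cg", "saga", "lbfgs"],
          "logreg__C": np.linspace(0.001, 1.0, 10)}

# Create the GridSearchCV object (default 5-fold cross-validation)
tuning = GridSearchCV(pipeline, param_grid=params)
tuning.fit(X_train, y_train)
y_pred = tuning.predict(X_test)

# Compute and print performance
print("Tuned Logistic Regression Parameters: {}, Accuracy: {}".format(
    tuning.best_params_, tuning.score(X_test, y_test)))
```

Because the imputer and scaler sit inside the pipeline, they are refit on the training portion of each cross-validation fold, which keeps validation data from leaking into the preprocessing.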