
Visualizing classification model performance

In this exercise, you will solve a classification problem using the music_df dataset, where the "popularity" column has been converted to binary values: 1 represents popularity greater than or equal to the median of the "popularity" column, and 0 indicates popularity below the median.
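The binarization described above can be sketched as follows. This is a hypothetical illustration (the real music_df is preloaded in the exercise); the tiny DataFrame here is invented for demonstration:

```python
import pandas as pd

# Hypothetical stand-in for music_df; the real dataset is preloaded in the exercise
music_df = pd.DataFrame({"popularity": [10, 40, 55, 70, 90]})

# Convert "popularity" to binary: 1 if >= median, else 0
median_pop = music_df["popularity"].median()
music_df["popularity"] = (music_df["popularity"] >= median_pop).astype(int)
```

After this step, the target column contains only 0s and 1s, which is what the classifiers expect.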

Your task is to build and visualize the results of three different models to classify whether a song is popular or not.

The data has been split, scaled, and preloaded for you as X_train_scaled, X_test_scaled, y_train, and y_test. Additionally, KNeighborsClassifier, DecisionTreeClassifier, and LogisticRegression have been imported.
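For context, the splitting and scaling mentioned above could have been done roughly like this. The synthetic X and y below are placeholders for the real features and binarized target, and the exact split proportions used in the course are an assumption:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data; in the exercise these come from music_df
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4))
y = rng.integers(0, 2, size=100)

# Split, then fit the scaler on the training set only to avoid data leakage
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Note that the scaler is fit on the training data only and then applied to the test data, so no information from the test set leaks into preprocessing.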

This exercise is part of the course “Supervised Learning with scikit-learn”.

Exercise instructions

  • Create a models dictionary with the keys "Logistic Regression", "KNN", and "Decision Tree Classifier", setting each value to a call of the corresponding model constructor.
  • Loop through the values in models.
  • Instantiate a KFold object to perform 6 splits, setting shuffle to True and random_state to 12.
  • Perform cross-validation using the model, the scaled training features, the target training set, and setting cv equal to kf.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create models dictionary
models = {"____": ____(), "____": ____(), "____": ____()}
results = []

# Loop through the models' values
for model in ____.____():
  
  # Instantiate a KFold object
  kf = ____(n_splits=____, random_state=____, shuffle=____)
  
  # Perform cross-validation
  cv_results = ____(____, ____, ____, cv=____)
  results.append(cv_results)
plt.boxplot(results, labels=models.keys())
plt.show()
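One possible completed version of the template above is sketched below. The synthetic X_train_scaled and y_train are stand-ins for the preloaded exercise data, and the headless matplotlib backend is an addition so the script runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data; in the exercise these are preloaded
rng = np.random.default_rng(12)
X_train_scaled = rng.normal(size=(120, 5))
y_train = rng.integers(0, 2, size=120)

# Create models dictionary
models = {"Logistic Regression": LogisticRegression(),
          "KNN": KNeighborsClassifier(),
          "Decision Tree Classifier": DecisionTreeClassifier()}
results = []

# Loop through the models' values
for model in models.values():

    # Instantiate a KFold object with 6 splits
    kf = KFold(n_splits=6, random_state=12, shuffle=True)

    # Perform cross-validation on the scaled training data
    cv_results = cross_val_score(model, X_train_scaled, y_train, cv=kf)
    results.append(cv_results)

plt.boxplot(results, labels=models.keys())
plt.show()
```

Each entry in results holds six accuracy scores (one per fold), so the boxplot compares the cross-validation score distributions of the three models side by side.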