
Visualizing classification model performance

In this exercise, you will solve a classification problem in which the "popularity" column in the music_df dataset has been converted to binary values: 1 represents popularity greater than or equal to the median of the "popularity" column, and 0 represents popularity below the median.
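
The binary target could, for example, be derived by thresholding on the median. A minimal sketch, using a tiny made-up stand-in for music_df (the real dataset and values differ):

import pandas as pd

# Hypothetical stand-in for music_df's numeric "popularity" column
music_df = pd.DataFrame({"popularity": [12, 45, 67, 89, 50]})

# 1 = popularity at or above the median, 0 = below the median
music_df["popularity"] = (music_df["popularity"] >= music_df["popularity"].median()).astype(int)
print(music_df["popularity"].tolist())  # [0, 0, 1, 1, 1]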

Your task is to build and visualize the results of three different models to classify whether a song is popular or not.

The data has been split, scaled, and preloaded for you as X_train_scaled, X_test_scaled, y_train, and y_test. Additionally, KNeighborsClassifier, DecisionTreeClassifier, and LogisticRegression have been imported.
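
The preloaded arrays could have been produced along these lines; this is a sketch that assumes music_df is a preloaded DataFrame holding numeric features plus the binary "popularity" target, and the exact preprocessing in the exercise may differ:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Separate features and target (assumes music_df is already loaded)
X = music_df.drop("popularity", axis=1).values
y = music_df["popularity"].values

# Split into training and test sets, then scale using training statistics only
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)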

This exercise is part of the course

Supervised Learning with scikit-learn

Exercise instructions

  • Create a dictionary with the keys "Logistic Regression", "KNN", and "Decision Tree Classifier", setting each value to a call of the corresponding model, which instantiates it.
  • Loop through the values in models.
  • Instantiate a KFold object to perform 6 splits, setting shuffle to True and random_state to 12.
  • Perform cross-validation using the model, the scaled training features, the target training set, and setting cv equal to kf.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Create models dictionary
models = {"____": ____(), "____": ____(), "____": ____()}
results = []

# Loop through the models' values
for model in ____.____():
  
  # Instantiate a KFold object
  kf = ____(n_splits=____, random_state=____, shuffle=____)
  
  # Perform cross-validation
  cv_results = ____(____, ____, ____, cv=____)
  results.append(cv_results)
plt.boxplot(results, labels=models.keys())
plt.show()
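
For reference, a completed version of the scaffold might look like the sketch below. It assumes the preloaded objects described above (X_train_scaled, y_train) and adds the imports for cross_val_score, KFold, and matplotlib, which the exercise environment may already provide; with classifiers, cross_val_score reports accuracy by default.

from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt

# Create models dictionary: each value is an instantiated model
models = {"Logistic Regression": LogisticRegression(),
          "KNN": KNeighborsClassifier(),
          "Decision Tree Classifier": DecisionTreeClassifier()}
results = []

# Loop through the models' values
for model in models.values():

    # Instantiate a KFold object with 6 splits
    kf = KFold(n_splits=6, shuffle=True, random_state=12)

    # Perform cross-validation on the scaled training data
    cv_results = cross_val_score(model, X_train_scaled, y_train, cv=kf)
    results.append(cv_results)

# Compare the models' cross-validation scores with a boxplot
plt.boxplot(results, labels=models.keys())
plt.show()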