Selecting relevant features

In this exercise, you'll identify the redundant columns in the volunteer dataset, and perform feature selection on the dataset to return a DataFrame of the relevant features.

For example, if you explore the volunteer dataset in the console, you'll see three features which are related to location: locality, region, and postalcode. They contain related information, so it would make sense to keep only one of the features.

Take some time to examine the features of volunteer in the console, and try to identify the redundant features.
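
One way to check whether location columns overlap is to count how many distinct values of the other columns appear for each postal code. The sketch below is only an illustration: it uses a small toy DataFrame with invented values as a stand-in for volunteer, and the real dataset may behave differently.

```python
import pandas as pd

# Toy stand-in for the volunteer dataset; all values are invented for illustration.
volunteer = pd.DataFrame({
    "locality": ["Springfield", "Springfield", "Shelbyville"],
    "region": ["North", "North", "South"],
    "postalcode": [10001, 10001, 20002],
})

# List the column names and dtypes to see which features are available
print(volunteer.dtypes)

# For each postalcode, count the distinct locality and region values.
# A count of 1 everywhere suggests postalcode already determines the other two.
distinct_per_postalcode = (
    volunteer.groupby("postalcode")[["locality", "region"]].nunique()
)
print(distinct_per_postalcode)
```

If every count is 1, locality and region add no information beyond postalcode in that sample, which is the kind of redundancy this exercise asks you to look for.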

This exercise is part of the course

Preprocessing for Machine Learning in Python

Instructions

  • Create a list of redundant column names and store it in the to_drop variable:
    • Out of all the location-related features, keep only postalcode.
    • Features that have gone through the feature engineering process are redundant as well.
  • Drop the columns in the to_drop list from the dataset.
  • Print out the .head() of volunteer_subset to see the selected columns.
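
The steps above can be sketched on a toy DataFrame. This is a hedged illustration, not the exercise solution: the start_date and start_month columns and all values are hypothetical, chosen only to show a raw feature that has already been engineered into a new one.

```python
import pandas as pd

# Toy stand-in for the volunteer dataset; columns other than the three
# location features are hypothetical and the values are invented.
volunteer = pd.DataFrame({
    "locality": ["Springfield", "Shelbyville"],
    "region": ["North", "South"],
    "postalcode": [10001, 20002],
    "start_date": ["2011-07-30", "2011-02-01"],  # hypothetical raw feature
    "start_month": [7, 2],                       # hypothetical engineered feature
})

# Redundant columns: the extra location features, plus the raw column
# that has already been engineered into start_month
to_drop = ["locality", "region", "start_date"]

# axis=1 tells pandas that the labels refer to columns, not rows;
# drop returns a new DataFrame and leaves volunteer unchanged
volunteer_subset = volunteer.drop(to_drop, axis=1)

print(volunteer_subset.head())
```

Writing volunteer.drop(columns=to_drop) is equivalent and makes the intent explicit without the axis argument.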

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create a list of redundant column names to drop
to_drop = ["____", "____", "____", "____", "____"]

# Drop those columns from the dataset
volunteer_subset = ____.____(____, ____)

# Print out the head of volunteer_subset
print(____)