Dealing with uncommon categories

Some features can have many different categories but a very uneven distribution of their occurrences. Take, for example, data scientists' favorite programming languages: common choices are Python, R, and Julia, but some individuals have less common preferences, such as FORTRAN or C. In these cases, you may not want to create a feature for each value, only for the more common ones.
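One way to act on this idea is to count each category, relabel the rare ones under a single catch-all value, and only then one-hot encode. The following is a minimal sketch rather than the course's own solution: the toy `langs` series, the `min_count` threshold, and the "Other" label are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy data, not taken from the course dataset
langs = pd.Series(
    ["Python", "R", "Python", "Julia", "FORTRAN",
     "Python", "R", "C", "Julia", "Python"],
    name="language",
)

# Count how often each category occurs
counts = langs.value_counts()

# Assumed threshold: categories seen fewer than 2 times are treated as rare
min_count = 2
rare = counts[counts < min_count].index

# Keep common categories as they are and relabel rare ones as "Other"
grouped = langs.where(~langs.isin(rare), "Other")

# One-hot encode the reduced set of categories
dummies = pd.get_dummies(grouped, prefix="lang")
print(dummies.columns.tolist())
```

With this toy data, FORTRAN and C each appear once and are folded into a single "Other" column instead of getting a column each. The right threshold depends on the dataset and the model.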

This exercise is part of the course

Feature Engineering for Machine Learning in Python

Interactive exercise

Complete the sample code to successfully finish this exercise.

# Create a series out of the Country column
countries = so_survey_df.____

# Get the counts of each category
country_counts = countries.____

# Print the count values for each category
print(country_counts)
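For reference, the steps described in the comments above can be sketched on a small stand-in DataFrame. This is one plausible way to fill the blanks, not necessarily the course's expected answer: the `so_survey_df` built here is hypothetical toy data, and column selection plus `value_counts()` are assumed to be what the blanks call for.

```python
import pandas as pd

# Hypothetical stand-in for so_survey_df; the real survey data is not shown here
so_survey_df = pd.DataFrame(
    {"Country": ["USA", "India", "USA", "UK", "India", "USA", "Ireland"]}
)

# Selecting a single column from a DataFrame returns a pandas Series
countries = so_survey_df["Country"]

# value_counts() tallies each distinct value, most frequent first
country_counts = countries.value_counts()

# Print the count values for each category
print(country_counts)
```

Because `value_counts()` sorts by frequency in descending order, the rare categories end up at the bottom of the output, which makes them easy to spot.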