Dropping levels
The contingency table from the last exercise revealed that some levels have very low counts. To simplify the analysis, it often helps to drop such levels.
In R, this requires two steps: first filtering out any rows with the levels that have very low counts, then removing these levels from the factor variable with droplevels(). This is because the droplevels() function would keep levels that have just 1 or 2 counts; it only drops levels that don't exist in a dataset.
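To make the two-step logic concrete, here is a minimal base R sketch on a made-up factor. The values and the names x, x_kept and x_clean are illustrative assumptions, not part of the course data.

```r
# Toy factor (hypothetical data): level "c" occurs only once.
x <- factor(c("a", "a", "b", "b", "b", "c"))

# droplevels() alone changes nothing, because every level still occurs.
levels(droplevels(x))   # "a" "b" "c"

# After filtering out the rare level, "c" remains as an empty level.
x_kept <- x[x != "c"]
table(x_kept)           # a: 2, b: 3, c: 0

# Only now does droplevels() remove it.
x_clean <- droplevels(x_kept)
levels(x_clean)         # "a" "b"
```

The zero count for "c" in the intermediate table is exactly the kind of leftover level that the second step cleans up.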
This exercise is part of the course
Exploratory Data Analysis in R
Exercise instructions
The contingency table from the last exercise is available in your workspace as tab.
- Load the dplyr package.
- Print tab to find out which level of align has the fewest total entries.
- Use filter() to filter out all rows of comics with that level, then drop the unused level with droplevels(). Save the simplified dataset as comics_filtered.
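The same filter-then-drop pattern can be sketched with dplyr on a small invented data frame. The pets data, its columns, and the name pets_filtered are hypothetical and stand in for the course's dataset; the sketch assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data, not the course's comics dataset.
pets <- data.frame(
  species = factor(c("cat", "cat", "dog", "dog", "dog", "axolotl")),
  age = c(3, 5, 2, 7, 4, 1)
)

# Count rows per level to spot the rare one.
table(pets$species)

# Step 1: filter out rows with the rare level.
# Step 2: drop the now-unused level from the factor.
pets_filtered <- pets %>%
  filter(species != "axolotl") %>%
  droplevels()

levels(pets_filtered$species)
```

Calling droplevels() on the whole data frame cleans up every factor column at once, which is why it can sit at the end of the pipe without naming a column.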
Interactive exercise
Try this exercise by completing this sample code.
# Load dplyr
___
# Print tab
___
# Remove align level
comics_filtered <- ___ %>%
  ___(align != ___) %>%
  ___()
# See the result
comics_filtered