Filtering the dataset
Employees at senior levels such as Vice President, Director, and Senior Manager face very different labor market conditions and are few in number, so including them in your analysis can disproportionately affect your findings.
In this exercise, you will use the filter() function to retain only the employees at the Analyst and Specialist levels, and count the employees at each level before and after filtering.
The following example filters df so that only the observations for which x is "a", "b", or "c" are selected:
df %>%
  filter(x %in% c("a", "b", "c"))
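To see what the %in% condition does, here is a minimal runnable sketch. The toy df below, including its id column and values, is made up for illustration and is not part of the course data. It compares the %in% form with the longer version written with == and |, which should select the same rows.

```r
library(dplyr)

# Hypothetical toy data; df and x mirror the names used in the example,
# the id column and the values are assumptions for illustration
df <- tibble(id = 1:5, x = c("a", "b", "c", "d", "a"))

# Keep rows where x is "a", "b", or "c"
with_in <- df %>%
  filter(x %in% c("a", "b", "c"))

# Same selection spelled out with == and |
with_or <- df %>%
  filter(x == "a" | x == "b" | x == "c")

# Compare the two results row by row
all(with_in$id == with_or$id)
```

Note that writing x == c("a", "b", "c") is not a substitute: == compares element by element and recycles the shorter vector, so it generally does not test membership.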
This exercise is part of the course
HR Analytics: Predicting Employee Churn in R
Exercise instructions
- First, count the number of employees across levels.
- Subset the data to retain only the employees at the Analyst and Specialist levels.
- Review the number of employees across levels again.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Count the number of employees across levels
org %>%
  ___(level)

# Select the employees at Analyst and Specialist level
org2 <- org %>%
  ___(level ___)

# Validate the results
org2 %>%
  count(level)
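One possible way to complete the blanks is sketched below. The course's org dataset is not shown on this page, so the small org tibble here (its emp_id column and its rows) is a hypothetical stand-in; only the level column name comes from the exercise. The sketch assumes count() for the first blank and filter() with %in% for the second, following the instructions and the example above.

```r
library(dplyr)

# Hypothetical stand-in for the course's org data (assumption);
# only the level column is taken from the exercise
org <- tibble(
  emp_id = 1:8,
  level = c("Analyst", "Analyst", "Specialist", "Manager",
            "Director", "Analyst", "Specialist", "Vice President")
)

# Count the number of employees across levels
org %>%
  count(level)

# Select the employees at Analyst and Specialist level
org2 <- org %>%
  filter(level %in% c("Analyst", "Specialist"))

# Validate the results: only Analyst and Specialist should remain
org2 %>%
  count(level)
```

Running count(level) before and after the filter makes it easy to confirm that the senior levels were dropped and that the Analyst and Specialist counts are unchanged.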