Corroborate the splits
In the previous exercise, you split the dataset into train_set and test_set. It's important to make sure that the data you train your model on is representative of the test set. So let's check that train_set and test_set have the same proportion of active and inactive employees.
This exercise is part of the course HR Analytics: Predicting Employee Churn in R.
Exercise instructions
Calculate the proportion of Active and Inactive employees in train_set and test_set.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Calculate turnover proportion in train_set
train_set %>%
  ___(status) %>%
  ___(prop = n / sum(n))

# Calculate turnover proportion in test_set
test_set %>%
  ___(status) %>%
  ___(prop = n / sum(n))
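One plausible way to fill in the blanks is sketched below. It assumes the dplyr verbs count() and mutate() are the intended ones, which fits the template because count() produces the n column that the second step divides by sum(n). The small train_set and test_set data frames here are hypothetical stand-ins so the snippet runs on its own; they are not the course data.

```r
library(dplyr)

# Hypothetical stand-ins for the train_set and test_set created in the
# previous exercise (made-up rows, not the course dataset)
train_set <- data.frame(status = c(rep("Active", 8), rep("Inactive", 2)))
test_set  <- data.frame(status = c(rep("Active", 4), rep("Inactive", 1)))

# Calculate turnover proportion in train_set:
# count() tallies the rows per status into a column named n,
# mutate() then turns each tally into a share of the total
train_props <- train_set %>%
  count(status) %>%
  mutate(prop = n / sum(n))

# Calculate turnover proportion in test_set
test_props <- test_set %>%
  count(status) %>%
  mutate(prop = n / sum(n))

print(train_props)
print(test_props)
```

If the prop values for Active and Inactive are close in both tables, the split has preserved the class balance; a large gap would suggest redoing the split, for example with stratified sampling on status.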