Calculate possible combinations
The healthcare_cat_df data frame contains categorical variables about employees in a healthcare company and whether they left the company or not. You will use this dataset to determine the number of combinations of the feature values that exist in the dataset.
When training a machine learning model, you want your data to contain many observations of each combination of feature values. The number of possible combinations therefore gives a benchmark for the minimum number of observations you would need to collect, which helps avoid bias in your model.
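To make the benchmark concrete, here is a minimal base R sketch on a made-up data frame. The name toy_df, its columns, and its values are hypothetical and are not taken from healthcare_cat_df. It multiplies the number of distinct values in each column to get the count of possible combinations, then compares that with the combinations that actually occur.

```r
# Hypothetical toy data; column names and values are made up for illustration
toy_df <- data.frame(
  department = c("Cardiology", "Maternity", "Cardiology", "Neurology"),
  overtime   = c("Yes", "No", "No", "Yes"),
  left       = c("No", "No", "Yes", "No"),
  stringsAsFactors = FALSE
)

# Number of distinct values in each column
n_levels <- sapply(toy_df, function(x) length(unique(x)))

# Possible combinations: the product of the per-column counts (3 * 2 * 2)
n_possible <- prod(n_levels)

# Combinations that actually appear as rows in the toy data
n_observed <- nrow(unique(toy_df))

n_possible
n_observed
```

In this toy case only 4 of the 12 possible combinations are observed, so a model trained on it would never see most combinations.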
The tidyverse package has been loaded for you.
This exercise is part of the course
Dimensionality Reduction in R
Exercise instructions
- Calculate the minimum number of observations needed to represent all combinations of the feature values in healthcare_cat_df.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Calculate the minimum number of value combinations
healthcare_cat_df %>%
___(___(___(), ~ ___(unique(.)))) %>%
___()
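The source does not give the intended answer for the blanks above. As a hedged sketch of one pipeline with the same shape, the example below uses dplyr's summarize(), across(), and everything() with base R's length() and prod(), applied to a hypothetical toy tibble rather than to healthcare_cat_df. Only dplyr is loaded here, on the assumption that it is the part of the tidyverse the pipeline needs.

```r
library(dplyr)

# Hypothetical toy data; not the exercise's healthcare_cat_df
toy_df <- tibble(
  department = c("Cardiology", "Maternity", "Neurology", "Cardiology"),
  shift      = c("Day", "Night", "Day", "Day"),
  left       = c("No", "Yes", "No", "No")
)

# Count the distinct values in every column, then multiply the counts
min_obs <- toy_df %>%
  summarize(across(everything(), ~ length(unique(.)))) %>%
  prod()

min_obs
```

Here summarize() collapses the data to a single row of per-column counts (3, 2, and 2), and prod() multiplies them to give 12.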