Calculate possible combinations
The healthcare_cat_df data frame contains categorical variables about employees in a healthcare company and whether they left the company or not. You will use this dataset to determine how many combinations of the feature values are possible.
When training a machine learning model, you want your data to contain many observations of each combination. The number of possible combinations is the product of the number of unique values in each feature, so it gives a benchmark for the minimum number of observations you would need to collect to help avoid bias in your model.
The tidyverse package has been loaded for you.
This exercise is part of the course
Dimensionality Reduction in R
Exercise instructions
- Calculate the minimum number of observations needed to represent all combinations of the feature values in healthcare_cat_df.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Calculate the minimum number of value combinations
healthcare_cat_df %>%
___(___(___(), ~ ___(unique(.)))) %>%
___()
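
To illustrate the idea without using the course data, here is a minimal sketch on a small made-up data frame. The name toy_df and its columns are hypothetical and chosen only for this example; the pipeline is one possible way to count the unique values in each column and multiply those counts together, not necessarily the exercise's intended solution.

```r
library(dplyr)

# Hypothetical toy data (not the course dataset): three categorical features
toy_df <- data.frame(
  department = c("Cardiology", "Maternity", "Neurology", "Cardiology"),
  overtime   = c("Yes", "No", "No", "Yes"),
  left       = c("No", "No", "Yes", "Yes")
)

# Count the unique values in every column, then multiply the counts
min_obs <- toy_df %>%
  summarize(across(everything(), ~ length(unique(.)))) %>%
  prod()

min_obs
```

Here department has 3 unique values while overtime and left have 2 each, so at least 3 * 2 * 2 = 12 observations would be needed to see every combination once. The toy data has only 4 rows, so most combinations are not represented at all.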