Calculate possible combinations

The healthcare_cat_df data frame contains categorical variables about employees in a healthcare company, including whether or not they left the company. You will use this dataset to determine how many combinations of its feature values are possible.

When training a machine learning model, you want your data to contain many observations of each combination of feature values. The number of possible combinations, which is the product of the number of unique values in each feature, therefore gives a benchmark for the minimum number of observations you need to collect to help avoid bias in your model.
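
To make the arithmetic concrete, here is a minimal base R sketch on a made-up data frame. The name toy_df, its columns, and its values are hypothetical stand-ins and are not taken from healthcare_cat_df. The sketch counts the unique values in each column, multiplies the counts, and cross-checks the result by listing every combination with expand.grid().

```r
# Toy stand-in for a categorical data frame (hypothetical columns and values)
toy_df <- data.frame(
  department = c("Cardiology", "Maternity", "Neurology", "Cardiology"),
  shift      = c("Day", "Night", "Day", "Day"),
  left       = c("Yes", "No", "No", "No"),
  stringsAsFactors = FALSE
)

# Number of distinct values in each column: 3, 2, and 2
n_levels <- vapply(toy_df, function(x) length(unique(x)), integer(1))

# The product of the per-column counts is the number of possible combinations
n_combinations <- prod(n_levels)
print(n_combinations)  # 12

# Cross-check by enumerating every combination explicitly
all_combos <- expand.grid(lapply(toy_df, unique), stringsAsFactors = FALSE)
print(nrow(all_combos) == n_combinations)  # TRUE
```

The toy data has only 4 rows but 12 possible combinations, so at least 8 combinations are never observed. That gap is what the benchmark is meant to expose.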

The tidyverse package has been loaded for you.

This exercise is part of the course

Dimensionality Reduction in R

Exercise instructions

  • Calculate the minimum number of observations needed to represent all combinations of the feature values in healthcare_cat_df.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Calculate the minimum number of value combinations
healthcare_cat_df %>% 
  ___(___(___(), ~ ___(unique(.)))) %>% 
  ___()