Exercise

Calculate possible combinations

The healthcare_cat_df data frame contains categorical variables about employees in a healthcare company, including whether they left the company. You will use this dataset to determine the number of possible combinations of the feature values that appear in it.

When training a machine learning model, you want your data to contain many observations of each combination of feature values. The number of possible combinations, which is the product of the number of distinct values in each feature, therefore serves as a benchmark for the minimum number of observations you would need to collect to help avoid bias in your model.

The tidyverse package has been loaded for you.

Instructions

  • Calculate the minimum number of observations needed to represent all combinations of the feature values in healthcare_cat_df.
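
One possible approach is sketched below on a small stand-in data frame, since the contents of healthcare_cat_df are not shown here. The name toy_cat_df, its columns, and its values are invented for illustration, and the sketch assumes every column (including the outcome) counts as a feature. It counts the distinct values in each column and multiplies those counts together.

```r
library(dplyr)

# Hypothetical stand-in for healthcare_cat_df; the column names and values
# are invented for illustration only.
toy_cat_df <- data.frame(
  department = factor(c("ER", "ICU", "ER", "Lab")),
  shift      = factor(c("day", "night", "night", "day")),
  left       = factor(c("yes", "no", "no", "no"))
)

# Count the distinct values in each column (one-row summary).
values_per_feature <- toy_cat_df %>%
  summarise(across(everything(), n_distinct))

# Multiply the per-column counts to get the number of possible combinations.
n_combinations <- prod(unlist(values_per_feature))

# For comparison: the combinations that actually occur in the data.
n_observed <- nrow(distinct(toy_cat_df))

n_combinations
n_observed
```

In this toy data the possible combinations (3 x 2 x 2) outnumber the combinations actually observed, which is the gap the benchmark is meant to expose. Swapping toy_cat_df for healthcare_cat_df should give the corresponding figure for the exercise data, assuming all of its columns are categorical.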