Exercise

Accuracy after dimensionality reduction

You'll reduce overfitting with the help of dimensionality reduction. In this case, you'll apply a rather drastic form of it: selecting a single column that carries enough information to distinguish between genders. You'll then repeat the train-test split, model fit, and prediction steps to compare the accuracy on the test data with the accuracy on the training data.

All relevant packages and the target variable y have been pre-loaded.

Instructions

  • Select just the neck circumference ('neckcircumferencebase') column from ansur_df.
  • Split the data, instantiate a classifier and fit the data. This has been done for you.
  • Once again, calculate the accuracy scores on both the training and test sets.
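
The steps above can be sketched as follows. This is a minimal illustration, not the exercise's own solution. Since ansur_df and y are pre-loaded in the exercise but not shown here, the sketch builds a small synthetic stand-in with invented values. The choice of SVC as the classifier, the 30% test size, the random_state, and the noise_feature column are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the pre-loaded ansur_df and y.
# All values below are invented for illustration only.
rng = np.random.default_rng(0)
n = 500
y = pd.Series(rng.choice(["Female", "Male"], size=n), name="Gender")
is_male = y.to_numpy() == "Male"
ansur_df = pd.DataFrame({
    "neckcircumferencebase": np.where(
        is_male, rng.normal(400, 20, n), rng.normal(330, 20, n)
    ),
    "noise_feature": rng.normal(0, 1, n),  # hypothetical extra column
})

# Select just the neck circumference column.
# Double brackets keep X as a two-dimensional DataFrame, which scikit-learn expects.
X = ansur_df[["neckcircumferencebase"]]

# Split the data, instantiate a classifier and fit it (classifier choice is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
svc = SVC()
svc.fit(X_train, y_train)

# Calculate accuracy on both the training and test sets.
accuracy_train = accuracy_score(y_train, svc.predict(X_train))
accuracy_test = accuracy_score(y_test, svc.predict(X_test))
print(f"{accuracy_test:.1%} accuracy on test set vs. {accuracy_train:.1%} on training set")
```

If the two printed scores are close to each other, the model is generalizing rather than memorizing the training data, which is the comparison this exercise asks for.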