Accuracy after dimensionality reduction

You'll reduce overfitting with the help of dimensionality reduction. Here you'll apply a rather drastic form of it: selecting a single column that carries good information for distinguishing between genders. You'll then repeat the train-test split, model fit, and prediction steps to compare accuracy on the test data with accuracy on the training data.
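As a rough illustration of the idea, the sketch below compares training and test accuracy for a model fit on many columns against one fit on a single informative column. It does not use the course data: the DataFrame df, the column names, and the helper train_test_accuracy are all made up for this example, and the numbers it prints say nothing about the results on ansur_df.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in data (an assumption, not the ANSUR dataset):
# one column that separates the two classes, plus 50 pure-noise columns.
y = rng.integers(0, 2, size=n)
df = pd.DataFrame(rng.normal(size=(n, 50)),
                  columns=[f"noise_{i}" for i in range(50)])
df["informative"] = rng.normal(loc=y * 2.0, scale=1.0)


def train_test_accuracy(X, y):
    """Hypothetical helper: split, fit an SVC, return (train, test) accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0
    )
    svc = SVC()
    svc.fit(X_train, y_train)
    return (
        accuracy_score(y_train, svc.predict(X_train)),
        accuracy_score(y_test, svc.predict(X_test)),
    )


# Fit once on every column, once on the single informative column.
acc_all = train_test_accuracy(df, y)
acc_one = train_test_accuracy(df[["informative"]], y)

print(f"All columns: {acc_all[1]:.1%} test vs. {acc_all[0]:.1%} train")
print(f"One column:  {acc_one[1]:.1%} test vs. {acc_one[0]:.1%} train")
```

A large gap between training and test accuracy is the usual sign of overfitting; with fewer, more informative features the two scores tend to move closer together.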

All relevant packages and the target y have been pre-loaded.

This exercise is part of the course

Dimensionality Reduction in Python

Instructions

  • Select just the neck circumference ('neckcircumferencebase') column from ansur_df.
  • Split the data, instantiate a classifier and fit the data. This has been done for you.
  • Once again, calculate the accuracy scores on both the training and test sets.
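The exercise template selects the column with double square brackets. A minimal sketch of why that matters, using a small made-up DataFrame (the names df, "a" and "b" are placeholders, not part of the exercise): a single pair of brackets returns a one-dimensional Series, while a list of column names returns a two-dimensional DataFrame, which is the shape scikit-learn estimators expect for the feature matrix X.

```python
import pandas as pd

# Toy data for illustration only
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

as_series = df["a"]    # single brackets: pandas Series, 1-D
as_frame = df[["a"]]   # double brackets: pandas DataFrame, 2-D with one column

print(type(as_series).__name__, as_series.shape)  # Series (3,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (3, 1)
```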

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Assign just the 'neckcircumferencebase' column from ansur_df to X
X = ansur_df[[____]]

# Split the data, instantiate a classifier and fit the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
svc = SVC()
svc.fit(X_train, y_train)

# Calculate accuracy scores on both train and test data
accuracy_train = accuracy_score(____, svc.predict(____))
accuracy_test = accuracy_score(____, svc.predict(____))

print(f"{accuracy_test:.1%} accuracy on test set vs. {accuracy_train:.1%} on training set")