
Accuracy after dimensionality reduction

You'll reduce overfitting with the help of dimensionality reduction. Here you'll apply a rather drastic form of it: selecting a single column that carries enough information to distinguish between genders. You'll then repeat the train-test split, model fit, and prediction steps, and compare the accuracy on the test data with the accuracy on the training data.

All relevant packages and the target y have been pre-loaded.

This exercise is part of the course

Dimensionality Reduction in Python


Exercise instructions

  • Select just the neck circumference ('neckcircumferencebase') column from ansur_df.
  • Split the data, instantiate a classifier and fit the data. This has been done for you.
  • Once again, calculate the accuracy scores on both the training and test sets.
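The first instruction relies on double-bracket selection, which can be illustrated on a tiny stand-in table. This is a minimal sketch: demo_df, its values, and the second column are hypothetical and only mimic the shape of ansur_df. Single brackets return a one-dimensional Series, while double brackets return a one-column DataFrame, which is the two-dimensional input scikit-learn estimators expect.

```python
import pandas as pd

# Hypothetical stand-in for ansur_df; the values are made up for illustration.
demo_df = pd.DataFrame({
    "neckcircumferencebase": [390.0, 410.0, 335.0],
    "stature": [1750.0, 1802.0, 1621.0],
})

# Single brackets: a 1-D Series.
as_series = demo_df["neckcircumferencebase"]

# Double brackets: a 2-D DataFrame with one column.
as_frame = demo_df[["neckcircumferencebase"]]

print(as_series.shape)  # (3,)
print(as_frame.shape)   # (3, 1)
```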

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Assign just the 'neckcircumferencebase' column from ansur_df to X
X = ansur_df[[____]]

# Split the data, instantiate a classifier and fit the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
svc = SVC()
svc.fit(X_train, y_train)

# Calculate accuracy scores on both train and test data
accuracy_train = accuracy_score(____, svc.predict(____))
accuracy_test = accuracy_score(____, svc.predict(____))

print(f"{accuracy_test:.1%} accuracy on test set vs. {accuracy_train:.1%} on training set")
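For readers without access to the course environment, the same workflow can be sketched end to end on synthetic data. Everything about the data here is an assumption: demo_df, the group means and spreads, the noise columns, and the fixed random_state are invented for illustration and are not taken from ansur_df. The point is only to show how the training and test accuracies are computed and compared when a single informative column is used.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for ansur_df (assumption: made-up values, not the real data).
rng = np.random.default_rng(0)
n = 400
y = np.array(["Male"] * (n // 2) + ["Female"] * (n // 2))
demo_df = pd.DataFrame({
    # One informative column: the two groups are drawn with different means.
    "neckcircumferencebase": np.concatenate([
        rng.normal(400, 20, n // 2),
        rng.normal(330, 20, n // 2),
    ]),
    # Uninformative columns that the selection below discards.
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
})

# Keep only the single informative column as a one-column DataFrame.
X = demo_df[["neckcircumferencebase"]]

# Split, fit, and score on both sets; random_state is fixed for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
svc = SVC()
svc.fit(X_train, y_train)

accuracy_train = accuracy_score(y_train, svc.predict(X_train))
accuracy_test = accuracy_score(y_test, svc.predict(X_test))

print(f"{accuracy_test:.1%} accuracy on test set vs. {accuracy_train:.1%} on training set")
```

A small gap between the two printed numbers suggests the model is not overfitting much; a training accuracy far above the test accuracy would suggest the opposite.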