
Using PCA

In this exercise, you'll apply PCA to the wine dataset to see whether it can improve the model's accuracy.

This exercise is part of the course

Preprocessing for Machine Learning in Python


Exercise instructions

  • Instantiate a PCA object.
  • Define the features (X) and labels (y) from wine, using the labels in the "Type" column.
  • Apply PCA to X_train and X_test, ensuring no data leakage, and store the transformed values as pca_X_train and pca_X_test.
  • Print out the .explained_variance_ratio_ attribute of pca to check how much variance is explained by each component.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Instantiate a PCA object
pca = ____()

# Define the features and labels from the wine dataset
X = wine.drop(____, ____)
y = wine["Type"]

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# Apply PCA to the training and test features
pca_X_train = ___.____(____)
pca_X_test = ___.____(____)

# Look at the percentage of variance explained by the different components
print(____)
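
For reference, one way the steps above could fit together is sketched below. It is not the course's official solution. The course's wine DataFrame is not available here, so the sketch assumes scikit-learn's bundled wine data as a stand-in, with its target column renamed to "Type" to mirror the exercise. The part that guards against data leakage is that the PCA is fitted on the training features only, and the test features are then projected with those already-fitted components.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Assumption: stand-in for the course's `wine` DataFrame, built from
# scikit-learn's bundled wine data with the target renamed to "Type".
wine = load_wine(as_frame=True).frame.rename(columns={"target": "Type"})

# Instantiate a PCA object (default settings keep all components)
pca = PCA()

# Define the features and labels from the wine dataset
X = wine.drop("Type", axis=1)
y = wine["Type"]

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# Fit the PCA on the training features only, then reuse the fitted
# components to transform the test features (no refitting on test data)
pca_X_train = pca.fit_transform(X_train)
pca_X_test = pca.transform(X_test)

# Look at the fraction of variance explained by each component
print(pca.explained_variance_ratio_)
```

Because PCA() with default settings keeps every component, the printed ratios sum to 1. Calling fit_transform on X_test instead of transform would refit the components on the test data, which is the leakage the instructions warn about.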