Selecting the proportion of variance to keep
You'll let PCA determine the number of components to calculate based on an explained variance threshold that you decide.
You'll work on the numeric ANSUR female dataset, pre-loaded as ansur_df.
All relevant packages and classes have been pre-loaded too (Pipeline(), StandardScaler(), PCA()).
This exercise is part of the course Dimensionality Reduction in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Pipe a scaler to PCA selecting 80% of the variance
pipe = ____([('scaler', ____),
             ('reducer', ____)])
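
One way the blanks could be filled in is sketched below. It relies on scikit-learn's documented behaviour that a float between 0 and 1 passed as n_components tells PCA to keep just enough components for the cumulative explained variance to exceed that fraction. Since ansur_df is only available inside the exercise environment, the sketch fits the pipeline on a hypothetical synthetic array, demo_data, which is an assumption made for illustration and not part of the exercise.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Hypothetical stand-in for ansur_df: 200 rows, 10 correlated numeric features
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
demo_data = latent @ mixing + 0.1 * rng.normal(size=(200, 10))

# Pipe a scaler to PCA selecting 80% of the variance
pipe = Pipeline([('scaler', StandardScaler()),
                 ('reducer', PCA(n_components=0.8))])

# Fit the pipeline; PCA decides how many components to keep
pipe.fit(demo_data)

# Inspect how many components were kept and how much variance they explain
reducer = pipe.named_steps['reducer']
n_kept = reducer.n_components_
cum_var = reducer.explained_variance_ratio_.cumsum()
print(f"{n_kept} components selected")
print(f"cumulative explained variance: {cum_var[-1]:.2f}")
```

In the exercise itself you would fit on ansur_df instead of demo_data. Scaling before PCA matters here because the variance threshold is otherwise dominated by whichever features have the largest units.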