PCA in a model pipeline
We just saw that legendary Pokemon tend to have higher stats overall. Let's see if we can add a classifier to our pipeline that detects legendary versus non-legendary Pokemon based on the principal components.
The data has been pre-loaded for you and split into training and test datasets: X_train, X_test, y_train, y_test.
The same goes for all relevant packages and classes (Pipeline(), StandardScaler(), PCA(), RandomForestClassifier()).
This exercise is part of the course
Dimensionality Reduction in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build the pipeline
pipe = Pipeline([
    ('scaler', ____),
    ('reducer', ____),
    ('classifier', ____)])
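
One way the blanks could be filled in is sketched below. It is only an illustration and not the exercise's official solution. The Pokemon data is not reproduced here, so the sketch builds a synthetic stand-in with make_classification (an assumption: six numeric features and a binary target). The choice of n_components=2 and the random_state values are also assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the pre-loaded data (assumption: 6 numeric
# features and a binary label; this is NOT the Pokemon dataset).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Build the pipeline: scale, reduce to principal components, classify.
# n_components=2 is an illustrative choice, not taken from the exercise.
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('reducer', PCA(n_components=2)),
    ('classifier', RandomForestClassifier(random_state=0))])

# Fit every step on the training data only
pipe.fit(X_train, y_train)

# Mean accuracy on the held-out test data
accuracy = pipe.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")

# Share of variance captured by each retained component
print(pipe.named_steps['reducer'].explained_variance_ratio_)
```

Because the scaler and PCA sit inside the pipeline, they are fitted on X_train only and then applied unchanged to X_test, which keeps information from the test set out of the preprocessing steps.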