PCA in a model pipeline
We just saw that legendary Pokemon tend to have higher stats overall. Let's see if we can add a classifier to our pipeline that detects legendary versus non-legendary Pokemon based on the principal components.
The data has been pre-loaded for you and split into training and test datasets: X_train, X_test, y_train, y_test.
Same goes for all relevant packages and classes (Pipeline(), StandardScaler(), PCA(), RandomForestClassifier()).
This exercise is part of the course Dimensionality Reduction in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build the pipeline
pipe = Pipeline([
    ('scaler', ____),
    ('reducer', ____),
    ('classifier', ____)])
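For reference, here is one way the three steps could be filled in and run end to end. This is a minimal sketch, not the exercise's official solution. The Pokemon data is not available here, so a synthetic dataset from make_classification stands in for it (an assumption, including the six feature columns and the roughly 10% positive class). The choices n_components=2 and random_state=0 are also illustrative assumptions, not values stated in the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the pre-loaded data (assumption: six numeric
# features and an imbalanced binary label, loosely mimicking "legendary").
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=2,
    weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Build the pipeline: scale, reduce to principal components, then classify.
# n_components=2 is an illustrative choice, not taken from the exercise.
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('reducer', PCA(n_components=2)),
    ('classifier', RandomForestClassifier(random_state=0))])

# Fit every step on the training data only, then score on held-out data.
pipe.fit(X_train, y_train)
accuracy = pipe.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")

# Share of variance captured by each retained principal component.
print(pipe.named_steps['reducer'].explained_variance_ratio_)
```

Scaling comes before PCA because PCA is sensitive to feature variance: without it, features measured on larger scales would dominate the components. Wrapping all three steps in one Pipeline also means the scaler and reducer are fitted on the training data only, so nothing from the test set leaks into the model.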