Variance of the PCA features
The fish dataset is 6-dimensional, but what is its intrinsic dimension? Make a plot of the variances of the PCA features to find out. As before, samples is a 2D array in which each row represents a fish. You'll need to standardize the features first.
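To see why the standardization step matters, here is a minimal NumPy-only sketch. It assumes that the variances of the PCA features are the eigenvalues of the sample covariance matrix, and it uses made-up columns (length_cm, weight_g, ratio are hypothetical names, not the actual fish measurements). The helper pca_variances is an illustration, not part of the exercise.

```python
import numpy as np

def pca_variances(samples, standardize=True):
    """Return the variances of the PCA features, largest first."""
    X = np.asarray(samples, dtype=float)
    X = X - X.mean(axis=0)              # center each column
    if standardize:
        X = X / X.std(axis=0)           # give each column unit variance
    cov = np.cov(X, rowvar=False)       # covariance between columns
    eigvals = np.linalg.eigvalsh(cov)   # eigenvalues in ascending order
    return eigvals[::-1]

# Hypothetical data: two strongly related columns on very different
# scales, plus one unrelated column with a tiny numeric range.
rng = np.random.default_rng(0)
length_cm = rng.normal(30.0, 5.0, size=200)
weight_g = 20.0 * length_cm + rng.normal(0.0, 10.0, size=200)
ratio = rng.normal(0.4, 0.05, size=200)
demo = np.column_stack([length_cm, weight_g, ratio])

raw = pca_variances(demo, standardize=False)
scaled = pca_variances(demo, standardize=True)

# Share of the total variance carried by each PCA feature
print("without standardizing:", raw / raw.sum())
print("with standardizing:   ", scaled / scaled.sum())
```

Without standardizing, the column with the largest numeric range dominates and the data looks almost 1-dimensional. After standardizing, two PCA features carry nearly all of the variance, which matches how the demo data was built.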
This exercise is part of the course Unsupervised Learning in Python.
Exercise instructions
- Create an instance of StandardScaler called scaler.
- Create a PCA instance called pca.
- Use the make_pipeline() function to create a pipeline chaining scaler and pca.
- Use the .fit() method of pipeline to fit it to the fish samples, samples.
- Extract the number of components used using the .n_components_ attribute of pca. Place this inside a range() function and store the result as features.
- Use the plt.bar() function to plot the explained variances, with features on the x-axis and pca.explained_variance_ on the y-axis.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Perform the necessary imports
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
import matplotlib.pyplot as plt
# Create scaler: scaler
scaler = ____
# Create a PCA instance: pca
pca = ____
# Create pipeline: pipeline
pipeline = ____
# Fit the pipeline to 'samples'
____
# Plot the explained variances
features = ____
plt.bar(____, ____)
plt.xlabel('PCA feature')
plt.ylabel('variance')
plt.xticks(features)
plt.show()
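One way the blanks in the scaffold above might be filled in is sketched below. The real fish measurements are not included here, so samples is replaced by a synthetic stand-in: 85 rows and 6 columns driven by 2 hidden factors. The row count, the mixing matrix, and the unit scales are all assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the fish data (NOT the real measurements):
# 6 columns built from 2 hidden factors plus a little noise.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(85, 2))
mixing = np.array([[1.0, 1.0, 1.0, 0.5, 0.2, 0.0],
                   [0.0, 0.2, 0.5, 1.0, 1.0, 1.0]])
samples = hidden @ mixing + 0.05 * rng.normal(size=(85, 6))
# Put the columns on different scales, as mixed units would
samples = samples * np.array([1.0, 10.0, 100.0, 1.0, 10.0, 100.0])

# Create scaler: scaler
scaler = StandardScaler()

# Create a PCA instance: pca
pca = PCA()

# Create pipeline: pipeline
pipeline = make_pipeline(scaler, pca)

# Fit the pipeline to 'samples'
pipeline.fit(samples)

# Plot the explained variances
features = range(pca.n_components_)
plt.bar(features, pca.explained_variance_)
plt.xlabel('PCA feature')
plt.ylabel('variance')
plt.xticks(features)
```

Call plt.show() afterwards to display the figure in an interactive session. On this stand-in data the first two bars should tower over the rest, which is the pattern you would read as an intrinsic dimension of 2. The real fish data may give a different answer.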