Visualizing PCs with a scree plot
In a machine learning interview, you may be asked what the optimal number of features to keep is. In this exercise, you'll create a scree plot and a cumulative explained variance ratio plot of the principal components using PCA on loan_data.
This will help inform the optimal number of PCs for training a more accurate ML model going forward.
Since PCA is an unsupervised method, it is performed on the X matrix after the target variable Loan Status has been removed from the dataset. Not setting n_components keeps all the principal components in the trained model.
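To make the effect of leaving n_components unset concrete, here is a minimal sketch on synthetic data. The DataFrame demo and its feature names f1 to f4 are hypothetical stand-ins for loan_data; only the Loan Status column name comes from the exercise.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Synthetic stand-in for loan_data (assumption: the real columns differ).
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(50, 4)), columns=["f1", "f2", "f3", "f4"])
demo["Loan Status"] = rng.integers(0, 2, size=50)

# PCA is unsupervised, so the target column is dropped before fitting.
X_demo = demo.drop("Loan Status", axis=1)

# n_components is left at its default (None), so every component is kept.
pca_all = PCA()
pca_all.fit(X_demo)

# Number of components actually retained: min(n_samples, n_features).
n_kept = pca_all.n_components_
total_ratio = pca_all.explained_variance_ratio_.sum()
print(n_kept, round(total_ratio, 6))
```

Because every component is kept, the explained variance ratios account for all of the variance in X_demo, which is what makes a full scree plot possible.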
This exercise is part of the course
Practicing Machine Learning Interview Questions in Python
Interactive exercise
Try this exercise by completing this sample code.
# Remove target variable
X = loan_data.____('____', axis=1)
# Instantiate
pca = ____(n_components=____)
# Fit and transform
principalComponents = pca.____(____)
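For reference, the two plots described in this exercise could be produced along the following lines. This is a sketch on synthetic data, not the course solution: X_synth, the variable names, and the standardization step with StandardScaler are assumptions that do not appear in the exercise.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic, correlated feature matrix standing in for X (assumption).
rng = np.random.default_rng(42)
X_synth = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))

# Standardizing first is an assumption; PCA is sensitive to feature scale.
X_scaled = StandardScaler().fit_transform(X_synth)

# Keep all components, then fit and transform in one step.
pca_sketch = PCA(n_components=None)
scores = pca_sketch.fit_transform(X_scaled)

# Per-component and cumulative explained variance ratios.
var_ratio = pca_sketch.explained_variance_ratio_
cum_ratio = np.cumsum(var_ratio)
pc_index = np.arange(1, len(var_ratio) + 1)

fig, (ax_scree, ax_cum) = plt.subplots(1, 2, figsize=(10, 4))

# Scree plot: variance explained by each individual component.
ax_scree.plot(pc_index, var_ratio, marker="o")
ax_scree.set_xlabel("Principal component")
ax_scree.set_ylabel("Explained variance ratio")
ax_scree.set_title("Scree plot")

# Cumulative plot: variance explained by the first k components.
ax_cum.plot(pc_index, cum_ratio, marker="o")
ax_cum.set_xlabel("Number of components")
ax_cum.set_ylabel("Cumulative explained variance ratio")
ax_cum.set_title("Cumulative explained variance")

fig.tight_layout()
# plt.show()  # uncomment in an interactive session
```

On the scree plot, a common heuristic is to look for the "elbow" where the curve flattens; on the cumulative plot, one can read off how many components are needed to reach a chosen share of the variance.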