Visualizing PCs with a scree plot
In a machine learning interview, you may be asked what the optimal number of features to keep is. In this exercise, you'll create a scree plot and a cumulative explained variance ratio plot of the principal components using PCA on loan_data. These plots will help you choose the number of PCs to use when training a more accurate ML model going forward.
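One common heuristic for reading a cumulative explained variance ratio plot is to keep the smallest number of PCs whose cumulative ratio reaches a chosen threshold. The sketch below illustrates that idea with made-up ratios, not values from loan_data. The function name and the 0.90 threshold are assumptions for illustration only.

```python
import numpy as np

def n_components_for_threshold(explained_variance_ratio, threshold=0.90):
    """Return the smallest number of PCs whose cumulative ratio reaches `threshold`."""
    cumulative = np.cumsum(explained_variance_ratio)
    # searchsorted gives the first index where cumulative >= threshold
    idx = int(np.searchsorted(cumulative, threshold, side="left"))
    # Clamp in case rounding keeps the total just below the threshold
    return min(idx + 1, len(cumulative))

# Hypothetical ratios, in descending order as PCA reports them
ratios = np.array([0.5, 0.25, 0.125, 0.0625, 0.0625])
k = n_components_for_threshold(ratios)
print(k)
```

The threshold is a judgment call. A scree plot "elbow" is another way to pick the cut-off, and the two do not always agree.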
Since PCA is an unsupervised method, it is performed on the X matrix after the target variable, Loan Status, has been removed from the dataset. Leaving n_components unset returns all the principal components from the fitted model.
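As a rough illustration of these two points, the sketch below runs scikit-learn's PCA on a small synthetic DataFrame standing in for loan_data. The feature names f1 to f4, the random data, and the variable names are assumptions. Only the Loan Status column name comes from the exercise.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Synthetic stand-in for loan_data; feature names are hypothetical
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(100, 4)), columns=["f1", "f2", "f3", "f4"])
demo["Loan Status"] = rng.integers(0, 2, size=100)

# PCA is unsupervised, so the target column is dropped first
X_demo = demo.drop("Loan Status", axis=1)

# With n_components left at its default (None), PCA keeps
# min(n_samples, n_features) components, here all 4
pca_demo = PCA()
components = pca_demo.fit_transform(X_demo)

print(components.shape)  # one column per retained PC
print(pca_demo.explained_variance_ratio_.sum())  # all PCs together explain ~100%
```

Because every component is kept, the explained variance ratios sum to approximately 1, which is what makes the cumulative plot end at 100%.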
This exercise is part of the course
Practicing Machine Learning Interview Questions in Python
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Remove target variable
X = loan_data.____('____', axis=1)
# Instantiate
pca = ____(n_components=____)
# Fit and transform
principalComponents = pca.____(____)
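
Once a PCA model has been fitted, the two plots described in this exercise could be drawn along the following lines. This is a minimal sketch on synthetic data, not the course's solution. The plot_scree helper, the random X_demo matrix, and the axis labels are assumptions. It plots explained_variance_ratio_ for the scree plot, although some references plot the raw eigenvalues instead.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_scree(pca):
    """Draw a scree plot and a cumulative explained variance ratio plot side by side."""
    ratios = pca.explained_variance_ratio_
    cumulative = np.cumsum(ratios)
    pcs = np.arange(1, len(ratios) + 1)

    fig, (ax_scree, ax_cum) = plt.subplots(1, 2, figsize=(10, 4))

    # Scree plot: variance explained by each individual PC
    ax_scree.plot(pcs, ratios, marker="o")
    ax_scree.set_xlabel("Principal component")
    ax_scree.set_ylabel("Explained variance ratio")
    ax_scree.set_title("Scree plot")

    # Cumulative plot: variance explained by the first k PCs together
    ax_cum.plot(pcs, cumulative, marker="o")
    ax_cum.set_xlabel("Number of principal components")
    ax_cum.set_ylabel("Cumulative explained variance ratio")
    ax_cum.set_title("Cumulative explained variance")

    fig.tight_layout()
    return fig, cumulative

# Synthetic features with deliberately unequal variances (hypothetical data)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])

pca_demo = PCA().fit(X_demo)
fig, cumulative = plot_scree(pca_demo)
```

Call plt.show() afterwards to display the figure in an interactive session, or fig.savefig(...) to write it to disk.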