
Visualizing PCs with a scree plot

In a machine learning interview, you may be asked how many features to keep. In this exercise, you'll create a scree plot and a cumulative explained variance ratio plot of the principal components (PCs) by running PCA on loan_data. These plots help you choose how many PCs to keep when training an ML model going forward.
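As a rough illustration of the two plots, here is a minimal sketch. It does not use loan_data: X_demo is a made-up random matrix, the 0.95 threshold is an arbitrary illustrative choice, and plotting the explained variance ratio per component is only one common way to draw a scree plot.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Synthetic stand-in for a feature matrix (NOT the real loan_data):
# 200 rows, 6 columns with deliberately different spreads.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])

# Fit PCA and keep every component
pca_demo = PCA(n_components=None).fit(X_demo)

ratios = pca_demo.explained_variance_ratio_  # share of variance per PC
cumulative = np.cumsum(ratios)               # running total across PCs
pc_index = np.arange(1, len(ratios) + 1)

fig, (ax_scree, ax_cum) = plt.subplots(1, 2, figsize=(10, 4))

# Scree plot: look for the "elbow" where the curve flattens out
ax_scree.plot(pc_index, ratios, marker="o")
ax_scree.set_xlabel("Principal component")
ax_scree.set_ylabel("Explained variance ratio")
ax_scree.set_title("Scree plot")

# Cumulative plot: how much variance the first k PCs retain together
ax_cum.plot(pc_index, cumulative, marker="o")
ax_cum.axhline(0.95, linestyle="--", color="grey")  # illustrative threshold
ax_cum.set_xlabel("Number of principal components")
ax_cum.set_ylabel("Cumulative explained variance ratio")
ax_cum.set_title("Cumulative explained variance")

fig.tight_layout()

# Smallest number of PCs whose cumulative ratio reaches the threshold
n_keep = int(np.argmax(cumulative >= 0.95) + 1)
print(n_keep)
```

Call plt.show() to display the figure in a script. The elbow in the left panel and the threshold crossing in the right panel are two heuristics that do not always agree, so the final choice is a judgment call.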

Because PCA is an unsupervised method, it is performed on the feature matrix X, that is, the dataset with the target variable Loan Status removed. Leaving n_components unset (its default is None) returns all the principal components from the fitted model.
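To make the effect of leaving n_components unset concrete, here is a small sketch on a toy DataFrame. The frame toy_loans and its feature columns f1 to f4 are hypothetical and only mimic the shape of the problem; only the target column name comes from the exercise.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical stand-in for loan_data: four made-up numeric features
# plus a binary target column.
rng = np.random.default_rng(42)
toy_loans = pd.DataFrame(
    rng.normal(size=(100, 4)), columns=["f1", "f2", "f3", "f4"]
)
toy_loans["Loan Status"] = rng.integers(0, 2, size=100)

# Drop the target so that PCA only sees the features
X_toy = toy_loans.drop("Loan Status", axis=1)

# n_components=None (the default) keeps all components,
# i.e. min(n_samples, n_features) of them
pca_toy = PCA(n_components=None)
components = pca_toy.fit_transform(X_toy)

print(components.shape)       # (100, 4)
print(pca_toy.n_components_)  # 4
```

With 100 rows and 4 feature columns, all 4 components are returned, so the transformed matrix has the same shape as X_toy.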

This exercise is part of the course

Practicing Machine Learning Interview Questions in Python

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Remove target variable
X = loan_data.____('____', axis=1)

# Instantiate
pca = ____(n_components=____)

# Fit and transform
principalComponents = pca.____(____)