Visualizing separation of classes with PCA I
A common task in a machine learning interview is visualizing data after reducing its dimensionality with PCA. In this exercise, you will do just that by plotting the first 2 principal components of loan_data in order to visualize how well they separate the two classes of loan status: fully paid or charged off.
The loan_data dataset has been scaled and one-hot encoded, meaning categorical variables were turned into binary indicators, since features should be numeric and on the same scale prior to PCA.
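As a rough illustration of the preprocessing described above, the sketch below one-hot encodes a categorical column, standardizes all columns, and fits a 2-component PCA. The toy DataFrame and its column names (amount, income, term) are hypothetical; the exercise does not show the actual columns of loan_data or how it was prepared.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the real loan_data columns are not shown in the exercise.
raw = pd.DataFrame({
    "amount": [1000.0, 2500.0, 400.0, 8000.0, 1500.0, 6000.0],
    "income": [30000.0, 52000.0, 18000.0, 91000.0, 40000.0, 75000.0],
    "term": ["short", "long", "short", "long", "short", "long"],
})

# One-hot encode the categorical column into binary indicator columns.
encoded = pd.get_dummies(raw, columns=["term"], dtype=float)

# Standardize every column to zero mean and unit variance so that no
# feature dominates the principal components because of its units.
scaled = StandardScaler().fit_transform(encoded)

# Project the scaled data onto the first two principal components.
pca = PCA(n_components=2)
pcs = pd.DataFrame(pca.fit_transform(scaled), columns=["PC1", "PC2"])
```

Here pcs holds one row per observation with its PC1 and PC2 coordinates, which is the kind of table a scatter plot of the components would be drawn from.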
A PCA model with 2 PCs has already been created for you, and a plot with x and y labels and a title has been set up. You'll use a DataFrame called loan_data_PCA in the exercises. The possible values for the target variable Loan Status are 0 and 1. You'll be plotting PC1 on the x-axis and PC2 on the y-axis.
Already imported for you are matplotlib.pyplot as plt, seaborn as sns, and PCA from sklearn.decomposition.
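To make the idea of plotting each class separately concrete, here is a minimal sketch on synthetic data. The DataFrame demo, its PC1 and PC2 column names, and the random values are assumptions made for illustration only; they are not the exercise's loan_data_PCA, and only the Loan Status name and its values 0 and 1 come from the description above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for a table of principal components plus a class label.
rng = np.random.default_rng(0)
n = 50
demo = pd.DataFrame({
    "PC1": np.concatenate([rng.normal(-1, 1, n), rng.normal(1, 1, n)]),
    "PC2": rng.normal(0, 1, 2 * n),
    "Loan Status": [0] * n + [1] * n,
})

fig, ax = plt.subplots()
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.set_title("First 2 principal components by class")

# Draw one scatter layer per class so each class gets its own color.
for target, color in zip([0, 1], ["r", "b"]):
    mask = demo["Loan Status"] == target  # boolean row selector for this class
    ax.scatter(demo.loc[mask, "PC1"], demo.loc[mask, "PC2"],
               color=color, s=20, label=str(target))

ax.legend(title="Loan Status")
```

Calling plt.show() afterwards displays the figure. If the two colors form visibly distinct clusters, the first two components carry information about the class; heavy overlap suggests they do not separate the classes well.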
This exercise is part of the course Practicing Machine Learning Interview Questions in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
targets = [____, ____]
colors = ['r', 'b']
# For loop to create plot
for target, color in zip(____, ____):
    indicesToKeep = ____['____'] == ____