Exercise

Visualizing separation of classes with PCA I

A common task in a machine learning interview is visualizing data after reducing its dimensionality with PCA. In this exercise, you will do just that by plotting the first 2 principal components of loan_data to see how well they separate the two classes of the loan status: fully paid or charged off.

The loan_data dataset has been scaled and one-hot encoded (categorical variables were turned into binary indicators), since features should be numeric and on the same scale prior to PCA.
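As a rough illustration of the preprocessing described above, the following sketch one-hot encodes and then standardizes a tiny made-up table. The column names ("Annual Income", "Term") and all values are hypothetical stand-ins, not the actual loan_data columns, and StandardScaler is only one possible choice of scaler.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the raw features (not the real loan_data).
features = pd.DataFrame({
    "Annual Income": [40000.0, 85000.0, 62000.0, 120000.0, 51000.0, 73000.0],
    "Term": ["Short", "Long", "Short", "Long", "Short", "Long"],
})

# One-hot encode: each category of "Term" becomes its own binary indicator column.
encoded = pd.get_dummies(features, dtype=float)

# Scale: every column ends up with zero mean and unit variance,
# so no single feature dominates the principal components.
scaled = StandardScaler().fit_transform(encoded)

print(list(encoded.columns))
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```

Without scaling, a column measured in large units (such as income) would account for most of the variance and would therefore dominate the first principal component.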

A PCA model with 2 PCs and a plot with x and y labels and a title have already been set up for you. You'll use a DataFrame called loan_data_PCA in the exercises. The possible values for the target variable Loan Status are 0 and 1. You'll be plotting PC1 on the x-axis and PC2 on the y-axis.
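For orientation, here is a minimal sketch of what that pre-built setup might look like. The exercise's actual setup code is not shown, so everything here is an assumption: random data stands in for the preprocessed features, and the column names "PC1" and "PC2", the figure size, and the title are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Random stand-ins for the preprocessed features and the 0/1 target (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)

# Fit PCA and keep only the first two principal components.
pca = PCA(n_components=2)
principal_components = pca.fit_transform(X)

# Collect the components and the target in one DataFrame (assumed column names).
loan_data_PCA = pd.DataFrame(principal_components, columns=["PC1", "PC2"])
loan_data_PCA["Loan Status"] = y

# Prepare an empty plot with axis labels and a title.
fig, ax = plt.subplots(figsize=(8, 8))
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.set_title("2-component PCA")
```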

Already imported for you are matplotlib.pyplot as plt, seaborn as sns, and PCA from sklearn.decomposition.

Instructions 1/3

  • Assign the target variable values to the list targets.
  • Pass the lists just created to the zip() function inside the for loop.
  • Pass the instances where Loan Status is equal to target to indicesToKeep.
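The three steps above could be sketched as follows. This is not the exercise's reference solution: the small DataFrame, the colors list, the marker size, and the column names "PC1" and "PC2" are assumptions made for illustration. Only targets, zip(), indicesToKeep, and the Loan Status values 0 and 1 come from the instructions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Small made-up stand-in for loan_data_PCA (assumed column names and values).
loan_data_PCA = pd.DataFrame({
    "PC1": [-1.2, 0.4, 1.5, -0.3, 0.9, -0.8],
    "PC2": [0.5, -0.7, 0.2, 1.1, -0.4, -0.6],
    "Loan Status": [0, 1, 1, 0, 1, 0],
})

fig, ax = plt.subplots()
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")

# Step 1: the target values, plus one color per target (colors are an assumption).
targets = [0, 1]
colors = ["r", "b"]

# Step 2: zip() pairs each target value with its color.
for target, color in zip(targets, colors):
    # Step 3: boolean mask selecting the rows whose Loan Status equals this target.
    indicesToKeep = loan_data_PCA["Loan Status"] == target
    ax.scatter(
        loan_data_PCA.loc[indicesToKeep, "PC1"],
        loan_data_PCA.loc[indicesToKeep, "PC2"],
        c=color,
        s=50,
        label=f"Loan Status = {target}",
    )

ax.legend()
```

Drawing one scatter call per class, rather than a single call for all rows, gives each class its own color and legend entry, which makes any separation along PC1 or PC2 easier to see.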