Separating house prices with PCA
PCA and t-SNE are both feature extraction techniques, but PCA can only capture the linear structure of the data. In this exercise, you will create a PCA plot of the full house_sales_df so you can compare its result with the t-SNE output.
Remember that price is the target variable in house_sales_df. It is important to remove it before fitting PCA to the data.
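As a rough illustration of why the target is dropped, the sketch below fits PCA twice on a small made-up data frame. The name toy_df, its columns, and the numbers used to simulate them are hypothetical stand-ins, not the real house_sales_df. If the target is left in, it receives loadings like any other column, so the components are partly built from the value you later want to color by or predict.

```r
# Hypothetical toy data standing in for house_sales_df (assumption: numeric columns only)
set.seed(1)
n <- 100
toy_df <- data.frame(
  sqft_living = rnorm(n, mean = 2000, sd = 400),
  bedrooms    = sample(1:5, n, replace = TRUE)
)
# Simulated target that depends on the predictors plus noise
toy_df$price <- 150 * toy_df$sqft_living + 10000 * toy_df$bedrooms + rnorm(n, sd = 20000)

# PCA with the target left in: price contributes to every component
pca_with_target <- prcomp(toy_df, scale. = TRUE)

# PCA on the predictors only: price is excluded before fitting
predictor_cols <- setdiff(names(toy_df), "price")
pca_predictors <- prcomp(toy_df[, predictor_cols], scale. = TRUE)

# Compare which variables carry loadings in each fit
print(rownames(pca_with_target$rotation))
print(rownames(pca_predictors$rotation))
```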
The tidyverse and ggfortify packages have been loaded for you.
This exercise is part of the course Dimensionality Reduction in R.
Exercise instructions
- Fit a PCA to the predictors of house_sales_df.
- Use autoplot() to plot the first two PCs and encode price in color.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Fit PCA to only the predictors
pca <- ___(___ %>% select(-___))
# Plot PCA and color code the target variable
___(___, data = ___, colour = "___", alpha = 0.7) +
  scale_color_gradient(low = "gray", high = "blue")
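
One possible way to fill in a template like this is sketched below. It is only an illustration: it assumes prcomp() from base R as the PCA function, and it runs on a made-up data frame (toy_sales_df, with hypothetical columns) because the real house_sales_df is not available here.

```r
library(dplyr)      # select() and the pipe
library(ggplot2)    # scale_color_gradient()
library(ggfortify)  # autoplot() method for prcomp objects

# Hypothetical stand-in for house_sales_df (column names and values are made up)
set.seed(42)
n <- 200
toy_sales_df <- data.frame(
  sqft_living = rnorm(n, mean = 2000, sd = 500),
  bedrooms    = sample(1:6, n, replace = TRUE),
  bathrooms   = sample(1:4, n, replace = TRUE)
)
toy_sales_df$price <- 150 * toy_sales_df$sqft_living +
  10000 * toy_sales_df$bedrooms + rnorm(n, sd = 20000)

# Fit PCA to only the predictors (the target column is dropped first)
pca <- prcomp(toy_sales_df %>% select(-price))

# Plot the first two PCs; the full data frame is passed so price can be mapped to color
p <- autoplot(pca, data = toy_sales_df, colour = "price", alpha = 0.7) +
  scale_color_gradient(low = "gray", high = "blue")
print(p)
```

When the predictors are on very different scales, passing scale. = TRUE to prcomp() standardizes them before fitting. The template does not show that argument, so it is left out of this sketch.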