Interpreting PCA results
Now you'll use some visualizations to better understand your PCA model. You were introduced to one of these visualizations, the biplot, in an earlier chapter.
You'll run into some common challenges with using biplots on real-world data containing a non-trivial number of observations and variables, then you'll look at some alternative visualizations. You are encouraged to experiment with additional visualizations before moving on to the next exercise.
Diese Übung ist Teil des Kurses
Unsupervised Learning in R
Anleitung zur Übung
The variables you created before, wisc.data
, diagnosis
, and wisc.pr
, are still available.
- Create a biplot of the
wisc.pr
data. What stands out to you about this plot? Is it easy or difficult to understand? Why? - Execute the code to scatter plot each observation by principal components 1 and 2, coloring the points by the diagnosis.
- Repeat the same for principal components 1 and 3. What do you notice about these plots?
Interaktive Übung
Versuche dich an dieser Übung, indem du diesen Beispielcode vervollständigst.
# Create a biplot of wisc.pr
# Scatter plot observations by components 1 and 2
plot(wisc.pr$___[, c(1, 2)], col = (diagnosis + 1),
xlab = "PC1", ylab = "PC2")
# Repeat for components 1 and 3
plot(___, col = (diagnosis + 1),
xlab = "PC1", ylab = "PC3")
# Do additional data exploration of your choosing below (optional)