Performing PCA
The next step in your analysis is to perform PCA on wisc.data.
You saw in the last chapter that it's important to check if the data need to be scaled before performing PCA. Recall two common reasons for scaling data:
- The input variables use different units of measurement.
- The input variables have significantly different variances.
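To see why the second reason matters, here is a minimal sketch that fits PCA with and without scaling. It uses USArrests, a dataset that ships with base R, as a stand-in and not the course data; the object names pr.unscaled, pr.scaled, pve1.unscaled and pve1.scaled are chosen here for illustration.

```r
# USArrests is a stand-in dataset bundled with R, not the course data.
# Its columns have very different variances.
sapply(USArrests, var)

# Fit PCA twice: once on the raw columns, once with each column
# scaled to unit variance.
pr.unscaled <- prcomp(USArrests, scale. = FALSE)
pr.scaled <- prcomp(USArrests, scale. = TRUE)

# Share of total variance captured by the first component in each fit
pve1.unscaled <- pr.unscaled$sdev[1]^2 / sum(pr.unscaled$sdev^2)
pve1.scaled <- pr.scaled$sdev[1]^2 / sum(pr.scaled$sdev^2)
c(unscaled = pve1.unscaled, scaled = pve1.scaled)

# Variable with the largest absolute loading on the unscaled first component
names(which.max(abs(pr.unscaled$rotation[, 1])))
```

Without scaling, the first component mostly tracks the single highest-variance column, so it says more about measurement scale than about structure shared across variables.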
This exercise is part of the course Unsupervised Learning in R.
Exercise instructions
The variables you created before, wisc.data and diagnosis, are still available in your workspace.
- Check the mean and standard deviation of the features of the data to determine if the data should be scaled. Use the colMeans() and apply() functions like you've done before.
- Execute PCA on wisc.data, scaling if appropriate, and assign the model to wisc.pr.
- Inspect a summary of the results with the summary() function.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Check column means and standard deviations
# Execute PCA, scaling if appropriate: wisc.pr
# Look at summary of results
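One possible way to fill in the three steps is sketched below. It is not an official solution. So that the block runs on its own, it builds a small simulated stand-in when wisc.data is not already in the workspace; the stand-in's column names and distributions are invented for illustration and do not come from the course data. Whether scaling is appropriate depends on what the means and standard deviations show; the sketch assumes they differ enough to justify it.

```r
# Simulated stand-in, used only if the real wisc.data is not in the workspace.
# Column names and distributions are made up for illustration.
if (!exists("wisc.data")) {
  set.seed(1)
  wisc.data <- cbind(
    feature_small = rnorm(100, mean = 0.1, sd = 0.015),
    feature_medium = rnorm(100, mean = 14, sd = 3.5),
    feature_large = rnorm(100, mean = 650, sd = 350)
  )
}

# Check column means and standard deviations
colMeans(wisc.data)
apply(wisc.data, 2, sd)

# Execute PCA, scaling if appropriate: wisc.pr
# (assumes the check above showed very different spreads across columns)
wisc.pr <- prcomp(wisc.data, scale. = TRUE)

# Look at summary of results
summary(wisc.pr)
```

The summary reports, for each principal component, its standard deviation, the proportion of variance it explains, and the cumulative proportion.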