
Practical issues: scaling

You saw in the video that scaling your data before doing PCA changes the resulting PCA model. Here, you will perform PCA with and without scaling, then visualize the results using biplots.

Scaling is appropriate when the variances of the variables are substantially different. This is commonly the case when variables have different units of measurement, for example, degrees Fahrenheit (temperature) and miles (distance). Without scaling, the variables with the largest variances dominate the leading principal components, so deciding whether to scale is an important step in performing a principal component analysis.
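As a rough illustration of the point above, here is a minimal sketch on made-up data. The variables `temp_f` and `dist_mi`, their means and standard deviations, and the data frame `toy` are all assumptions invented for this example; they are not part of the course data.

```r
set.seed(42)

# Hypothetical measurements in very different units (values are made up)
temp_f  <- rnorm(200, mean = 70, sd = 5)       # degrees Fahrenheit
dist_mi <- rnorm(200, mean = 3000, sd = 800)   # miles
toy <- data.frame(temp_f, dist_mi)

# The variances differ by several orders of magnitude
apply(toy, 2, var)

# PCA without and with scaling to unit variance
pr_raw    <- prcomp(toy, center = TRUE, scale. = FALSE)
pr_scaled <- prcomp(toy, center = TRUE, scale. = TRUE)

# Loadings of the first principal component:
# unscaled, PC1 is almost entirely dist_mi; scaled, both variables
# contribute with equal magnitude
pr_raw$rotation[, 1]
pr_scaled$rotation[, 1]

# Proportion of variance explained by each component
summary(pr_raw)$importance["Proportion of Variance", ]
summary(pr_scaled)$importance["Proportion of Variance", ]
```

In the unscaled fit, the first component mostly reproduces the distance variable simply because its numbers are larger, not because it carries more structure.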

This exercise is part of the course

Unsupervised Learning in R


Exercise instructions

The same Pokemon dataset is available in your workspace as pokemon, but one new variable has been added: Total.

  • There is some code at the top of the editor to calculate the mean and standard deviation of each variable in the model. Run this code to see how the scale of the variables differs in the original data.
  • Create a PCA model of pokemon with scaling, assigning the result to pr.with.scaling.
  • Create a PCA model of pokemon without scaling, assigning the result to pr.without.scaling.
  • Use biplot() to plot both models (one at a time) and compare their outputs.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Mean of each variable
colMeans(pokemon)

# Standard deviation of each variable
apply(pokemon, 2, sd)

# PCA model with scaling: pr.with.scaling


# PCA model without scaling: pr.without.scaling


# Create biplots of both for comparison
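The steps in this exercise can be sketched on a stand-in dataset. This is only an illustration under stated assumptions: the built-in `USArrests` data frame replaces `pokemon` (which exists only in the course workspace), and the name `stand_in` is hypothetical. The calls use `prcomp()` from base R, whose `scale.` argument controls whether variables are rescaled to unit variance.

```r
# Stand-in for the course's `pokemon` data (assumption: any all-numeric
# data frame behaves the same way); USArrests ships with base R
stand_in <- USArrests

# Mean and standard deviation of each variable
colMeans(stand_in)
apply(stand_in, 2, sd)

# PCA model with scaling
pr.with.scaling <- prcomp(stand_in, center = TRUE, scale. = TRUE)

# PCA model without scaling
pr.without.scaling <- prcomp(stand_in, center = TRUE, scale. = FALSE)

# Biplots of both models, one at a time, for comparison
biplot(pr.with.scaling)
biplot(pr.without.scaling)
```

On this stand-in data, the unscaled biplot is dominated by the variable with the largest variance, while the scaled biplot spreads the loadings across variables. A similar contrast might be expected with `pokemon`, given the added `Total` variable, though that depends on the actual data.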
