A biplot of PCA

A biplot visualizes the connections between two representations of the same data. First, a scatter plot is drawn in which the observations are placed according to their scores on two principal components (PCs). Then, arrows are drawn to show how the original variables relate to those PCs. The following connections hold:

  • The angle between two arrows can be interpreted as the correlation between the corresponding variables: a small angle indicates a high positive correlation.
  • The angle between an arrow and a PC axis can be interpreted as the correlation between that variable and the PC.
  • The lengths of the arrows are proportional to the standard deviations of the variables.
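
The three connections above can be checked numerically. The following is a minimal sketch, not part of the exercise: it assumes a PCA fitted with prcomp() on the built-in USArrests data (standardized variables), and the names arrows_full, arrow_len and cos_angle are made up for illustration. In the full PC space the relations hold exactly; a two-dimensional biplot shows only the first two coordinates of each arrow (up to a constant scaling factor), so there they hold approximately.

```r
# USArrests is a built-in R dataset, used here only for illustration
pca <- prcomp(USArrests, scale. = TRUE)

# Arrow coordinates in the full PC space:
# loadings scaled by the standard deviation of each PC
arrows_full <- pca$rotation %*% diag(pca$sdev)

# Arrow lengths equal the standard deviations of the variables
# (all 1 here, because the variables were standardized)
arrow_len <- sqrt(rowSums(arrows_full^2))

# Cosine of the angle between two arrows equals their correlation
cos_angle <- (arrows_full %*% t(arrows_full)) / (arrow_len %o% arrow_len)
round(cos_angle, 3)
round(cor(USArrests), 3)

# Cosine of the angle between an arrow and a PC axis equals
# the correlation between that variable and the PC scores
round(arrows_full / arrow_len, 3)
round(cor(USArrests, pca$x), 3)

# A 2-D biplot keeps only the first two coordinates of each arrow,
# so the same computation gives an approximation of the correlations
arrows_2d <- arrows_full[, 1:2]
len_2d <- sqrt(rowSums(arrows_2d^2))
round((arrows_2d %*% t(arrows_2d)) / (len_2d %o% len_2d), 3)
```

The more variance the first two PCs capture, the closer the two-dimensional cosines are to the true correlations.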

This exercise is part of the course

Helsinki Open Data Science


Exercise instructions

  • Create and print out a summary of pca_human (created in the previous exercise)
  • Create object pca_pr and print it out
  • Adjust the code: instead of proportions of variance, save the percentages of variance in the pca_pr object. Round the percentages to 1 digit.
  • Execute the paste0() function. Then create a new object pc_lab by assigning the output to it.
  • Draw the biplot again. Use the first value of the pc_lab vector as the label for the x-axis and the second value as the label for the y-axis.
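
As background for these steps, it may help to see where the proportions of variance are stored. The sketch below assumes that pca_human was created with prcomp(), which the exercise code suggests but this page does not state. It uses the built-in USArrests data as a stand-in, and the names pca_demo, imp and prop_var are illustrative only.

```r
# Stand-in PCA on a built-in dataset (pca_human itself is not recreated here)
pca_demo <- prcomp(USArrests, scale. = TRUE)

# The summary of a prcomp object carries an 'importance' matrix
imp <- summary(pca_demo)$importance

# Three rows: standard deviation, proportion of variance, cumulative proportion
rownames(imp)

# Row 2 holds the proportion of variance, one named value per PC
prop_var <- imp[2, ]
names(prop_var)

# The proportions add up to 1, apart from rounding in the summary
sum(prop_var)
```

Because the values in row 2 are proportions rather than percentages, they have to be rescaled before they can be printed with a "%" sign.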

Hands-on interactive exercise

Try this exercise by completing the sample code.

# pca_human, dplyr are available

# create and print out a summary of pca_human
s <- summary(pca_human)


# rounded percentages of variance captured by each PC
pca_pr <- round(1*s$importance[2, ], digits = 5)

# print out the percentages of variance


# create object pc_lab to be used as axis labels
paste0(names(pca_pr), " (", pca_pr, "%)")

# draw a biplot
biplot(pca_human, cex = c(0.8, 1), col = c("grey40", "deeppink2"), xlab = NA, ylab = NA)


