Exploring the countries
Now that we have sufficiently wrangled the 'human' data for further analysis, let's explore the variables and their relationships more closely.
A simple pairs plot or a more informative generalized pairs plot from the GGally package is a good way of visualizing a reasonably sized data frame.
To study linear relationships, correlations can also be computed with the cor() function and then visualized with the corrplot() function from the corrplot package.
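As a rough illustration of the idea above, the following sketch draws a simple pairs plot and computes a correlation matrix using base R only. It assumes the built-in mtcars data set as a stand-in, since the 'human' data is not reproduced here; the column choice is arbitrary.

```r
# Stand-in data (an assumption for illustration, not the 'human' data):
# four numeric columns from the built-in mtcars data set
df <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Simple pairs plot: one scatter plot for every pair of variables
pairs(df)

# Pearson correlation matrix, rounded for readability
cor_matrix <- round(cor(df), 2)
print(cor_matrix)
```

Values close to 1 or -1 point to a strong linear relationship between two variables, while values near 0 suggest little linear association.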
This exercise is part of the course Helsinki Open Data Science.
Exercise instructions
- Create the data frame human_ by removing the Country variable from human (the countries are still the row names).
- Access the GGally package and visualize all the human_ variables with ggpairs().
- Compute and print out the correlation matrix of human_.
- Adjust the code: use the pipe operator (%>%) and visualize the correlation matrix with corrplot().
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# the modified human data, dplyr and the corrplot package are available
# remove the Country variable
human_ <- select(human, -Country)
# Access GGally
library(GGally)
# visualize the 'human_' variables
# compute the correlation matrix and visualize it with corrplot
cor(human_)
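One way the steps in this exercise might fit together is sketched below. Because the 'human' data is not available outside the exercise environment, the sketch builds a hypothetical stand-in called toy from the built-in mtcars data set, with a Country column added purely for illustration. It assumes the dplyr, GGally and corrplot packages are installed.

```r
library(dplyr)     # select() and the pipe operator %>%
library(GGally)    # ggpairs()
library(corrplot)  # corrplot()

# Hypothetical stand-in for 'human': numeric variables plus a label column
toy <- mtcars[, c("mpg", "hp", "wt")]
toy$Country <- rownames(toy)  # plays the role of the Country variable

# Remove the label column before plotting and computing correlations
toy_ <- select(toy, -Country)

# Generalized pairs plot of all remaining variables
p <- ggpairs(toy_)
print(p)

# Compute the correlation matrix, print it, then pipe it into corrplot()
cor_matrix <- cor(toy_)
print(cor_matrix)
cor_matrix %>% corrplot()
```

Piping the matrix into corrplot() is equivalent to calling corrplot(cor_matrix); the pipe simply keeps the computation and the visualization in one readable chain.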