Transformations
Highly skewed distributions can make it very difficult to learn anything from a visualization. Transformations can be helpful in revealing the more subtle structure.
Here you'll focus on the population variable, which exhibits strong right skew, and transform it with the natural logarithm function (log() in R).
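As a rough numerical illustration of why the log helps, the sketch below uses a small made-up vector of right-skewed values (hypothetical numbers, not taken from gap2007). On the raw scale the mean sits far above the median because of the largest value; after log() the two are close, which is one sign that the skew has been reduced.

```r
# Hypothetical right-skewed values (NOT from gap2007): many small, one huge
x <- c(1e5, 5e5, 1e6, 2e6, 5e6, 1e7, 5e7, 1e9)

# Raw scale: the largest value drags the mean far above the median
raw_ratio <- mean(x) / median(x)

# Natural log scale: values that differ by orders of magnitude
# are pulled much closer together
log_x <- log(x)
log_ratio <- mean(log_x) / median(log_x)

print(raw_ratio)  # much larger than 1
print(log_ratio)  # close to 1
```

The transformation is monotonic and reversible (exp(log_x) recovers x), so the ordering of the observations is preserved; only the spacing between them changes.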
This exercise is part of the course
Exploratory Data Analysis in R
Exercise instructions
Using the gap2007 data:
- Create a density plot of the population variable.
- Mutate a new column called log_pop that is the natural log of the population and save it back into gap2007.
- Create a density plot of your transformed variable.
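The same plot, mutate, plot pattern can be sketched on a toy data frame. Everything named here (toy, size, log_size, p_raw, p_log) is hypothetical and chosen for illustration, so this is not the exercise solution; it assumes the dplyr and ggplot2 packages are installed.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data (NOT gap2007): one strongly right-skewed column
toy <- data.frame(size = c(1e5, 5e5, 1e6, 2e6, 5e6, 1e7, 5e7, 1e9))

# Density plot on the raw scale
p_raw <- toy %>%
  ggplot(aes(x = size)) +
  geom_density()

# Add the natural log as a new column and save the result back into toy
toy <- toy %>%
  mutate(log_size = log(size))

# Density plot on the log scale
p_log <- toy %>%
  ggplot(aes(x = log_size)) +
  geom_density()
```

Calling print(p_raw) and print(p_log) draws the two plots for comparison. Storing the log in a new column, rather than overwriting the original, keeps the raw values available for later steps.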
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create density plot of old variable
gap2007 %>%
  ggplot(aes(x = ___)) +
  ___

# Transform the skewed pop variable
gap2007 <- gap2007 %>%
  mutate(___)

# Create density plot of new variable
gap2007 %>%
  ggplot(aes(x = ___)) +
  ___