Mutations
Mutating a data frame means adding new variables that are computed from the existing ones. The mutate() function comes from the 'dplyr' package, which is part of the 'tidyverse'. The tidyverse is a collection of packages designed to work well together, including 'dplyr' and 'ggplot2'.
The tidyverse functions have many similarities. For example, the first argument of a tidyverse function is usually data. They also share other consistent conventions, which make them easy to use and to combine.
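As a minimal sketch of the data-first convention described above, the following example adds computed columns to a small data frame with mutate(). The data frame scores and its columns math, art, mean_score and above_ten are hypothetical names invented for illustration; they are not part of the course data.

```r
library(dplyr)

# Hypothetical toy data; the values are made up for illustration
scores <- data.frame(math = c(10, 14, 8), art = c(12, 16, 6))

# The data frame is the first argument; new columns follow as name = expression
scores <- mutate(scores, mean_score = (math + art) / 2)

# Because the data comes first, the same kind of call can be written with the pipe
scores <- scores %>% mutate(above_ten = mean_score > 10)

print(scores)
```

The original columns are kept, and each mutate() call returns the data frame with the new column appended.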
This exercise is part of the course Helsinki Open Data Science.
Exercise instructions
- Mutate alc by creating a new column alc_use that averages weekday and weekend alcohol consumption.
- Draw a bar plot of alc_use.
- Define a new aesthetic element for the bar plot of alc_use by defining fill = sex. Draw the plot again.
- Adjust the code: Mutate alc by creating a new column high_use, which is true if alc_use is greater than 2 and false otherwise.
- Initialize a ggplot object with high_use on the x-axis and then draw a bar plot.
- Add this element to the latter plot (using +): facet_wrap("sex").
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# alc is available
# access the 'tidyverse' packages dplyr and ggplot2
library(dplyr); library(ggplot2)
# define a new column alc_use by combining weekday and weekend alcohol use
alc <- mutate(alc, alc_use = (Dalc + Walc) / 2)
# initialize a plot of alcohol use
g1 <- ggplot(data = alc, aes(x = alc_use))
# define the plot as a bar plot and draw it
g1 + geom_bar()
# define a new logical column 'high_use'
alc <- mutate(alc, high_use = "change me!" > 2)
# initialize a plot of 'high_use'
g2 <- ggplot(data = alc)
# draw a bar plot of high_use by sex
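
The steps in the instructions could be completed roughly as follows. This is one possible sketch, not the official solution. Because alc is only available inside the exercise environment, the example uses a hypothetical stand-in called alc_toy with made-up values; only the column names sex, Dalc and Walc are taken from the code above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for 'alc'; the values are made up for illustration
alc_toy <- data.frame(
  sex  = c("F", "M", "F", "M"),
  Dalc = c(1, 3, 1, 2),
  Walc = c(1, 4, 2, 5)
)

# average weekday and weekend use, then flag values greater than 2
alc_toy <- mutate(alc_toy, alc_use = (Dalc + Walc) / 2)
alc_toy <- mutate(alc_toy, high_use = alc_use > 2)

# bar plot of alc_use, with bars filled by sex
g1 <- ggplot(data = alc_toy, aes(x = alc_use, fill = sex)) + geom_bar()

# bar plot of high_use, with one panel per sex
g2 <- ggplot(data = alc_toy, aes(x = high_use)) + geom_bar() + facet_wrap("sex")

# print() draws the plot when the code is run as a script
print(g2)
```

In the exercise itself, the same calls would be applied to alc instead of alc_toy.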