
Mutations

Mutating a data frame means adding new variables that are computed from the existing ones. The mutate() function comes from the 'dplyr' package, which is part of the 'tidyverse'. The tidyverse is a collection of packages that work well together, such as 'dplyr' and 'ggplot2'.
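As a minimal sketch of what mutate() does, the example below applies it to a small made-up data frame. The data frame `toy` and its values are hypothetical and only stand in for 'alc'; the column names Dalc and Walc follow the exercise.

```r
library(dplyr)

# hypothetical toy data standing in for 'alc'
toy <- data.frame(Dalc = c(1, 2, 5), Walc = c(1, 4, 5))

# mutate() returns a new data frame with the extra column appended;
# 'toy' itself is unchanged until the result is assigned back to it
toy2 <- mutate(toy, alc_use = (Dalc + Walc) / 2)

names(toy2)     # "Dalc" "Walc" "alc_use"
toy2$alc_use    # 1 3 5
```

Because mutate() does not modify its input, the exercise code assigns the result back to the same name (alc <- mutate(alc, ...)).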

Tidyverse functions share many conventions. For example, their first argument is usually the data. This consistency makes the functions work well together and easy to use.
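One consequence of the data-first convention is that calls can be chained with the pipe operator %>%, which dplyr makes available. The sketch below uses a made-up data frame `scores` (hypothetical, not part of the exercise) to show the same computation written as nested calls and as a pipeline.

```r
library(dplyr)

# hypothetical example data
scores <- data.frame(id = 1:4, points = c(10, 25, 30, 5))

# nested calls: each function receives the data as its first argument
nested <- summarise(filter(scores, points > 8), mean_points = mean(points))

# piped calls: %>% passes the left-hand result in as the first argument
piped <- scores %>%
  filter(points > 8) %>%
  summarise(mean_points = mean(points))

# both approaches give the same one-row summary
all.equal(nested, piped)
```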

This exercise is part of the course

Helsinki Open Data Science

Exercise instructions

  • Mutate alc by creating the new column alc_use by averaging weekday and weekend alcohol consumption.
  • Draw a bar plot of alc_use.
  • Add a new aesthetic to the bar plot of alc_use by defining fill = sex. Draw the plot again.
  • Adjust the code: Mutate alc by creating a new column high_use, which is TRUE if alc_use is greater than 2 and FALSE otherwise.
  • Initialize a ggplot object with high_use on the x-axis and then draw a bar plot.
  • Add this element to the latter plot (using +): facet_wrap("sex").
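The steps above could be sketched as follows. Since 'alc' is only available inside the exercise environment, this sketch builds a small stand-in data frame `alc_toy` with made-up values (an assumption for illustration); the column names sex, Dalc and Walc follow the exercise.

```r
library(dplyr); library(ggplot2)

# hypothetical stand-in for 'alc' with only the columns used here
alc_toy <- data.frame(
  sex  = c("F", "F", "M", "M", "M"),
  Dalc = c(1, 1, 2, 4, 1),
  Walc = c(1, 3, 4, 5, 2)
)

# average weekday and weekend consumption
alc_toy <- mutate(alc_toy, alc_use = (Dalc + Walc) / 2)

# logical column: TRUE when average use is greater than 2
alc_toy <- mutate(alc_toy, high_use = alc_use > 2)

# bar plot of alc_use, with bars colored by sex
g1 <- ggplot(data = alc_toy, aes(x = alc_use, fill = sex))
g1 + geom_bar()

# bar plot of high_use, one panel per sex
g2 <- ggplot(data = alc_toy, aes(x = high_use))
g2 + geom_bar() + facet_wrap("sex")
```

The comparison alc_use > 2 already yields a logical vector, so no ifelse() call is needed to create high_use.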

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# alc is available

# access the 'tidyverse' packages dplyr and ggplot2
library(dplyr); library(ggplot2)

# define a new column alc_use by combining weekday and weekend alcohol use
alc <- mutate(alc, alc_use = (Dalc + Walc) / 2)

# initialize a plot of alcohol use
g1 <- ggplot(data = alc, aes(x = alc_use))

# define the plot as a bar plot and draw it
g1 + geom_bar()

# define a new logical column 'high_use'
alc <- mutate(alc, high_use = "change me!" > 2)

# initialize a plot of 'high_use'
g2 <- ggplot(data = alc)

# draw a bar plot of high_use by sex
