
Graphical displays of longitudinal data: The magical gather()

To study possible differences in the bprs value between the treatment groups, and how the value changes over time, we don't want the weeks to be individual variables. This is where the gather() function comes in, which you probably remember from previous chapters.

The gather() function takes multiple columns and collapses them into key-value pairs, so that the week column names become the values of a new variable weeks and the measurements become the values of a new variable bprs. You can find more information about gather() in the tidyr package documentation with ?gather or in the dplyr cheatsheet.
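
As a minimal sketch of what gather() does, the example below reshapes a small toy data frame. The data and the names wide and long are hypothetical and only stand in for the real BPRS data.

```r
library(tidyr)

# Hypothetical toy data in wide form: one column per week
wide <- data.frame(
  treatment = c(1, 1, 2),
  subject   = c(1, 2, 1),
  week0     = c(42, 58, 52),
  week1     = c(36, 68, 73)
)

# Collapse the week columns into key-value pairs;
# treatment and subject are excluded with a minus sign
long <- gather(wide, key = weeks, value = bprs, -treatment, -subject)

long
```

Each of the 3 original rows appears once per week column, so long has 3 x 2 = 6 rows.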

Our weeks are stored as character strings, which is a bit inconvenient, so we need to extract the week numbers from the character vector weeks.

With the substr() function we can extract part of a longer character object. We supply it with a character object or vector, a start position (the position of the first character to extract) and a stop position (the position of the last character to extract). For example, substr("Hello world!", 1, 5) returns "Hello".
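
The sketch below applies substr() to a whole vector and converts the result to integers. The labels are hypothetical and assume the form "week0", "week1" and so on, with the digit in the 5th position.

```r
# Hypothetical week labels, assumed to look like "week0", "week1", ...
weeks <- c("week0", "week1", "week8")

# The digit is the 5th character, so start and stop are both 5
digits <- substr(weeks, 5, 5)

# substr() returns characters; convert them to integers
week <- as.integer(digits)

week
```

This only covers single-digit week numbers. If a label such as "week10" could occur, substr(weeks, 5, 6) would pick up both digits.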

This exercise is part of the course

Helsinki Open Data Science


Exercise instructions

  • Convert the variables treatment and subject to factors
  • Use gather() to convert BPRS to long form
  • Use mutate() and substr() to create column week by extracting the week number from column weeks
  • Glimpse the data using glimpse()
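
For the first step, here is a small sketch of what converting to a factor changes. The vector is hypothetical and only mimics numeric group codes like those in a treatment column.

```r
# Hypothetical numeric group codes, similar in spirit to a treatment column
treatment <- c(1, 1, 2, 2)

# As plain numbers, the codes would be treated as quantities
class(treatment)

# factor() marks them as categories with a fixed set of levels
treatment_f <- factor(treatment)
class(treatment_f)
levels(treatment_f)
```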

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# The data BPRS is available

# Access the packages dplyr and tidyr
library(dplyr)
library(tidyr)

# Factor treatment & subject
BPRS$treatment <- factor(BPRS$treatment)
BPRS$subject <- factor(BPRS$subject)

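# Convert to long form: the week columns become key-value pairs (weeks, bprs)
BPRSL <- BPRS %>% gather(key = weeks, value = bprs, -treatment, -subject)

# Extract the week number
# (assumes the values of weeks look like "week0", ..., "week8",
# so that the digit is the 5th character)
BPRSL <- BPRSL %>% mutate(week = as.integer(substr(weeks, 5, 5)))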

# Take a glimpse at the BPRSL data
glimpse(BPRSL)
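
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The sketch below shows the same reshaping written that way, on hypothetical toy data standing in for BPRS. The names toy and toy_long are made up for illustration, and the week labels are again assumed to look like "week0" and "week1".

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for BPRS
toy <- data.frame(
  treatment = factor(c(1, 2)),
  subject   = factor(c(1, 1)),
  week0     = c(42, 52),
  week1     = c(36, 73)
)

# Same reshaping as the gather() call, written with pivot_longer()
toy_long <- toy %>%
  pivot_longer(cols = -c(treatment, subject),
               names_to = "weeks", values_to = "bprs") %>%
  mutate(week = as.integer(substr(weeks, 5, 5)))

glimpse(toy_long)
```

One visible difference is row order: gather() stacks the data one week column at a time, while pivot_longer() keeps the rows of each original observation together.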