Graphical displays of longitudinal data: The magical gather()
To study possible differences in the bprs value between the treatment groups, and how the value changes over time, we don't want the weeks to be individual variables. In comes the gather() function, which you can probably remember from previous chapters.
The gather() function takes multiple columns and collapses them into key-value pairs, so that we can have the weeks as values of a new variable, weeks. You can find more information about gather() in the package documentation with ?gather or in the dplyr cheatsheet.
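As a minimal sketch of what this collapsing looks like, the example below applies gather() to a small made-up data frame. The data frame wide, its values, and the column names week0, week1 and score are assumptions for illustration only; they are not the BPRS data.

```r
library(tidyr)

# Hypothetical toy data: two subjects, one column per week
wide <- data.frame(
  subject = c(1, 2),
  week0 = c(42, 58),
  week1 = c(36, 68)
)

# Collapse every column except subject into key-value pairs:
# the old column names go into weeks, the cell values into score
long <- gather(wide, key = weeks, value = score, -subject)
long
```

Each subject now has one row per week, and weeks is a character column holding the old column names.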
Our weeks are in a slightly inconvenient form as characters, so we need to extract the week numbers from the character vector weeks.
With the substr() function we can extract part of a longer character object. We simply supply it with a character object or vector, a start position (the position of the first character to extract) and a stop position (the position of the last character to extract). For example, substr("Hello world!", 1, 5) would return "Hello".
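Because substr() is vectorised, the same call works on a whole character vector at once. The sketch below assumes labels of the form "week" followed by a single digit; that format is a guess for illustration, and the actual labels in your data may need different positions.

```r
# Hypothetical labels; check the real values before choosing positions
weeks <- c("week0", "week1", "week8")

# Keep only the fifth character of each label
digits <- substr(weeks, 5, 5)

# Convert the extracted characters to integers
week <- as.integer(digits)
week
```

If the labels could contain two-digit week numbers, the stop position would have to be larger than the start position.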
This exercise is part of the course Helsinki Open Data Science.
Exercise instructions
- Factor variables treatment and subject
- Use gather() to convert BPRS to a long form
- Use mutate() and substr() to create column week by extracting the week number from column weeks
- Glimpse the data using glimpse()
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# The data BPRS is available
# Access the packages dplyr and tidyr
library(dplyr)
library(tidyr)
# Factor treatment & subject
BPRS$treatment <- factor(BPRS$treatment)
BPRS$subject <- factor(BPRS$subject)
# Convert to long form
BPRSL <- BPRS %>% gather(key = weeks, value = bprs, -treatment, -subject)
# Extract the week number (replace "Change me!" with the arguments for substr())
BPRSL <- BPRSL %>% mutate(week = as.integer(substr("Change me!")))
# Take a glimpse at the BPRSL data
glimpse(BPRSL)