Survey analysis
A categorical variable is a variable that can take on one of a limited number of possible values.
Before your interview, let's practice wrangling categorical data using the survey
dataset from the MASS package.
The dataset contains responses of statistics students to several questions.
One of the questions concerns how often the students exercise. The answers to this question are stored in the Exer column.
The available responses were "None", "Some", and "Freq" (frequently).
Note that these answers have a natural order, from "None" through "Some" to "Freq".
Recall that tapply() splits a vector into groups defined by a categorical variable and applies a function to each group.
For example,
tapply(survey$Age, survey$Sex, median)
computes the median age for each sex.
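The tapply() call above can be tried outside the exercise environment with a short sketch like the one below. It assumes the MASS package is installed and loads survey from it rather than relying on a pre-loaded copy. The na.rm = TRUE argument and the second call on the Pulse column are additions for illustration, not part of the exercise.

```r
library(MASS)  # provides the survey data frame

# Median age within each level of Sex.
# Extra arguments such as na.rm = TRUE are passed on to median().
median_age_by_sex <- tapply(survey$Age, survey$Sex, median, na.rm = TRUE)
print(median_age_by_sex)

# Same pattern with a different column and summary function:
# mean pulse rate within each exercise level (illustrative addition).
mean_pulse_by_exer <- tapply(survey$Pulse, survey$Exer, mean, na.rm = TRUE)
print(mean_pulse_by_exer)
```

The result is a named vector with one entry per factor level, so individual groups can be picked out by name, for example median_age_by_sex["Female"].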
The dataset has been pre-loaded and is stored in the survey variable.
This exercise is part of the course Practicing Statistics Interview Questions in R.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Return the structure of Exer
str(___)
# Create the ordered factor
survey$Exer_ordered <- ___(survey$Exer, ___ = c("None", ___, ___), ordered = ___)
# Return the structure of Exer_ordered
___(___)
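One possible way to fill in the blanks is sketched below. It assumes the MASS package is installed and loads survey from it, and it takes the level order "None", "Some", "Freq" from the response list given earlier. The final comparison is an extra illustration and not part of the exercise.

```r
library(MASS)  # provides the survey data frame

# Return the structure of Exer
str(survey$Exer)

# Create the ordered factor, listing the levels from lowest to highest
survey$Exer_ordered <- factor(survey$Exer,
                              levels = c("None", "Some", "Freq"),
                              ordered = TRUE)

# Return the structure of Exer_ordered
str(survey$Exer_ordered)

# Illustrative extra: ordered factors support comparisons between levels,
# e.g. counting students who exercise at least "Some"
table(survey$Exer_ordered >= "Some")
```

Setting the levels explicitly matters here because factor() would otherwise sort them alphabetically, which does not match the intended order of the responses.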