Factorize, round two
In the last exercise you learned how to import a data file using the command read_sav(). With SPSS data files, it can also happen that some of the variables you import have the labelled class. This is done to keep all the labelling information that was originally present in the .sav and .por files. It's advised to coerce (or change) these variables to factors or other standard R classes.
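To make the idea of a labelled variable concrete, here is a minimal sketch that builds one by hand and coerces it in two ways. The codes and the label names (Female, Male) are made up for illustration and are not taken from the exercise data. Note that newer haven releases report the class as haven_labelled rather than labelled.

```r
library(haven)

# Hypothetical toy data: 0/1 codes with made-up value labels.
gender <- labelled(c(0, 1, 1, 0), labels = c(Female = 0, Male = 1))

# The value labels are stored as an attribute on the vector.
attr(gender, "labels")

# Option 1: turn the value labels into factor levels.
gender_factor <- as_factor(gender)
levels(gender_factor)

# Option 2: drop the labels and keep the underlying numeric codes.
gender_codes <- zap_labels(gender)
class(gender_codes)
```

as_factor() is usually the better fit for categorical variables, while zap_labels() may be preferable when the codes themselves are the values of interest.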
The data for this exercise contain information on employees and their demographic and economic attributes (Source: QRiE). The data can be found at the following URL:
https://assets.datacamp.com/production/course_1478/datasets/employee.sav
This exercise is part of the course Intermediate Importing Data in R.
Exercise instructions
- Import the SPSS data straight from the URL and store the resulting data frame as work.
- Display the summary of the GENDER column of work. This output isn't very informative, right?
- Convert the GENDER column in work to a factor, the class to denote categorical variables in R. Use as_factor().
- Once again display the summary of the GENDER column. This time, the printout makes much more sense.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# haven is already loaded
# Import SPSS data from the URL: work
# Display summary of work$GENDER
# Convert work$GENDER to a factor
# Display summary of work$GENDER again
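
The same four steps can be rehearsed offline with a small stand-in file. The sketch below is not the exercise solution as such: it assumes a hypothetical one-column data set with made-up labels, writes it to a temporary .sav file, and then follows the import, summary, convert, summary sequence. In the exercise itself, the URL given above would be passed to read_sav() in place of the temporary path.

```r
library(haven)

# Hypothetical stand-in for the employee data: one labelled GENDER column.
toy <- tibble::tibble(
  GENDER = labelled(c(0, 1, 1, 0), labels = c(Female = 0, Male = 1))
)

# Write it to a temporary .sav file so read_sav() has something to import.
path <- tempfile(fileext = ".sav")
write_sav(toy, path)

# Import the SPSS data: work
work <- read_sav(path)

# Display summary of work$GENDER (still a labelled variable at this point)
summary(work$GENDER)

# Convert work$GENDER to a factor
work$GENDER <- as_factor(work$GENDER)

# Display summary of work$GENDER again (now counts per factor level)
summary(work$GENDER)
```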