
Factorize, round two

In the last exercise you learned how to import a data file using read_sav(). When you import SPSS data files, some of the variables may have the labelled class. This class keeps the labelling information that was originally present in the .sav and .por files. It's advised to coerce (or change) these variables to factors or other standard R classes.
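To make the idea concrete, here is a minimal sketch of a labelled vector and its conversion to a factor. The `gender` vector and its codes (1 = Female, 2 = Male) are made up for illustration and are not taken from the employee data; `labelled()` and `as_factor()` are haven functions. Note that recent haven versions report the class as `haven_labelled`.

```r
library(haven)

# Hypothetical toy data: numeric codes with value labels attached,
# similar to how SPSS stores a categorical variable
gender <- labelled(c(1, 2, 2, 1, 2), labels = c(Female = 1, Male = 2))

# Inspect the class; recent haven versions include "haven_labelled"
class(gender)

# Convert to a factor: the value labels become the factor levels
gender_factor <- as_factor(gender)

# Counts per category, now shown by label instead of numeric code
table(gender_factor)
```

After the conversion, `gender_factor` behaves like any other R factor, so functions such as `summary()` and `table()` report counts per category.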

The data for this exercise contains information on employees and their demographic and economic attributes (Source: QRiE). The data can be found at the following URL:

https://assets.datacamp.com/production/course_1478/datasets/employee.sav

This exercise is part of the course

Intermediate Importing Data in R


Exercise instructions

  • Import the SPSS data straight from the URL and store the resulting data frame as work.
  • Display the summary of the GENDER column of work. This output doesn't tell you much, right?
  • Convert the GENDER column in work to a factor, the class R uses to denote categorical variables. Use as_factor().
  • Once again display the summary of the GENDER column. This time, the printout makes much more sense.
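The steps above can be sketched as a small function. This is one possible approach, not the official solution: `import_with_gender_factor()` is a hypothetical helper name introduced here, and the sketch assumes the file has a column called GENDER, as the instructions state. It relies on `read_sav()` accepting either a local path or an http(s) URL.

```r
library(haven)

# Hypothetical helper (not part of haven): import a .sav file and
# convert its GENDER column from labelled to factor.
import_with_gender_factor <- function(path_or_url) {
  # read_sav() accepts a local path or an http(s) URL
  work <- read_sav(path_or_url)

  # Summary while the column is still labelled
  print(summary(work$GENDER))

  # Replace the labelled column with a factor built from its value labels
  work$GENDER <- as_factor(work$GENDER)

  # Summary of the factor: counts per category
  print(summary(work$GENDER))

  work
}

# Usage with the URL from the exercise (needs network access):
# work <- import_with_gender_factor(
#   "https://assets.datacamp.com/production/course_1478/datasets/employee.sav"
# )
```

In the exercise itself the same four calls can be written directly at the top level, one under each comment of the sample code.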

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# haven is already loaded

# Import SPSS data from the URL: work


# Display summary of work$GENDER


# Convert work$GENDER to a factor


# Display summary of work$GENDER again