Coding categorical features

Sometimes a dataset contains numeric values that represent a categorical feature.

In the donors dataset, wealth_rating uses numbers to indicate the donor's wealth level:

0 = Unknown
1 = Low
2 = Medium
3 = High

This exercise shows how to convert this kind of numeric code into a factor and how that coding affects a logistic regression model. The donors data frame is available for you to use.
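
As a minimal sketch of the conversion described above, the following uses a small made-up vector of codes rather than the actual donors data; the values in it are an assumption for illustration only. It shows how factor() pairs each numeric level with a label.

```r
# Toy vector of numeric codes (hypothetical values, not the donors data)
wealth_rating <- c(0, 3, 1, 2, 2, 0)

# Pair each numeric code with its label, in the same order
wealth_levels <- factor(wealth_rating,
                        levels = c(0, 1, 2, 3),
                        labels = c("Unknown", "Low", "Medium", "High"))

# The first level listed is the reference category by default
levels(wealth_levels)

# Count how many observations fall in each category
table(wealth_levels)
```

Once the codes are stored as a factor, a model treats each level as its own category instead of assuming that the step from 0 to 1 means the same as the step from 2 to 3.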

This exercise is part of the course

Supervised Learning in R: Classification

Exercise instructions

Create a factor wealth_levels from the numeric wealth_rating column by passing factor() the column to convert, the individual levels, and the labels listed above.
Use relevel() to change the reference category to Medium. The first argument should be your new factor column.
Build a logistic regression model using the column wealth_levels to predict donated and display the result with summary().
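
The three steps above can be sketched on synthetic data. The data frame toy, its randomly generated columns, and the object name toy_model are all assumptions made for this illustration; they are not the donors data, so the fitted coefficients carry no real meaning.

```r
# Synthetic stand-in for the donors data (hypothetical, random values)
set.seed(42)
toy <- data.frame(
  wealth_rating = sample(0:3, 200, replace = TRUE),
  donated = rbinom(200, size = 1, prob = 0.3)
)

# Step 1: convert the numeric codes to a labeled factor
toy$wealth_levels <- factor(toy$wealth_rating,
                            levels = c(0, 1, 2, 3),
                            labels = c("Unknown", "Low", "Medium", "High"))

# Step 2: make "Medium" the reference category
toy$wealth_levels <- relevel(toy$wealth_levels, ref = "Medium")

# Step 3: fit a logistic regression and inspect the coefficients
toy_model <- glm(donated ~ wealth_levels, data = toy, family = "binomial")
summary(toy_model)
```

In the summary, the reference level has no coefficient of its own. Each remaining coefficient (Unknown, Low, High) is the difference in log-odds of donating relative to Medium.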

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Convert the wealth rating to a factor
donors$wealth_levels <- ___(___, levels = ___, labels = ___)

# Use relevel() to change reference category
donors$wealth_levels <- ___(___, ref = ___)

# See how our factor coding impacts the model
summary(___)
