Coding categorical features

Sometimes a dataset contains numeric values that represent a categorical feature.

In the donors dataset, wealth_rating uses numbers to indicate the donor's wealth level:

  • 0 = Unknown
  • 1 = Low
  • 2 = Medium
  • 3 = High

This exercise illustrates how to prepare this type of categorical feature and examines its impact on a logistic regression model. The donors data frame is available for you to use.
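To see why the coding matters, here is a minimal sketch on a made-up vector (not the real donors data). It compares the design matrix R would build when the rating is left numeric with the one it builds after conversion to a factor. The toy values and the names numeric_design and factor_design are assumptions for illustration only.

```r
# Toy ratings using the 0-3 coding described above (made-up values).
wealth_rating <- c(0, 1, 2, 3, 2, 1)

# Left as a number: a single column, so a model would fit one slope
# and treat the codes as evenly spaced quantities.
numeric_design <- model.matrix(~ wealth_rating)
colnames(numeric_design)

# Converted to a factor: one indicator column per non-reference level.
wealth_factor <- factor(wealth_rating,
                        levels = c(0, 1, 2, 3),
                        labels = c("Unknown", "Low", "Medium", "High"))
factor_design <- model.matrix(~ wealth_factor)
colnames(factor_design)
```

With the factor version, each level gets its own coefficient relative to the reference level, so a code such as 0 = Unknown is no longer forced onto the same scale as Low, Medium, and High.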

This exercise is part of the course

Supervised Learning in R: Classification

Exercise instructions

  • Create a factor wealth_levels from the numeric wealth_rating column by passing factor() three things: the column to convert, the individual levels, and the labels listed above.
  • Use relevel() to change the reference category to Medium. The first argument should be your new factor column.
  • Build a logistic regression model using the column wealth_levels to predict donated and display the result with summary().
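The first two steps can be tried on a small made-up vector before touching the real data. This is only a sketch: the ratings vector is invented, and the exercise itself works on donors$wealth_rating.

```r
# Made-up ratings; the real exercise uses donors$wealth_rating.
ratings <- c(3, 0, 2, 1, 2)

# Map each numeric code to its label.
wealth_levels <- factor(ratings,
                        levels = c(0, 1, 2, 3),
                        labels = c("Unknown", "Low", "Medium", "High"))
levels(wealth_levels)   # the first level acts as the reference category

# Move "Medium" to the front so it becomes the reference category.
wealth_levels <- relevel(wealth_levels, ref = "Medium")
levels(wealth_levels)
```

relevel() only reorders the levels. The values stored in the factor stay the same, which is why it can be applied safely after the labels are assigned.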

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Convert the wealth rating to a factor
donors$wealth_levels <- ___(___, levels = ___, labels = ___)

# Use relevel() to change reference category
donors$wealth_levels <- ___(___, ref = ___)

# See how our factor coding impacts the model
summary(___)
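For reference, one way the completed template might look is sketched here on simulated data, since the course's donors data frame is not reproduced on this page. The data frame donors_sim, its random values, and the model name wealth_model are assumptions for illustration. Only the column names wealth_rating, wealth_levels, and donated come from the exercise.

```r
set.seed(42)

# Hypothetical stand-in for the donors data frame (simulated, not real data).
donors_sim <- data.frame(
  wealth_rating = sample(0:3, 500, replace = TRUE),
  donated       = rbinom(500, size = 1, prob = 0.3)
)

# Convert the wealth rating to a factor
donors_sim$wealth_levels <- factor(donors_sim$wealth_rating,
                                   levels = c(0, 1, 2, 3),
                                   labels = c("Unknown", "Low", "Medium", "High"))

# Use relevel() to change reference category
donors_sim$wealth_levels <- relevel(donors_sim$wealth_levels, ref = "Medium")

# See how the factor coding impacts the model
wealth_model <- glm(donated ~ wealth_levels, data = donors_sim, family = "binomial")
summary(wealth_model)

# One coefficient per non-reference level; "Medium" is absorbed by the intercept.
names(coef(wealth_model))
```

Because the simulated outcome is unrelated to wealth, the estimates here carry no meaning. The point is the shape of the output: each coefficient is a log-odds difference relative to the Medium group.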