Exercise

# Encode the labels as categorical variables

Remember, your ultimate goal is to predict the probability that a certain label is attached to a budget line item. You just saw that many columns in your data are the inefficient `object` type. Does this include the labels you're trying to predict? Let's find out!

There are 9 columns of labels in the dataset. Each of these columns is a category that has many possible values it can take. The 9 labels have been loaded into a list called `LABELS`. In the Shell, check out the type for these labels using `df[LABELS].dtypes`.

You will notice that every label is encoded as an `object` datatype. Because `category` datatypes are much more efficient, your task is to convert the labels to category types using the `.astype()` method.
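To make the efficiency claim concrete, here is a minimal sketch that compares the memory footprint of the same column stored as `object` and as `category`. The values and the names `as_object` and `as_category` are made up for illustration and are not part of the exercise data; exact byte counts will vary by pandas version and platform.

```python
import pandas as pd

# Made-up column: 30,000 rows but only three distinct string values.
values = ["Teacher", "Aide", "Substitute"] * 10_000

as_object = pd.Series(values, dtype="object")
as_category = as_object.astype("category")

# deep=True counts the Python string objects, not just the pointers to them.
object_bytes = as_object.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(as_object.dtype, object_bytes)
print(as_category.dtype, category_bytes)
```

A categorical column stores each distinct value once and keeps a small integer code per row, which is why the saving grows when a column has few unique values relative to its length.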

Note: in this exercise you'll call `.astype()` on one pandas Series at a time. Since you are working with a pandas DataFrame, you'll need to use the `.apply()` method and provide a `lambda` function called `categorize_label` that applies `.astype()` to each column, `x`.

Instructions

- Define the lambda function `categorize_label` to convert column `x` into `x.astype('category')`.
- Use the `LABELS` list provided to convert the subset of data `df[LABELS]` to categorical types using the `.apply()` method and `categorize_label`. Don't forget `axis=0`.
- Print the converted `.dtypes` attribute of `df[LABELS]`.
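
The three steps above can be sketched on a small stand-in DataFrame. The column names, the values, and the two-item `LABELS` list here are hypothetical placeholders; in the exercise, `df` and the 9-item `LABELS` list are already loaded for you.

```python
import pandas as pd

# Hypothetical stand-in for the exercise data (the real df is preloaded).
df = pd.DataFrame({
    "Function": ["Teacher Compensation", "Substitute Compensation", "Teacher Compensation"],
    "Use": ["Instruction", "Instruction", "O&M"],
    "Total": [1000.0, 250.5, 75.0],
})
LABELS = ["Function", "Use"]  # assumed subset; the exercise provides 9 labels

# Step 1: a lambda that converts one column (a Series) to the category dtype.
categorize_label = lambda x: x.astype("category")

# Step 2: axis=0 passes each column of df[LABELS] to the lambda in turn.
df[LABELS] = df[LABELS].apply(categorize_label, axis=0)

# Step 3: confirm the label columns are now categorical.
print(df[LABELS].dtypes)
```

Only the columns named in `LABELS` are converted; the numeric `Total` column in this sketch keeps its original dtype.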