KNN imputation of categorical values
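Once all the categorical columns in the DataFrame have been converted to ordinal values, the DataFrame is ready to be imputed. Imputing with a statistical model such as K-Nearest Neighbors (KNN), which fills each missing value from the most similar rows, often gives better imputations than filling with a single summary value such as the mode.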
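To make the idea concrete, here is a minimal sketch that compares a column-mean fill with a KNN fill on a tiny, made-up encoded array. It uses scikit-learn's KNNImputer as a stand-in for the KNN() function from fancyimpute used in this exercise; the data and the variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical encoded data: column 0 is a numeric feature,
# column 1 is an ordinal category code (0.0 or 1.0) with one gap.
encoded = np.array([
    [1.0, 0.0],
    [1.2, 0.0],
    [0.9, np.nan],
    [8.0, 1.0],
    [8.3, 1.0],
])

# The column mean ignores row similarity and can land between two codes.
mean_fill = np.nanmean(encoded[:, 1])

# KNN averages the codes of the 2 rows closest to the incomplete row.
knn_imputer = KNNImputer(n_neighbors=2)  # stand-in for fancyimpute's KNN()
knn_filled = knn_imputer.fit_transform(encoded)
knn_fill = knn_filled[2, 1]

print("mean fill:", mean_fill)
print("KNN fill:", knn_fill)
```

In this toy case the mean falls halfway between the two category codes, while the KNN fill takes the code shared by the two most similar rows.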
In this exercise, you'll
- Use the KNN() function from fancyimpute to impute the missing values in the ordinally encoded DataFrame users.
- Convert the ordinal values back to their respective categories using the ordinal encoder's .inverse_transform() method.
Remember, ordinal_enc_dict stores sklearn's OrdinalEncoder() for each column. The users DataFrame stores the encoded values (ordinal values) for each column.
The KNN() function, the dictionary of OrdinalEncoder() objects ordinal_enc_dict, and the users DataFrame have already been loaded for you.
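For readers without the preloaded objects, the following is a rough sketch of how a dictionary like ordinal_enc_dict and an encoded DataFrame could be built. The raw data, the column names, and the names raw and encoded_users are hypothetical; only OrdinalEncoder() and the per-column dictionary idea come from the exercise. Missing values are skipped during encoding so that they remain missing for the imputer.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical raw categorical data with gaps.
raw = pd.DataFrame(
    {
        "smoker": ["yes", "no", None, "no"],
        "day": ["Sat", None, "Sun", "Sat"],
    },
    dtype=object,
)

ordinal_enc_dict = {}
# Start from an all-NaN float frame and fill in only the encoded entries.
encoded_users = pd.DataFrame(
    np.nan, index=raw.index, columns=raw.columns, dtype=float
)

for col_name in raw.columns:
    not_null = raw[col_name].notnull()
    # OrdinalEncoder expects a 2-D array, hence the reshape.
    values = raw.loc[not_null, col_name].to_numpy().reshape(-1, 1)
    encoder = OrdinalEncoder()
    encoded_users.loc[not_null, col_name] = encoder.fit_transform(values).ravel()
    # Keep the fitted encoder so the codes can be decoded later.
    ordinal_enc_dict[col_name] = encoder

print(encoded_users)
```

Storing one fitted encoder per column is what makes the later .inverse_transform() step possible, since each encoder remembers its own category order.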
This exercise is part of the course
Dealing with Missing Data in Python
Exercise instructions
- Impute the users DataFrame using KNN_imputer's fit_transform() method. These transformed values are rounded to get integers.
- Iterate over columns in users.
- Select the column's OrdinalEncoder() from ordinal_enc_dict and perform .inverse_transform() on the reshaped array reshaped.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create KNN imputer
KNN_imputer = KNN()
# Work on a copy of 'users' so the encoded original is kept
users_KNN_imputed = users.copy(deep=True)
# Impute 'users' DataFrame. It is rounded to get integer values
users_KNN_imputed.iloc[:, :] = np.round(___)
# Loop over the column names in 'users'
for col_name in ___:
    # Reshape the column data
    reshaped = users_KNN_imputed[col_name].values.reshape(-1, 1)
    # Select the column's Encoder and perform inverse transform on 'reshaped'
    users_KNN_imputed[col_name] = ___
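The same impute, round, and decode round trip can be sketched outside the exercise environment. This is an illustration under stated assumptions, not the exercise solution: it swaps in scikit-learn's KNNImputer for fancyimpute's KNN(), and the encoded values, the column names, and the hand-fitted encoders are all made up.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical encoders, fitted on the known categories of each column.
# Categories are sorted, so e.g. "no" -> 0 and "yes" -> 1.
ordinal_enc_dict = {
    "smoker": OrdinalEncoder().fit(np.array([["no"], ["yes"]], dtype=object)),
    "day": OrdinalEncoder().fit(np.array([["Sat"], ["Sun"]], dtype=object)),
    "time": OrdinalEncoder().fit(np.array([["Dinner"], ["Lunch"]], dtype=object)),
}

# Hypothetical encoded DataFrame with two missing entries.
users = pd.DataFrame(
    {
        "smoker": [0, 0, np.nan, 1, 1, 1],
        "day": [0, 0, 0, 1, np.nan, 1],
        "time": [1, 1, 1, 0, 0, 0],
    },
    dtype=float,
)

# Impute, then round so every value is a valid integer category code.
knn_imputer = KNNImputer(n_neighbors=2)  # stand-in for fancyimpute's KNN()
users_KNN_imputed = pd.DataFrame(
    np.round(knn_imputer.fit_transform(users)),
    index=users.index,
    columns=users.columns,
)

# Decode each column with its own encoder.
for col_name in users_KNN_imputed.columns:
    reshaped = users_KNN_imputed[col_name].to_numpy().reshape(-1, 1)
    decoded = ordinal_enc_dict[col_name].inverse_transform(reshaped)
    users_KNN_imputed[col_name] = decoded.ravel()

print(users_KNN_imputed)
```

Rounding before decoding matters because a KNN average can fall between two codes, and .inverse_transform() needs whole-number codes to look up a category.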