Exercise

# Use KNN imputation

In the previous exercise, you used median imputation to fill in missing values in the breast cancer dataset, but that is not the only possible method for dealing with missing data.

An alternative to median imputation is k-nearest neighbors, or KNN, imputation. This is a more advanced form of imputation where missing values are replaced with values from other rows that are similar to the current row. While this is a lot more complicated to implement in practice than simple median imputation, it is very easy to explore in `caret` using the `preProcess` argument to `train()`. You can simply use `preProcess = "knnImpute"` to change the method of imputation used prior to model fitting.
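To make the idea of "values from other rows that are similar to the current row" concrete, here is a minimal base R sketch of KNN imputation on a toy data frame. It is an illustration only and not `caret`'s internal code: the function name `knn_impute`, the `toy` data, and the choice to average the `k` nearest complete rows are all assumptions made for this example.

```r
# Illustrative KNN imputation in base R (hypothetical helper, not caret's code).
knn_impute <- function(x, k = 2) {
  x <- as.matrix(x)
  # Standardize columns so no single feature dominates the distance.
  scaled <- scale(x)
  complete_rows <- which(stats::complete.cases(x))
  out <- x
  for (i in which(!stats::complete.cases(x))) {
    missing_cols <- which(is.na(x[i, ]))
    observed_cols <- which(!is.na(x[i, ]))
    # Euclidean distance to each complete row, using only the observed columns.
    diffs <- sweep(scaled[complete_rows, observed_cols, drop = FALSE], 2,
                   scaled[i, observed_cols])
    dists <- sqrt(rowSums(diffs^2))
    neighbors <- complete_rows[order(dists)[seq_len(min(k, length(dists)))]]
    # Fill each missing value with the mean of the neighbors' values.
    out[i, missing_cols] <- colMeans(x[neighbors, missing_cols, drop = FALSE])
  }
  out
}

# Toy data (made up): row 5 is missing `b` but its `a` value resembles rows 1 and 2.
toy <- data.frame(
  a = c(1.0, 1.1, 5.0, 5.2, 1.05),
  b = c(10, 11, 50, 52, NA)
)
imputed <- knn_impute(toy, k = 2)
```

Here `imputed[5, "b"]` is 10.5, the average of `b` in the two most similar rows, whereas median imputation would have filled in 30.5 regardless of how the row looks. In the exercise itself you do not need any of this, because `preProcess = "knnImpute"` asks `caret` to handle the imputation for you.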

Instructions

`breast_cancer_x` and `breast_cancer_y` are loaded in your workspace.

- Use the `train()` function to fit a `glm` model called `knn_model` to the breast cancer dataset.
- Use KNN imputation to handle missing values.