Combining preprocessing methods
The preProcess argument to train() isn't limited to imputing missing values. It also includes a wide variety of other preprocessing techniques to make your life as a data scientist much easier. You can read a full list of them by typing ?preProcess and reading the help page for this function.
One set of preprocessing functions that is particularly useful for fitting regression models is standardization: centering and scaling. You first center by subtracting the mean of each column from each value in that column, then you scale by dividing by the standard deviation.
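The two steps above can be sketched by hand in base R. This is only an illustration: the toy data frame and its column names are made up for the example, and none of it comes from the course dataset.

```r
# Toy data (hypothetical values): two numeric columns on very different scales
df <- data.frame(
  height_cm = c(150, 160, 170, 180, 190),
  income    = c(20000, 35000, 50000, 65000, 80000)
)
x <- as.matrix(df)

# Center: subtract each column's mean from every value in that column
centered <- sweep(x, 2, colMeans(x), "-")

# Scale: divide each centered column by that column's standard deviation
standardized <- sweep(centered, 2, apply(x, 2, sd), "/")

colMeans(standardized)       # each column mean is now (numerically) 0
apply(standardized, 2, sd)   # each column standard deviation is now 1

# Base R's scale() performs the same centering and scaling in one call
all.equal(standardized, scale(x), check.attributes = FALSE)
```

In practice you would rarely write this out yourself, but seeing the arithmetic makes it clearer what the "center" and "scale" options are doing.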
Standardization transforms your data such that for each column, the mean is 0 and the standard deviation is 1. This makes it easier for regression models to find a good solution.
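As a rough sketch of how several methods can be requested together, the snippet below calls caret's preProcess() function directly with a character vector of method names. The toy data frame (toy_x) is hypothetical and exists only for illustration; it is not the exercise's data. The preProcess argument of train() accepts the same kind of character vector.

```r
library(caret)

# Toy predictors (hypothetical values) with one missing entry per column
toy_x <- data.frame(
  a = c(1, 2, NA, 4, 5),
  b = c(10, NA, 30, 40, 50)
)

# Request several preprocessing steps at once with a character vector
pp <- preProcess(toy_x, method = c("medianImpute", "center", "scale"))

# Apply the estimated preprocessing to the data
toy_x_processed <- predict(pp, toy_x)

anyNA(toy_x_processed)   # FALSE: the missing values have been imputed
toy_x_processed
```

Passing the methods as a single vector lets caret estimate and apply all of them together, so the same transformations can be reapplied consistently to new data with predict().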
This exercise is part of the course
Machine Learning with caret in R
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Fit glm with median imputation
model <- train(
  x = ___,
  y = ___,
  method = ___,
  trControl = myControl,
  preProcess = ___
)
# Print model