Standardizing features
The feature inputs to the kNN distance calculation should be standardized with the scale() function, which by default centers each column to mean 0 and rescales it to standard deviation 1. Standardization ensures that features with a large mean or variance do not disproportionately influence the kNN distance score.
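To make the effect on distances concrete, the sketch below compares Euclidean distances before and after scaling. The toy data frame and its column names (pH, total_sulfur) are invented for illustration and are not the course's wine data; dist() is used here only as a simple stand-in for the distance computed inside a kNN routine.

```r
# Hypothetical toy data: two features measured on very different scales
toy <- data.frame(
  pH           = c(3.0, 3.2, 3.4, 3.6),
  total_sulfur = c(100, 150, 120, 200)
)

# Pairwise Euclidean distances on the raw data:
# differences in total_sulfur are far larger than differences in pH,
# so they account for almost all of each distance
raw_dist <- as.matrix(dist(toy))

# Center each column to mean 0 and rescale to standard deviation 1
toy_scaled <- scale(toy)

# After scaling, both features contribute on a comparable footing
scaled_dist <- as.matrix(dist(toy_scaled))

raw_dist[1, 2]     # distance between rows 1 and 2, raw
scaled_dist[1, 2]  # distance between rows 1 and 2, standardized
```

On the raw data the pH difference barely registers, whereas after scaling a one-standard-deviation change in either feature moves the distance by a similar amount.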
This exercise is part of the course
Introduction to Anomaly Detection in R
Exercise instructions
- Apply the summary() function to the wine data to calculate the mean, minimum and maximum values for pH and alcohol.
- Use the scale() function to create a standardized version of the wine data called wine_scaled.
- Apply the summary() function to wine_scaled to check that the means and ranges have changed.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Without standardization, features have different scales
summary(wine)
# Standardize the wine columns
wine_scaled <- ___
# Standardized features have similar means and quartiles
___(___)
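As a rough illustration of the same workflow outside the exercise, the sketch below runs the summary, scale, summary sequence on a small stand-in data frame. The name wine_demo and its values are invented for this example and are not the course's wine data; only the pH and alcohol column names are taken from the instructions.

```r
# Hypothetical stand-in for the wine data (values invented for illustration)
wine_demo <- data.frame(
  pH      = c(3.51, 3.20, 3.26, 3.16, 3.51, 3.30),
  alcohol = c(9.4, 9.8, 9.8, 9.8, 9.4, 10.0)
)

# Before standardization: the two columns sit on different scales
summary(wine_demo)

# scale() returns a numeric matrix, not a data frame; the column means and
# standard deviations it used are stored as attributes on the result
wine_demo_scaled <- scale(wine_demo)
attr(wine_demo_scaled, "scaled:center")
attr(wine_demo_scaled, "scaled:scale")

# After standardization: every column has mean 0 and comparable spread
summary(wine_demo_scaled)

# Direct check of the column means and standard deviations
colMeans(wine_demo_scaled)
apply(wine_demo_scaled, 2, sd)
```

Because the result is a matrix, wrap it in as.data.frame() if a later step expects a data frame.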