Identify highly correlated features
Using the data in house_sales_df, you will practice identifying highly correlated features. High correlation among features indicates redundant information and can cause modeling problems, such as multicollinearity in regression models. A correlation matrix will help you identify the highly correlated features, and you will then determine which of them to remove.
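As a rough illustration of how a correlation matrix exposes redundant features, the sketch below lists every feature pair whose absolute correlation exceeds a cutoff. It uses base R rather than corrr, the built-in mtcars data as a stand-in for house_sales_df, and a cutoff of 0.8; the dataset, the cutoff, and the variable names are all assumptions made for the example, not part of the exercise.

```r
# mtcars is a stand-in dataset; house_sales_df is not available here
cor_mat <- cor(mtcars)

# Blank out the diagonal and upper triangle so each pair appears only once
cor_mat[upper.tri(cor_mat, diag = TRUE)] <- NA

# Assumed cutoff for "highly correlated"; adjust to the problem at hand
threshold <- 0.8

# Reshape the matrix to long format: one row per feature pair
cor_long <- as.data.frame(as.table(cor_mat), stringsAsFactors = FALSE)
names(cor_long) <- c("feature_1", "feature_2", "r")

# Keep pairs above the cutoff, strongest first
high_cor <- cor_long[!is.na(cor_long$r) & abs(cor_long$r) > threshold, ]
high_cor <- high_cor[order(-abs(high_cor$r)), ]
print(high_cor)
```

Each row of high_cor names two features that carry largely overlapping information, so one feature from each pair is a candidate for removal.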
The tidyverse and corrr packages have been loaded for you.
This exercise is part of the course
Dimensionality Reduction in R
Exercise instructions
- Create a correlation plot with the correlations printed on the plot.
Interactive exercise
Try this exercise by completing the sample code.
# Create a correlation plot of the house sales data
house_sales_df %>%
  ___() %>%
  ___() %>%
  ___(print_cor = ___) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
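For comparison, here is one way a pipeline of this shape could look with corrr, shown on the built-in mtcars data instead of house_sales_df. It is a sketch under assumptions, not necessarily the course's intended solution: it assumes correlate() to compute the correlations, shave() to drop the redundant upper triangle, and rplot() with print_cor = TRUE to draw the plot with the coefficients printed on it.

```r
library(tidyverse)
library(corrr)

# mtcars is a stand-in dataset chosen for illustration only
cor_plot <- mtcars %>%
  correlate() %>%               # correlation data frame of all numeric columns
  shave() %>%                   # set the upper triangle to NA
  rplot(print_cor = TRUE) +     # plot with correlation values printed
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

print(cor_plot)
```

Rotating the x-axis labels by 90 degrees keeps long feature names from overlapping.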