
Identify highly correlated features

Using the data in house_sales_df, you will practice identifying highly correlated features. High correlation among features signals redundant information and can cause modeling problems such as multicollinearity in regression models. A correlation matrix will help you spot the highly correlated features so that you can decide which of them to remove.
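To make the idea concrete, here is a minimal sketch of listing highly correlated feature pairs with corrr. It assumes the built-in mtcars dataset as a stand-in for house_sales_df, and the 0.8 cutoff is an arbitrary choice for illustration, not a value taken from the exercise.

```r
library(dplyr)
library(corrr)

# Assumption: mtcars stands in for house_sales_df, which is not available here.
high_cor_pairs <- mtcars %>%
  correlate() %>%            # correlation data frame; the diagonal is set to NA
  shave() %>%                # blank the upper triangle so each pair appears once
  stretch(na.rm = TRUE) %>%  # long format with columns x, y, and r
  filter(abs(r) > 0.8) %>%   # assumed cutoff; choose one that suits your data
  arrange(desc(abs(r)))      # strongest relationships first

print(high_cor_pairs)
```

Each row names two features that carry largely overlapping information, so one feature of each pair is a candidate for removal.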

The tidyverse and corrr packages have been loaded for you.

This exercise is part of the course

Dimensionality Reduction in R


Exercise instructions

  • Create a correlation plot with the correlations printed on the plot.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Create a correlation plot of the house sales
house_sales_df %>% 
  ___() %>% 
  ___() %>% 
  ___(print_cor = ___) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
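
For comparison, the following is a hedged sketch of one way to build such a plot with corrr on a different dataset. It assumes the built-in mtcars data rather than house_sales_df, and it is not necessarily the exercise's intended solution.

```r
library(dplyr)
library(ggplot2)
library(corrr)

# Assumption: mtcars is used purely for illustration.
cor_plot <- mtcars %>%
  correlate() %>%              # compute pairwise correlations
  shave() %>%                  # keep only the lower triangle
  rplot(print_cor = TRUE) +    # plot with the correlation values printed
  theme(axis.text.x = element_text(angle = 90, hjust = 1))  # rotate x labels

print(cor_plot)
```

Printing the coefficients on the plot makes it easier to read exact values instead of judging them from point size and color alone.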