Dimensionality Reduction in R

Exercise

PCA in tidymodels

From a model-building perspective, PCA lets you create models with fewer features while still capturing most of the information in the original data. However, as you've seen, a disadvantage of PCA is that the resulting model is harder to interpret. In this exercise, you will build a linear regression model using a subset of the house sales data. The target variable is price.

A model built directly from the data, without extracting principal components, has an RMSE of $236,461.4. You will apply PCA with tidymodels and compare the new RMSE against this baseline. Remember, a lower RMSE is better.
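As a reminder of what RMSE measures, here is a minimal base R sketch that computes it by hand. The three prices and predictions are made-up values for illustration only, not rows from the house sales data.

```r
# Made-up actual and predicted prices (illustrative only)
actual    <- c(200000, 350000, 500000)
predicted <- c(210000, 330000, 520000)

# RMSE: square the errors, average them, then take the square root
rmse_manual <- sqrt(mean((actual - predicted)^2))
rmse_manual
# roughly 17320.51
```

Because RMSE is expressed in the same units as the target, it can be read here as a typical prediction error in dollars.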

The tidyverse and tidymodels packages have been loaded for you.

Instructions

  • Build a PCA recipe using train to extract five principal components.
  • Fit a workflow with a default linear_reg() model spec.
  • Create a test prediction data frame using test that contains the actual and predicted values.
  • Calculate the RMSE for the PCA-reduced linear regression model.
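
The steps above could be sketched roughly as follows. This is one possible approach, not the exercise's reference solution. Because the course data is not reproduced here, the sketch simulates a hypothetical stand-in data set; the column names, the 80/20 split, and the normalization step before PCA are all assumptions, so the RMSE it prints is not comparable to the baseline quoted earlier.

```r
library(tidymodels)

# Hypothetical stand-in for the house sales data (simulated, not the course data)
set.seed(42)
n <- 500
house_sales <- tibble(
  sqft_living = rnorm(n, 2000, 500),
  sqft_lot    = rnorm(n, 8000, 2000),
  bedrooms    = rpois(n, 3),
  bathrooms   = runif(n, 1, 4),
  floors      = sample(1:3, n, replace = TRUE),
  grade       = sample(5:12, n, replace = TRUE),
  yr_built    = sample(1950:2015, n, replace = TRUE)
) %>%
  mutate(price = 150 * sqft_living + 20000 * grade + rnorm(n, 0, 50000))

# Assumed 80/20 split to obtain train and test
house_split <- initial_split(house_sales, prop = 0.8)
train <- training(house_split)
test  <- testing(house_split)

# PCA recipe: normalize the predictors, then extract five principal components
pca_recipe <- recipe(price ~ ., data = train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_pca(all_numeric_predictors(), num_comp = 5)

# Fit a workflow that combines the recipe with a default linear_reg() spec
house_sales_fit <- workflow() %>%
  add_recipe(pca_recipe) %>%
  add_model(linear_reg()) %>%
  fit(train)

# Test prediction data frame holding predicted (.pred) and actual (price) values
house_sales_pred_df <- predict(house_sales_fit, test) %>%
  bind_cols(test %>% select(price))

# RMSE of the PCA-reduced linear regression model
pca_rmse <- rmse(house_sales_pred_df, truth = price, estimate = .pred)
pca_rmse
```

Normalizing before `step_pca()` keeps large-scale columns such as lot size from dominating the components. Putting the recipe inside the workflow means the same preprocessing, estimated on train, is applied to test at prediction time.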