
Scale the data for lasso regression

Before fitting a lasso regression model, scale the data so that all features are on a comparable scale. This matters because the lasso penalty acts on coefficient magnitudes, which depend on each feature's units. The full set of King County, Washington house sales data is available in house_sales_df.

In this exercise, you will scale the target variable, price, separately, before you split the data into training and testing sets. This follows from how tidymodels recipes work: target variable transformations are not included in the recipe.
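As a rough illustration of why the exercise template wraps scale() in as.vector(), the sketch below applies scale() to a small made-up price vector (the values are hypothetical, not taken from house_sales_df). scale() returns a one-column matrix carrying centering and scaling attributes, and as.vector() reduces it to a plain numeric vector that fits cleanly in a data frame column.

```r
# Toy price vector; these values are hypothetical, not from house_sales_df
price <- c(200000, 350000, 500000)

# scale() centers to mean 0 and divides by the standard deviation;
# the result is a one-column matrix with "scaled:center" and "scaled:scale" attributes
scaled_matrix <- scale(price)

# as.vector() drops the matrix dimensions and attributes
scaled_vector <- as.vector(scaled_matrix)

is.matrix(scaled_matrix)   # TRUE
is.matrix(scaled_vector)   # FALSE
scaled_vector              # -1  0  1
```

Without as.vector(), the price column inside mutate() would hold a matrix rather than an ordinary numeric vector, which can be awkward in later steps.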

The tidyverse and tidymodels packages have been loaded for you.

This exercise is part of the course

Dimensionality Reduction in R

Exercise instructions

  • Scale the target variable price in house_sales_df using scale().
  • Create the training and testing sets with 80% in the training set.
  • Create the recipe using the training data to scale all numeric predictors.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Scale the target variable
house_sales_df <-  ___ %>% 
  mutate(price = as.vector(___(___)))

# Create the training and testing sets
split <- ___(___, prop = ___)
train <- ___ %>% ___()
test <-  ___ %>% ___()

# Create recipe to scale the predictors
lasso_recipe <- 
  ___(___ ~ ., data = ___) %>% 
  ___(___()) 
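One possible way to fill in the blanks is sketched below. It is not necessarily the course's reference solution. To keep the example self-contained, it builds a small synthetic stand-in for house_sales_df; the predictor columns sqft_living and bedrooms, and all the values, are assumptions made for illustration. The sketch uses step_normalize(), which centers and scales; step_scale() would also fit the template if only division by the standard deviation is wanted.

```r
library(tidyverse)
library(tidymodels)

set.seed(42)

# Hypothetical stand-in for house_sales_df (columns and values are assumptions)
house_sales_df <- tibble(
  price       = rnorm(100, mean = 500000, sd = 150000),
  sqft_living = rnorm(100, mean = 2000, sd = 500),
  bedrooms    = sample(1:5, 100, replace = TRUE)
)

# Scale the target variable; as.vector() turns scale()'s matrix into a plain vector
house_sales_df <- house_sales_df %>%
  mutate(price = as.vector(scale(price)))

# Create the training and testing sets with 80% of rows in training
split <- initial_split(house_sales_df, prop = 0.8)
train <- split %>% training()
test  <- split %>% testing()

# Create a recipe that centers and scales all numeric predictors
lasso_recipe <-
  recipe(price ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())
```

The recipe here is only a specification. The scaling statistics are estimated from the training data later, when the recipe is prepped or used inside a workflow.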