Using PCA as an alternative to nearZeroVar()

An alternative to removing low-variance predictors is to run PCA on your dataset. This is sometimes preferable because it does not discard those predictors outright: many low-variance predictors may end up combined into a single high-variance principal component, which can improve your model's accuracy.

This is an especially good trick for linear models: the "pca" option in the preProcess argument of train() centers and scales your data, combines low-variance variables, and ensures that all of your predictors are orthogonal. The result is well suited to linear regression modeling and can often improve the accuracy of your models.
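To make the centering, scaling, and orthogonality claims concrete, here is a minimal sketch in base R using prcomp(). It is not caret's internal code, and the data frame is simulated purely for illustration (the column names a, b, and c are hypothetical).

```r
# Simulated predictors: a and b are nearly redundant, c is independent noise
set.seed(42)
n <- 200
base <- rnorm(n)
x <- data.frame(
  a = base + rnorm(n, sd = 0.1),
  b = base + rnorm(n, sd = 0.1),
  c = rnorm(n)
)

# center = TRUE and scale. = TRUE standardize each column before rotation
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# The principal component scores are mutually uncorrelated (orthogonal)
score_cor <- cor(pca$x)
max_off_diag <- max(abs(score_cor[upper.tri(score_cor)]))
print(max_off_diag)

# Proportion of total variance captured by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))
```

The two redundant columns collapse mostly into the first component, and the off-diagonal correlations between component scores are zero up to floating-point error, which is the property that helps linear models.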

This exercise is part of the course

Machine Learning with caret in R

Exercise instructions

bloodbrain_x and bloodbrain_y are loaded in your workspace.

  • Fit a glm model to the full blood-brain dataset using the "pca" option to preProcess.
  • Print the model to the console and inspect the result.

Hands-on interactive exercise

Try this exercise by completing the following sample code.

# Fit glm model using PCA: model
model <- train(
  x = ___, 
  y = ___,
  method = ___, 
  preProcess = ___
)

# Print model to console
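For intuition about what a glm fit with PCA preprocessing amounts to, the steps can be approximated by hand in base R. This is a rough analogue under stated assumptions, not caret's implementation and not the exercise solution: the data are simulated rather than the blood-brain dataset, the variable names are hypothetical, and the 95% cumulative-variance cutoff is assumed here to mirror the default threshold of caret's preProcess().

```r
# Simulated data for illustration only (not the blood-brain dataset)
set.seed(1)
n <- 150
signal <- rnorm(n)
x <- data.frame(
  x1 = signal + rnorm(n, sd = 0.2),
  x2 = signal + rnorm(n, sd = 0.2),
  x3 = rnorm(n),
  x4 = rnorm(n)
)
y <- 2 * signal + rnorm(n, sd = 0.5)

# Step 1: center, scale, and rotate the predictors
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Step 2: keep the fewest components explaining at least 95% of the variance
# (assumed cutoff for this sketch)
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
k <- which(cum_var >= 0.95)[1]
scores <- as.data.frame(pca$x[, seq_len(k), drop = FALSE])

# Step 3: fit a Gaussian glm (ordinary linear regression) on the components
scores$y <- y
fit <- glm(y ~ ., data = scores, family = gaussian())
print(coef(fit))
```

Because the retained components are uncorrelated, their coefficients can be estimated independently of one another, which avoids the unstable estimates that collinear raw predictors tend to produce.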