Fit a random forest

As you saw in the video, random forest models are much more flexible than linear models: they can model complicated nonlinear effects and automatically capture interactions between variables. They tend to give very good results on real-world data, so let's try one out on the wine quality dataset. The goal is to predict the human-evaluated quality of a batch of wine, given some of the machine-measured chemical and physical properties of that batch.

Fitting a random forest model with caret works exactly like fitting a generalized linear regression model, as you did in the previous chapter: you simply change the method argument of the train() function to "ranger". The ranger package is a rewrite of R's classic randomForest package. It fits models much faster but gives almost exactly the same results, so we suggest that all beginners use ranger for random forest modeling.
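
To make the "only the method changes" point concrete, here is a minimal sketch. It assumes the caret and ranger packages are installed, and it uses a small simulated data frame (the hypothetical `demo`, with invented columns `x1`, `x2`, `x3` and `y`) rather than the course's wine data. The names `ctrl`, `model_glm` and `model_rf` are also illustrative only.

```r
library(caret)  # the ranger package must be installed as well

set.seed(42)

# Hypothetical stand-in data, NOT the course's wine dataset:
# a nonlinear effect of x1 plus an interaction between x2 and x3
n <- 200
demo <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
demo$y <- sin(2 * pi * demo$x1) + demo$x2 * demo$x3 + rnorm(n, sd = 0.1)

# Shared resampling setup: 5-fold cross-validation
ctrl <- trainControl(method = "cv", number = 5)

# Generalized linear model, as in the previous chapter
model_glm <- train(y ~ ., data = demo, method = "glm",
                   tuneLength = 1, trControl = ctrl)

# Random forest: the same call, with only the method argument changed
model_rf <- train(y ~ ., data = demo, method = "ranger",
                  tuneLength = 1, trControl = ctrl)

# Cross-validated error of each model (values depend on the simulated data)
model_glm$results$RMSE
model_rf$results$RMSE
```

Because both calls share the same formula, data and `trControl`, any difference in cross-validated RMSE comes from the model type alone.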

This exercise is part of the course

Machine Learning with caret in R


Exercise instructions

Train a random forest called model on the wine quality dataset, wine, such that quality is the response variable and all other variables are explanatory variables.
Use method = "ranger".
Use a tuneLength of 1.
Use 5 CV folds.
Print model to the console.
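
As a rough illustration of how these steps map onto arguments of `train()`, the sketch below runs them on a simulated stand-in. The data frame `wine_demo` and its columns (`alcohol`, `pH`, `sulphates`, `density`) are hypothetical and their values are invented; only the `quality` response name is taken from the instructions. It assumes caret and ranger are installed.

```r
library(caret)  # the ranger package must be installed as well

set.seed(1)

# Hypothetical stand-in for the course's `wine` data frame (invented values)
n <- 150
wine_demo <- data.frame(
  alcohol   = runif(n, 8, 14),
  pH        = runif(n, 2.8, 3.8),
  sulphates = runif(n, 0.3, 1.2),
  density   = runif(n, 0.99, 1.00)
)
wine_demo$quality <- 3 + 0.3 * wine_demo$alcohol -
  (wine_demo$pH - 3.3)^2 + rnorm(n, sd = 0.5)

model <- train(
  quality ~ .,           # quality as response, all other columns as predictors
  tuneLength = 1,        # try a single set of tuning parameters
  data = wine_demo,
  method = "ranger",     # random forest via the ranger package
  trControl = trainControl(
    method = "cv",
    number = 5,          # 5 cross-validation folds
    verboseIter = TRUE   # log progress for each fold
  )
)

# Print a summary of the resampling results
print(model)

# The cross-validated metrics are also available as a data frame
model$results
```

With `tuneLength = 1`, caret evaluates only one candidate combination of tuning parameters, which keeps the fit fast at the cost of not searching for better settings.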

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Fit random forest: model
model <- train(
  ___,
  tuneLength = ___,
  data = ___, 
  method = ___,
  trControl = trainControl(
    method = "cv", 
    number = ___, 
    verboseIter = TRUE
  )
)

# Print model to console
