Build a random forest model for bike rentals
In this exercise, you will again build a model to predict the number of bikes rented in an hour as a function of the weather, the type of day (holiday, working day, or weekend), and the time of day. You will train the model on data from the month of July.
You will use the ranger package to fit the random forest model. For this exercise, the key arguments to the ranger() call are:
- formula
- data
- num.trees: the number of trees in the forest.
- respect.unordered.factors: Specifies how to treat unordered factor variables. We recommend setting this to "order" for regression.
- seed: because random forests are a randomized algorithm, you will set the seed to get reproducible results.
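As a rough illustration of how these arguments fit together, here is a minimal sketch on R's built-in mtcars data rather than the bike data. The choice of predictors, the seed value 42, and the name toy_model are assumptions made for this example only. The r.squared and prediction.error fields hold the out-of-bag estimates that ranger also reports when the model is printed.

```r
# Sketch only: mtcars stands in for the exercise data.
# toy_model stays NULL if the ranger package is not installed.
toy_model <- NULL

if (requireNamespace("ranger", quietly = TRUE)) {
  library(ranger)

  toy_model <- ranger(mpg ~ wt + hp + cyl,                 # formula
                      data = mtcars,                       # data
                      num.trees = 500,                     # number of trees in the forest
                      respect.unordered.factors = "order", # treatment of unordered factors
                      seed = 42)                           # arbitrary seed, for reproducibility

  # Printing the model shows the out-of-bag prediction error and R-squared
  print(toy_model)

  # The same quantities can be read directly from the fitted object
  print(toy_model$r.squared)         # out-of-bag R-squared
  print(toy_model$prediction.error)  # out-of-bag mean squared error
}
```

Rerunning the call with the same seed should reproduce the same forest; changing the seed will typically change the reported R-squared slightly.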
Since there are a lot of input variables, for convenience we will specify the outcome and the inputs in the variables outcome and vars, and use paste() to assemble a string representing the model formula.
The data frame bikesJuly has been pre-loaded. The sample code specifies the names of the outcome and input variables.
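The formula-building step described above can be sketched with placeholder names. The values "y", "x1", "x2", and "x3" are hypothetical and are not columns of bikesJuly. The inner paste() joins the inputs with " + ", and the outer paste() joins the outcome, "~", and the right-hand side with single spaces.

```r
# Hypothetical outcome and input names, for illustration only
outcome <- "y"
vars <- c("x1", "x2", "x3")

# Collapse the inputs into one right-hand-side string: "x1 + x2 + x3"
rhs <- paste(vars, collapse = " + ")

# Join the outcome, "~", and the right-hand side (paste() separates with spaces)
fmla <- paste(outcome, "~", rhs)
print(fmla)

# Optional: convert the string to a formula object and list its variables
fmla_obj <- as.formula(fmla)
print(all.vars(fmla_obj))
```

ranger() documents its formula argument as accepting either a formula object or a character string, so the as.formula() conversion is optional there.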
This exercise is part of the course
Supervised Learning in R: Regression
Exercise instructions
- Fill in the blanks to create the formula fmla expressing cnt as a function of the inputs. Print it.
- Load the package ranger.
- Use ranger to fit a model to the bikesJuly data: bike_model_rf.
  - The first argument to ranger() is the formula, fmla.
  - Use 500 trees and respect.unordered.factors = "order".
  - Set the seed to seed for reproducible results.
- Print the model. What is the R-squared?
Hands-on interactive exercise
Try this exercise by completing this sample code.
# bikesJuly is available
str(bikesJuly)
# Random seed to reproduce results
seed
# The outcome column
(outcome <- "cnt")
# The input variables
(vars <- c("hr", "holiday", "workingday", "weathersit", "temp", "atemp", "hum", "windspeed"))
# Create the formula string for bikes rented as a function of the inputs
(fmla <- paste(___, "~", paste(___, collapse = " + ")))
# Load the package ranger
___
# Fit and print the random forest model
(bike_model_rf <- ranger(___, # formula 
                         ___, # data
                         num.trees = ___, 
                         respect.unordered.factors = ___, 
                         seed = ___))