Build a random forest model for bike rentals
In this exercise, you will again build a model to predict the number of bikes rented in an hour as a function of the weather, the type of day (holiday, working day, or weekend), and the time of day. You will train the model on data from the month of July.
You will use the ranger package to fit the random forest model. For this exercise, the key arguments to the ranger() call are:
- formula
- data
- num.trees: the number of trees in the forest.
- respect.unordered.factors: specifies how to treat unordered factor variables. We recommend setting this to "order" for regression.
- seed: because this is a random algorithm, you will set the seed to get reproducible results.
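To see how these arguments fit together, here is a minimal sketch on R's built-in mtcars data rather than the bike data. The data set, the outcome mpg, the 100 trees, and the seed value 42 are illustrative assumptions, not the settings this exercise asks for.

```r
# Illustrative sketch only: built-in mtcars stands in for the bike data.
library(ranger)

# Hypothetical seed value, chosen only for this example
toy_seed <- 42

toy_model <- ranger(mpg ~ .,          # formula: mpg as a function of all other columns
                    data = mtcars,    # data frame holding the outcome and the inputs
                    num.trees = 100,  # number of trees in the forest
                    respect.unordered.factors = "order",
                    seed = toy_seed)  # fixes the random draws for reproducibility

# Printing the model reports, among other things, the out-of-bag R squared
print(toy_model)

# The same quantity is stored on the fitted object
toy_model$r.squared
```

mtcars contains no factor columns, so respect.unordered.factors has no effect in this toy fit; it is included only to show where the argument goes.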
Because there are many input variables, for convenience we will store the name of the outcome in the variable outcome and the names of the inputs in the variable vars, and use paste() to assemble a string representing the model formula.
The data frame bikesJuly has been pre-loaded. The sample code specifies the names of the outcome and input variables.
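The formula-assembly step can be sketched with a pair of nested paste() calls. The names toy_outcome, toy_vars, and toy_fmla and the columns y, x1, x2, x3 are hypothetical, used here only for illustration.

```r
# Illustrative sketch only: hypothetical names, not the exercise's variables.
toy_outcome <- "y"
toy_vars <- c("x1", "x2", "x3")

# The inner paste() joins the inputs with " + ";
# the outer paste() puts the outcome and "~" in front, separated by spaces.
toy_fmla <- paste(toy_outcome, "~", paste(toy_vars, collapse = " + "))
print(toy_fmla)
# [1] "y ~ x1 + x2 + x3"

# as.formula() converts the string into a formula object if one is needed
as.formula(toy_fmla)
```

Keeping the variable names in a character vector means inputs can be added or removed in one place without retyping the formula.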
This exercise is part of the course
Supervised Learning in R: Regression
Exercise instructions
- Fill in the blanks to create the formula fmla expressing cnt as a function of the inputs. Print it.
- Load the package ranger.
- Use ranger to fit a model to the bikesJuly data: bike_model_rf.
  - The first argument to ranger() is the formula, fmla.
  - Use 500 trees and respect.unordered.factors = "order".
  - Set the seed to seed for reproducible results.
- Print the model. What is the R-squared?
Hands-on interactive exercise
Try this exercise by completing the sample code.
# bikesJuly is available
str(bikesJuly)
# Random seed to reproduce results
seed
# The outcome column
(outcome <- "cnt")
# The input variables
(vars <- c("hr", "holiday", "workingday", "weathersit", "temp", "atemp", "hum", "windspeed"))
# Create the formula string for bikes rented as a function of the inputs
(fmla <- paste(___, "~", paste(___, collapse = " + ")))
# Load the package ranger
___
# Fit and print the random forest model
(bike_model_rf <- ranger(___, # formula
                         ___, # data
                         num.trees = ___,
                         respect.unordered.factors = ___,
                         seed = ___))