vtreat the bike rental data
In this exercise, you will create one-hot-encoded data frames of the July/August bike data, for use with xgboost later on.
The data frames bikesJuly and bikesAugust have been pre-loaded. 
For your convenience, we have defined the variable vars with the list of variable columns for the model.
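As a quick illustration of what "one-hot-encoded" means, here is a minimal sketch in base R. It uses model.matrix() rather than vtreat, and the toy data frame and its values are made up for the example, so treat it as a picture of the idea, not of what vtreat produces internally.

```r
# Toy stand-in for a categorical column (values are invented for illustration)
toy <- data.frame(weathersit = c("Clear", "Misty", "Clear", "Light Rain"))

# "- 1" drops the intercept, so every level gets its own 0/1 indicator column
onehot <- model.matrix(~ weathersit - 1, data = toy)
print(onehot)
```

Each row has a 1 in exactly one indicator column. An all-numeric layout like this is what xgboost needs as input.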
This exercise is part of the course
Supervised Learning in R: Regression
Exercise instructions
- Load the package vtreat.
- Use designTreatmentsZ() to create a treatment plan treatplan for the variables in vars from bikesJuly (the training data).
  - Set the flag verbose = FALSE to prevent the function from printing too many messages.
- Fill in the blanks to create a vector newvars that contains only the names of the clean and lev transformed variables. Print it.
- Use prepare() to create a one-hot-encoded training data frame bikesJuly.treat.
  - Use the varRestriction argument to restrict the variables you will use to newvars.
- Use prepare() to create a one-hot-encoded test frame bikesAugust.treat from bikesAugust in the same way.
- Call str() on both prepared data frames to see the structure.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# The outcome column
(outcome <- "cnt")
# The input columns
(vars <- c("hr", "holiday", "workingday", "weathersit", "temp", "atemp", "hum", "windspeed"))
# Load the package vtreat
___
# Create the treatment plan from bikesJuly (the training data)
treatplan <- ___(___, ___, verbose = FALSE)
# Get the "clean" and "lev" variables from the scoreFrame
(newvars <- treatplan %>%
  use_series(scoreFrame) %>%        
  filter(code %in% ___) %>%  # get the rows you care about
  use_series(___))           # get the varName column
# Prepare the training data
bikesJuly.treat <- ___(___, ___,  varRestriction = ___)
# Prepare the test data
bikesAugust.treat <- ___(___, ___,  varRestriction = ___)
# Call str() on the treated data
___
___
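For reference, the same workflow can be sketched end to end on stand-in data. This is a minimal sketch, not the official solution. The data frames toyJuly and toyAugust, the helper make_toy(), and the reduced column list are hypothetical, because bikesJuly and bikesAugust exist only in the exercise environment. It also assumes the packages vtreat, magrittr, and dplyr are installed.

```r
library(vtreat)
library(magrittr)  # provides %>% and use_series()
library(dplyr)     # provides filter()

# Hypothetical stand-in data; the real bikesJuly/bikesAugust are not available here
set.seed(1)
make_toy <- function(n) {
  data.frame(
    hr         = sample(c("0", "6", "12", "18"), n, replace = TRUE),
    weathersit = sample(c("Clear", "Misty", "Light Rain"), n, replace = TRUE),
    temp       = runif(n),
    hum        = runif(n),
    stringsAsFactors = FALSE
  )
}
toyJuly   <- make_toy(100)  # plays the role of the training data
toyAugust <- make_toy(100)  # plays the role of the test data

# Reduced variable list for the toy data (assumption: a subset of the exercise's vars)
vars <- c("hr", "weathersit", "temp", "hum")

# Create the treatment plan from the training data only
treatplan <- designTreatmentsZ(toyJuly, vars, verbose = FALSE)

# Keep only the "clean" (numeric) and "lev" (indicator) variables
(newvars <- treatplan %>%
  use_series(scoreFrame) %>%
  filter(code %in% c("clean", "lev")) %>%
  use_series(varName))

# Apply the same plan to the training and test data
toyJuly.treat   <- prepare(treatplan, toyJuly,   varRestriction = newvars)
toyAugust.treat <- prepare(treatplan, toyAugust, varRestriction = newvars)

# Inspect the treated data
str(toyJuly.treat)
str(toyAugust.treat)
```

The treatment plan is designed on the training data alone and then reused for the test data, so both treated frames end up with the same columns.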