Visualize random forest bike model predictions
In the previous exercise, you saw that the random forest bike model performed better on the August data than the quasipoisson model, as measured by RMSE.
In this exercise, you will visualize the random forest model's August predictions as a function of time. The corresponding plot from the quasipoisson model that you built in a previous exercise is available for you to compare.
Recall that the quasipoisson model mostly identified the pattern of slow and busy hours in the day, but it somewhat underestimated peak demands. You would like to see how the random forest model compares.
The data frame bikesAugust (with predictions) has been made available for you. The plot quasipoisson_plot of quasipoisson model predictions as a function of time is shown.
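As a reminder of what the RMSE comparison mentioned above measures, here is a minimal sketch. The helper name rmse and the toy vectors are assumptions made for illustration; they are not the course data or the course's own function.

```r
# Hypothetical helper: root mean squared error between predictions and actuals
rmse <- function(pred, actual) {
  sqrt(mean((pred - actual)^2))
}

# Toy values for illustration only (not the bike data)
actual <- c(10, 20, 30, 40)
pred_a <- c(12, 18, 33, 41)  # stand-in for one model's predictions
pred_b <- c(15, 14, 36, 47)  # stand-in for another model's predictions

rmse_a <- rmse(pred_a, actual)
rmse_b <- rmse(pred_b, actual)

# The model with the smaller RMSE is closer to the actual values on average
rmse_a < rmse_b
```

RMSE summarizes error in a single number, which is why a plot over time is still useful: it can show where in the day the errors occur.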
This exercise is part of the course Supervised Learning in R: Regression.
Exercise instructions
- Fill in the blanks to plot the predictions and actual counts by hour for the first 14 days of August.
  - Use pivot_longer to gather the cnt and pred columns into a column called value, with a key called valuetype (the sample code stores the predictions in a column named rf).
  - Plot value as a function of instant (day).
How does the random forest model compare?
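The reshaping step described in the instructions can be sketched on a tiny data frame. The toy values are made up for illustration; only the column names mirror the exercise.

```r
library(tidyr)

# Toy data for illustration only; column names mirror the exercise
toy <- data.frame(
  instant = c(0, 1, 2),
  cnt     = c(10, 25, 40),  # actual counts
  pred    = c(12, 22, 35)   # model predictions
)

# Stack cnt and pred into one `value` column, labelled by `valuetype`
toy_long <- pivot_longer(
  toy,
  cols      = c("cnt", "pred"),
  names_to  = "valuetype",
  values_to = "value"
)

print(toy_long)
```

Each original row becomes two rows, one per series. That long layout is what lets ggplot2 map valuetype to color and linetype in a single layer.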
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Packages used below (already loaded in the course environment)
library(dplyr)
library(tidyr)
library(ggplot2)

first_two_weeks <- bikesAugust %>%
  mutate(rf = bike_outcomesAugust$rf) %>%
  # Set start to 0, convert unit to days
  mutate(instant = (instant - min(instant)) / 24) %>%
  # Filter for rows in the first two weeks
  filter(instant < 14) %>%
  # Collect cnt and the predictions (rf) into a column named value with key valuetype
  pivot_longer(c('cnt', 'rf'), names_to = '___', values_to = '___')
# Plot predictions and cnt by date/time
ggplot(___, aes(x = ___, y = ___, color = valuetype, linetype = valuetype)) +
  geom_point() +
  geom_line() +
  scale_x_continuous("Day", breaks = 0:14, labels = 0:14) +
  scale_color_brewer(palette = "Dark2") +
  ggtitle("Predicted August bike rentals, Random Forest plot")
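To see how the time rescaling and the plot fit together, here is a self-contained sketch on synthetic hourly data. Everything in it is an assumption made for illustration: the data frame toy, the sine-shaped counts, the noisy rf column, and the starting index of 5000 are invented and do not come from bikesAugust.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)

# Synthetic hourly data for illustration only (30 days x 24 hours)
hours <- 0:(30 * 24 - 1)
toy <- data.frame(
  instant = 5000 + hours,                        # arbitrary hourly index
  cnt     = 100 + 80 * sin(2 * pi * hours / 24)  # daily cycle in "actual" counts
)
toy$rf <- toy$cnt + rnorm(nrow(toy), sd = 10)    # noisy stand-in predictions

toy_two_weeks <- toy %>%
  # Shift the index to start at 0 and convert hours to days
  mutate(instant = (instant - min(instant)) / 24) %>%
  # Keep the first 14 days
  filter(instant < 14) %>%
  # Long format: one row per (time, series) pair
  pivot_longer(c("cnt", "rf"), names_to = "valuetype", values_to = "value")

toy_plot <- ggplot(toy_two_weeks,
                   aes(x = instant, y = value,
                       color = valuetype, linetype = valuetype)) +
  geom_point() +
  geom_line() +
  scale_x_continuous("Day", breaks = 0:14, labels = 0:14) +
  scale_color_brewer(palette = "Dark2") +
  ggtitle("Toy example: actual vs. predicted counts")

print(toy_plot)
```

Dividing the shifted hourly index by 24 puts the x-axis in days, so integer breaks line up with the start of each day and the daily cycle is easy to read.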