Visualize the bike rental predictions
In the previous exercise, you visualized the bike model's predictions using the standard "outcome vs. prediction" scatter plot. Since the bike rental data is time series data, you might be interested in how the model performs as a function of time. In this exercise, you will compare the predictions and actual rentals on an hourly basis, for the first 14 days of August.
To create the plot you will use the function tidyr::pivot_longer() to consolidate the predicted and actual values from bikesAugust in a single column. pivot_longer() takes as arguments:
- The "wide" data frame to be pivoted (implicit in a pipe)
- The names of the columns to be collected into a single column (keyword "cols").
- The name of the key column to be created - contains the names of the collected columns (keyword "names_to").
- The name of the value column to be created - contains the values of the collected columns (keyword "values_to").
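As a minimal sketch of those arguments in action, the example below pivots a small made-up data frame. The name toy and its values are hypothetical, chosen only to mimic the cnt and pred columns; this is not the real bikesAugust data.

```r
library(tidyr)

# Hypothetical toy data, shaped like a tiny slice of bikesAugust
toy <- data.frame(
  instant = c(1, 2, 3),
  cnt     = c(10, 20, 30),       # made-up actual counts
  pred    = c(12.5, 18.0, 29.1)  # made-up predictions
)

# Collect cnt and pred into one value column;
# valuetype records which column each value came from
toy_long <- pivot_longer(
  toy,
  cols      = c("cnt", "pred"),
  names_to  = "valuetype",
  values_to = "value"
)

print(toy_long)
```

Each original row becomes two rows, one for cnt and one for pred, so the three-row toy frame should come out with six rows and the columns instant, valuetype, and value.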
You'll use the pivoted data frame to compare the actual and predicted rental counts as a function of time. The time index, instant, counts the number of observations since the beginning of data collection. The sample code converts the instants to daily units, starting from 0.
The bikesAugust data frame, with the predictions (bikesAugust$pred), has been pre-loaded.
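The conversion of instant to day units can be sketched on a few made-up index values. This assumes one observation per hour, which is what dividing by 24 in the sample code implies; the numbers below are hypothetical and not taken from bikesAugust.

```r
# Hypothetical hourly index values; the starting offset is made up
instant <- c(5000, 5001, 5024, 5048)

# Shift so the first observation is 0, then convert hours to days
days <- (instant - min(instant)) / 24

print(days)
```

Under that assumption, the first observation maps to day 0, and the observations 24 and 48 hours later map to day 1 and day 2.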
This exercise is part of the course
Supervised Learning in R: Regression
Instructions
- Fill in the blanks to plot the predictions and actual counts by hour for the first 14 days of August.
  - Convert instant to be in day units, rather than hours.
  - pivot_longer() the cnt and pred columns into a column called value, with a key called valuetype.
  - filter() for the first two weeks of August.
  - Plot value as a function of instant (day).
Does the model see the general time patterns in bike rentals?
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Plot predictions and cnt by date/time
bikesAugust %>%
  # set start to 0, convert unit to days
  mutate(instant = (instant - min(instant))/24) %>%
  # collect cnt and pred into a value column
  pivot_longer(cols = c('cnt', 'pred'), names_to = '___', values_to = '___') %>%
  filter(instant < 14) %>% # restrict to first 14 days
  # plot value by instant
  ggplot(aes(x = ___, y = ___, color = valuetype, linetype = valuetype)) +
    geom_point() +
    geom_line() +
    scale_x_continuous("Day", breaks = 0:14, labels = 0:14) +
    scale_color_brewer(palette = "Dark2") +
    ggtitle("Predicted August bike rentals, Quasipoisson model")