
Visualize the bike rental predictions

In the previous exercise, you visualized the bike model's predictions using the standard "outcome vs. prediction" scatter plot. Since the bike rental data is time series data, you might be interested in how the model performs as a function of time. In this exercise, you will compare the predictions and actual rentals on an hourly basis, for the first 14 days of August.

To create the plot, you will use the function tidyr::pivot_longer() to consolidate the predicted and actual values from bikesAugust into a single column. pivot_longer() takes the following arguments:

  • The "wide" data frame to be pivoted (implicit in a pipe).
  • The names of the columns to be collected into a single column (keyword "cols").
  • The name of the key column to be created, which contains the names of the collected columns (keyword "names_to").
  • The name of the value column to be created, which contains the values of the collected columns (keyword "values_to").
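
As a rough illustration of these arguments, the sketch below pivots a small made-up data frame. The data and the column names source and count are hypothetical and chosen only for this example; they are not the names the exercise asks for.

```r
library(tidyr)

# Hypothetical toy data: three hourly observations with actual (cnt)
# and predicted (pred) rental counts. Not taken from bikesAugust.
toy <- data.frame(
  instant = 1:3,
  cnt     = c(10, 20, 30),
  pred    = c(12, 18, 33)
)

toy_long <- pivot_longer(
  toy,
  cols      = c("cnt", "pred"),  # columns to collect into one column
  names_to  = "source",          # key column: holds "cnt" or "pred"
  values_to = "count"            # value column: holds the collected values
)

print(toy_long)
```

Each original row becomes two rows in the long data frame, one per collected column, so a single aesthetic mapping can distinguish actual from predicted values in a plot.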

You'll use the pivoted data frame to compare the actual and predicted rental counts as a function of time. The time index, instant, counts the number of observations since the beginning of data collection. Because the observations are hourly, the sample code converts instant to day units, starting from 0.
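
The conversion can be sketched in base R as follows. The instant values here are hypothetical (the real values in bikesAugust are not shown), and the sketch assumes one observation per hour with no gaps.

```r
# Hypothetical instants: 49 consecutive hourly observations starting
# at an arbitrary offset.
instant <- 5000:5048

# Shift so the first observation is 0, then convert hours to days.
day <- (instant - min(instant)) / 24

# First and last values of the rescaled index
range(day)
```

With this scaling, a value of 1 corresponds to 24 hours after the first observation, which is why the sample code can later keep the first two weeks by comparing the index against 14.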

The bikesAugust data frame, with the predictions (bikesAugust$pred), has been pre-loaded.

This exercise is part of the course

Supervised Learning in R: Regression


Exercise instructions

  • Fill in the blanks to plot the predictions and actual counts by hour for the first 14 days of August.
    • Convert instant to day units rather than hours.
    • Use pivot_longer() to collect the cnt and pred columns into a column called value, with a key column called valuetype.
    • Use filter() to keep only the first two weeks of August.
    • Plot value as a function of instant (day).

Does the model capture the general time patterns in bike rentals?

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Plot predictions and cnt by date/time
bikesAugust %>% 
  # set start to 0, convert unit to days
  mutate(instant = (instant - min(instant))/24) %>%  
  # collect cnt and pred into a value column
  pivot_longer(cols = c('cnt', 'pred'), names_to = '___', values_to = '___') %>%  
  filter(instant < 14) %>% # restrict to first 14 days
  # plot value by instant
  ggplot(aes(x = ___, y = ___, color = valuetype, linetype = valuetype)) + 
  geom_point() + 
  geom_line() + 
  scale_x_continuous("Day", breaks = 0:14, labels = 0:14) + 
  scale_color_brewer(palette = "Dark2") + 
  ggtitle("Predicted August bike rentals, Quasipoisson model")