Exercise

Visualize the bike rental predictions

In the previous exercise, you visualized the bike model's predictions using the standard "outcome vs. prediction" scatter plot. Since the bike rental data is time series data, you might be interested in how the model performs as a function of time. In this exercise, you will compare the predictions and actual rentals on an hourly basis, for the first 14 days of August.

To create the plot you will use the function tidyr::gather() to consolidate the predicted and actual values from bikesAugust in a single column. gather() takes as arguments:

  • The "wide" data frame to be gathered (implicit in a pipe).
  • The name of the key column to be created, which will hold the names of the gathered columns.
  • The name of the value column to be created, which will hold the values of the gathered columns.
  • The names of the columns to be gathered into a single column.
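As a minimal sketch of how these arguments fit together, the example below gathers two columns of a small made-up data frame. The data frame wide and its values are hypothetical and only stand in for bikesAugust; the column names cnt, pred, and instant and the names valuetype and value are taken from this exercise.

```r
library(tidyr)

# Toy "wide" data frame with made-up values (NOT the real bike data)
wide <- data.frame(
  instant = 1:3,
  cnt     = c(10, 20, 30),
  pred    = c(12, 18, 33)
)

# Gather cnt and pred into a key column (valuetype) and a value column (value).
# Columns that are not named, here instant, are kept and repeated for each key.
long <- gather(wide, key = valuetype, value = value, cnt, pred)
print(long)
```

Each original row now appears twice, once with valuetype equal to "cnt" and once with "pred", which is the shape needed to draw both series on one plot.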

You'll use the gathered data frame to compare the actual and predicted rental counts as a function of time. The time index, instant, counts the number of observations since the beginning of data collection. The sample code converts instant to day units, starting from 0.
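One way to do this kind of conversion, assuming instant advances by one per hourly observation, is to subtract the smallest value and divide by 24. This is only an illustrative sketch with made-up index values; the exercise's own sample code may do the conversion differently.

```r
# Hypothetical hourly index values (NOT taken from bikesAugust)
instant <- c(5089, 5090, 5112, 5113, 5137)

# Shift so the first observation is at 0, then divide by 24 hours per day
day <- (instant - min(instant)) / 24
print(day)
```

With this scaling, whole numbers mark the start of each day and fractional parts give the hour within the day.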

Instructions

The data frame bikesAugust, with the predictions (bikesAugust$pred), is in the workspace.

  • Fill in the blanks to plot the predictions and actual counts by hour for the first 14 days of August.
    • Convert instant to day units rather than hours.
    • gather() the cnt and pred columns into a column called value, with a key called valuetype.
    • filter() for the first two weeks of August.
    • Plot value as a function of instant (day).
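The steps above can be sketched end to end on synthetic data. Everything about bikes_demo below is an assumption made for illustration (the values, the starting index, the 20-day length, and the hourly spacing of instant); it is not the real bikesAugust, and this is not necessarily the exercise's expected solution.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Synthetic stand-in for bikesAugust: 20 days of hourly rows, made-up values
set.seed(1)
n <- 20 * 24
hour_index <- seq_len(n)
bikes_demo <- data.frame(instant = 5000 + hour_index)
bikes_demo$pred <- 100 + 80 * sin(2 * pi * hour_index / 24)
bikes_demo$cnt  <- round(bikes_demo$pred + rnorm(n, sd = 10))

plot_data <- bikes_demo %>%
  # Convert the hourly index to days, starting from 0
  mutate(instant = (instant - min(instant)) / 24) %>%
  # Stack cnt and pred into one value column, labelled by valuetype
  gather(key = valuetype, value = value, cnt, pred) %>%
  # Keep the first two weeks only
  filter(instant < 14)

# Plot value against day, one colour and line type per valuetype
p <- ggplot(plot_data, aes(x = instant, y = value,
                           color = valuetype, linetype = valuetype)) +
  geom_point() +
  geom_line() +
  scale_x_continuous("Day", breaks = 0:14, labels = 0:14) +
  ggtitle("Predicted vs. actual rentals (synthetic data)")
print(p)
```

Mapping valuetype to both color and linetype keeps the two series distinguishable where they overlap.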

Does the model capture the general time patterns in bike rentals?