Random Forest: visualization

Now you need to plot the predictions. With the gradient boosted trees model, you drew a scatter plot of predicted responses vs. actual responses, and a density plot of the residuals. You are now going to adapt those plots to display the results from both models at once.

This exercise is part of the course

Introduction to Spark with sparklyr in R

Exercise instructions

A local tibble both_responses, containing predicted and actual years for both models, has been pre-defined.

  • Update the predicted vs. actual response scatter plot.
    • Use the both_responses dataset.
    • Add a color aesthetic to draw each model in a different color. Use color = model.
    • Rather than drawing the points, use geom_smooth() to draw a smooth curve for each model.
  • Create a tibble of residuals, named residuals.
    • Call mutate() on both_responses.
    • The new column should be called residual.
    • residual should be equal to the predicted response minus the actual response.
  • Update the residual density plot.
    • Add a color aesthetic to draw each model in a different color.
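The residual step above can be sketched on a small stand-in dataset. This is only an illustration: toy_responses and its values are made up, and the model labels are assumptions, since the actual contents of both_responses are not shown here. Only the column names model, actual, and predicted are taken from the exercise.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for both_responses; all values are invented
toy_responses <- tibble(
  model     = rep(c("gradient_boosted_trees", "random_forest"), each = 3),
  actual    = c(1990, 2000, 2010, 1990, 2000, 2010),
  predicted = c(1995, 1999, 2004, 1993, 2001, 2007)
)

# residual = predicted response minus actual response
toy_residuals <- toy_responses %>%
  mutate(residual = predicted - actual)

toy_residuals
```

A positive residual means the model predicted a later year than the actual one; a negative residual means it predicted an earlier year.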

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# both_responses has been pre-defined
both_responses

# Draw a scatterplot of predicted vs. actual
ggplot(___, aes(actual, predicted, ___)) +
  # Add a smoothed line
  ___ +
  # Add a line at actual = predicted
  geom_abline(intercept = 0, slope = 1)

# Create a tibble of residuals
residuals <- ___

# Draw a density plot of residuals
ggplot(residuals, aes(residual, ___)) +
  # Add a density curve
  geom_density() +
  # Add a vertical line through zero
  geom_vline(xintercept = 0)
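
As a rough illustration of how a color aesthetic separates two models in both plots, here is a self-contained sketch on simulated data. The data frame toy, the labels model_a and model_b, and the noise level are all assumptions made for this example; they are not the course's both_responses data.

```r
library(ggplot2)

# Simulated stand-in data (hypothetical): two models, 50 observations each
set.seed(1)
toy <- data.frame(
  model  = rep(c("model_a", "model_b"), each = 50),
  actual = rep(seq(1960, 2009), times = 2)
)
toy$predicted <- toy$actual + rnorm(100, sd = 5)
toy$residual  <- toy$predicted - toy$actual

# Predicted vs. actual: one smoothed curve per model, plus the
# reference line where predicted equals actual
p_scatter <- ggplot(toy, aes(actual, predicted, color = model)) +
  geom_smooth() +
  geom_abline(intercept = 0, slope = 1)

# Residual density: one curve per model, plus a vertical line at zero
p_density <- ggplot(toy, aes(residual, color = model)) +
  geom_density() +
  geom_vline(xintercept = 0)

# Print p_scatter or p_density to render the plots
```

Mapping color to model inside aes() also groups the data by model, which is why geom_smooth() and geom_density() each draw one curve per model rather than a single pooled curve.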