Relationship between trip duration and total fare

We would expect the total cab fare to be related to the duration of the trip. Because there are too many data points for a scatterplot to be readable, let's use a hexagon-binned plot to investigate this relationship.

tx is available for you in your workspace.

This exercise is part of the course

Visualizing Big Data with Trelliscope in R

Instructions

  • Use hexagon bins to visualize the bivariate distribution of total_amount (y-axis) vs. trip_duration (x-axis).
  • Set the bins argument of geom_hex() to 75.
  • Since both variables are highly skewed, rescale both the x and y axes to log base 10. Note that these transformations will generate some warnings about a relatively small number of records with zero trip duration or fare amount.
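
The three steps above can be sketched on made-up data. This is only an illustration, not the exercise's reference solution: fake_tx and the way its columns are simulated are hypothetical stand-ins for tx, chosen so that both variables are right-skewed and strictly positive. geom_hex() relies on the hexbin package being installed.

```r
library(ggplot2)  # geom_hex() additionally requires the hexbin package

# Hypothetical stand-in for tx: synthetic, right-skewed values (not real taxi data)
set.seed(42)
n <- 10000
trip_duration <- rlnorm(n, meanlog = 6.5, sdlog = 0.7)           # arbitrary units
total_amount  <- 2.5 + 0.01 * trip_duration * rlnorm(n, 0, 0.3)  # made-up fare model
fake_tx <- data.frame(trip_duration, total_amount)

# Hexagon bins for the bivariate distribution, with both axes on a log10 scale
p <- ggplot(fake_tx, aes(trip_duration, total_amount)) +
  geom_hex(bins = 75) +
  scale_x_log10() +
  scale_y_log10()

print(p)
```

Because the simulated values are all positive, this sketch should not trigger the zero-value warnings mentioned above; with the real tx data, records with a zero duration or fare become non-finite under the log transformation and are dropped with a warning.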

Hands-on interactive exercise

Try this exercise by completing this sample code.

library(ggplot2)

# Create a hexagon-binned plot of total_amount vs. trip_duration
ggplot(tx, aes(___, ___)) +
  ___ +
  ___ +
  ___