Distribution of cab fare amount
Let's look at how much cab rides cost in NYC by plotting a histogram of the total cab fare. Since fare amounts are likely to be highly skewed, we will plot the x-axis on a log scale.
The tx data set is preloaded for you.
This exercise is part of the course
Visualizing Big Data with Trelliscope in R
Exercise instructions
- Plot the distribution of the total cab fare, total_amount, using geom_histogram().
- In the last line, apply a log base 10 scale to the x-axis using scale_x_log10(). Note that you will receive a warning message about 62 data points that have a total fare of $0. These points are dropped from the plot because the logarithm of 0 is not finite.
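To see why the $0 fares are dropped, it can help to take the logarithm by hand. The snippet below is a small illustration only; the fares vector is made up for this example and is not taken from tx.

```r
# Hypothetical fare values, including one $0 fare (not from the tx data)
fares <- c(0, 1, 10, 100)

# log10(0) is -Inf, so it has no position on a log-scaled axis
log_fares <- log10(fares)
log_fares             # -Inf 0 1 2

# Only finite values can be binned and drawn
is.finite(log_fares)  # FALSE TRUE TRUE TRUE
```

Every fare that is exactly 0 ends up as a non-finite value after the transformation, which is what the warning message reports.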
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
library(ggplot2)
# Create a histogram of total_amount
ggplot(___, aes(___)) +
___ +
___
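One way the blanks could be filled in is sketched below. Because the preloaded tx data is not available outside the exercise, this sketch builds a small simulated stand-in with a total_amount column; the simulated values (and the two $0 fares added to trigger the warning) are assumptions for illustration, not the course data.

```r
library(ggplot2)

# Hypothetical stand-in for the preloaded tx data: skewed positive fares
# plus two $0 fares (assumed values, not the real data set)
set.seed(1)
tx <- data.frame(
  total_amount = c(rlnorm(1000, meanlog = 2.5, sdlog = 0.6), 0, 0)
)

# Create a histogram of total_amount with a log base 10 x-axis
p <- ggplot(tx, aes(total_amount)) +
  geom_histogram() +
  scale_x_log10()

# Printing the plot triggers the warning about non-finite values
p
```

With the real tx data, only the ggplot() call is needed; the warning there should mention the 62 zero-fare rides described in the instructions.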