EDA plots II
Another idea that comes to mind is that the price of a ride could change during the day.
Your goal is to plot the median fare amount for each hour of the day as a simple line plot. The hour feature is calculated for you. Don't worry if you do not know how to work with the date features. We will explore them in the chapter on Feature Engineering.
This exercise is part of the course Winning a Kaggle Competition in Python.
Exercise instructions
- Group the train DataFrame by "hour" and calculate the median for the "fare_amount" column.
- Using the hour_price DataFrame obtained, plot a line with "hour" on the x axis and "fare_amount" on the y axis.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
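# Imports used below (the train DataFrame is assumed to be loaded already)
import pandas as pd
import matplotlib.pyplot as plt
# Create hour feature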
train['pickup_datetime'] = pd.to_datetime(train.pickup_datetime)
train['hour'] = train.pickup_datetime.dt.hour
# Find median fare_amount for each hour
hour_price = train.____('____', as_index=False)['____'].____()
# Plot the line plot
plt.plot(hour_price[____], hour_price[____], marker='o')
plt.xlabel('Hour of the day')
plt.ylabel('Median fare amount')
plt.title('Median fare amount by hour of the day')
plt.xticks(range(24))
plt.show()
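To see the same group-then-plot pattern end to end without the blanks, here is a minimal sketch on a made-up toy DataFrame. The six rows below are invented for illustration and are not the competition data; only the column names ("pickup_datetime", "hour", "fare_amount") come from the exercise. Treat it as one possible way to do it, not as the official solution.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for the train DataFrame (values are invented for illustration)
train = pd.DataFrame({
    "pickup_datetime": [
        "2015-01-01 08:15:00", "2015-01-01 08:45:00", "2015-01-02 08:30:00",
        "2015-01-01 17:05:00", "2015-01-02 17:50:00", "2015-01-03 23:10:00",
    ],
    "fare_amount": [6.0, 8.0, 10.0, 12.5, 15.5, 20.0],
})

# Create the hour feature from the pickup timestamp
train["pickup_datetime"] = pd.to_datetime(train["pickup_datetime"])
train["hour"] = train["pickup_datetime"].dt.hour

# Median fare per hour; as_index=False keeps "hour" as a regular column
hour_price = train.groupby("hour", as_index=False)["fare_amount"].median()

# Line plot of median fare against hour of the day
fig, ax = plt.subplots()
ax.plot(hour_price["hour"], hour_price["fare_amount"], marker="o")
ax.set_xlabel("Hour of the day")
ax.set_ylabel("Median fare amount")
ax.set_xticks(range(24))
# In an interactive session, call plt.show() to display the figure
```

Passing as_index=False is what lets the plot refer to hour_price["hour"] directly; without it, "hour" would become the index of the result rather than a column.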