Back to the future
A recent update to the data pipeline feeding the ride_sharing DataFrame now registers each ride's date. This information is stored in the ride_date column, which has the type object, the type that represents strings in pandas.
A bug was discovered that recorded rides taken today as taken next year. To fix this, you will find all values in the ride_date column that fall in the future and set them to today's date, so that the maximum possible value of the column is today's date. Before doing so, you need to convert ride_date to a datetime object.
The datetime package has been imported as dt, alongside all the packages you've been using so far.
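As a rough illustration of the conversion step, the sketch below parses a string column into datetimes and then keeps only the calendar date. The rides DataFrame and its values are made up for this example and are not the course's ride_sharing data.

```python
import datetime as dt
import pandas as pd

# Hypothetical stand-in for ride_sharing; the dates are invented for illustration.
rides = pd.DataFrame({"ride_date": ["2019-03-04", "2018-11-20", "2019-01-15"]})

# The column holds plain strings at this point, so date comparisons are not meaningful yet.
print(rides["ride_date"].dtype)

# Parse the strings into pandas datetimes, then keep only the calendar date.
parsed = pd.to_datetime(rides["ride_date"])
rides["ride_dt"] = parsed.dt.date

# Each element of ride_dt is now a datetime.date object.
print(type(rides["ride_dt"].iloc[0]))
```

The .dt.date accessor yields Python date objects, which is what allows a direct comparison against dt.date.today() later on.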
This exercise is part of the course
Cleaning Data in Python
Exercise instructions
- Convert ride_date to a datetime object using to_datetime(), then convert the datetime object into a date and store it in the ride_dt column.
- Create the variable today, which stores today's date by using the dt.date.today() function.
- For all instances of ride_dt in the future, set them to today's date.
- Print the maximum date in the ride_dt column.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Convert ride_date to date
ride_sharing['ride_dt'] = pd.____(____['____']).dt.date
# Save today's date
today = ____
# Set all in the future to today's date
ride_sharing.____[____['____'] > ____, '____'] = ____
# Print maximum of ride_dt column
print(ride_sharing['ride_dt'].____())
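For reference, here is a minimal sketch of the same pattern on a toy DataFrame, assuming a boolean mask with .loc is the intended way to cap the dates. The rides DataFrame, the in_future mask, and the sample dates are hypothetical and built relative to today so that one row always lies in the future.

```python
import datetime as dt
import pandas as pd

# Save today's date, plus one past and one future date for the toy data.
today = dt.date.today()
past = today - dt.timedelta(days=30)
future = today + dt.timedelta(days=365)

# Hypothetical stand-in for ride_sharing, with dates stored as strings.
rides = pd.DataFrame({"ride_date": [str(past), str(future), str(today)]})

# Convert the strings to datetimes, then to plain dates.
rides["ride_dt"] = pd.to_datetime(rides["ride_date"]).dt.date

# Boolean mask of rows dated after today; .loc overwrites only those rows.
in_future = rides["ride_dt"] > today
rides.loc[in_future, "ride_dt"] = today

# After capping, the maximum of the column should be today's date.
print(rides["ride_dt"].max())
```

Using .loc with a row mask and a column label modifies the DataFrame in place, which avoids the chained-assignment pitfalls of indexing twice.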