Back to the future
The data pipeline feeding into the ride_sharing DataFrame has been updated to register each ride's date. This information is stored in the ride_date column, which has the object dtype, the type pandas uses to represent strings.
A bug was discovered that recorded rides taken today as taken next year. To fix this, you will find all values in the ride_date column that fall in the future and set them to today's date, so that today's date becomes the maximum possible value of the column. Before doing so, you need to convert ride_date to a datetime object.
The datetime package has been imported as dt, alongside all the packages you've been using until now.
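As a minimal sketch of what the conversion does, the snippet below builds a small stand-in DataFrame and parses its string column with pd.to_datetime(). The name rides and the dates in it are made up for illustration and are not the course data; the object dtype is set explicitly so the starting point matches the description above.

```python
import pandas as pd

# Hypothetical stand-in for ride_sharing; the dates are invented for illustration.
rides = pd.DataFrame(
    {"ride_date": pd.Series(["2019-03-01", "2019-07-15", "2020-01-05"], dtype="object")}
)

# Before conversion the column holds plain strings (object dtype).
dtype_before = rides["ride_date"].dtype

# pd.to_datetime parses the strings into pandas datetime values.
parsed = pd.to_datetime(rides["ride_date"])
dtype_after = parsed.dtype

print(dtype_before, "->", dtype_after)
```

Once the column is parsed, date components and comparisons against other dates become available through the .dt accessor.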
This exercise is part of the course Cleaning Data in Python
Exercise instructions
- Convert ride_date to a datetime object using to_datetime(), then convert the datetime object into a date and store it in the ride_dt column.
- Create the variable today, which stores today's date by using the dt.date.today() function.
- For all instances of ride_dt in the future, set them to today's date.
- Print the maximum date in the ride_dt column.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Convert ride_date to date
ride_sharing['ride_dt'] = pd.____(____['____']).dt.date
# Save today's date
today = ____
# Set all in the future to today's date
ride_sharing.____[____['____'] > ____, '____'] = ____
# Print maximum of ride_dt column
print(ride_sharing['ride_dt'].____())
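One possible way to fill in the blanks is sketched below on a small stand-in DataFrame. The sample rows are assumptions made up for this illustration (the real ride_sharing data is not shown here), and one date is deliberately placed a year ahead to mimic the bug.

```python
import datetime as dt
import pandas as pd

# Save today's date
today = dt.date.today()

# Hypothetical stand-in for ride_sharing: two past rides and one dated a year ahead.
next_year = (today + dt.timedelta(days=365)).isoformat()
ride_sharing = pd.DataFrame({"ride_date": ["2019-03-01", "2019-07-15", next_year]})

# Convert ride_date to datetime, then keep only the date part
ride_sharing["ride_dt"] = pd.to_datetime(ride_sharing["ride_date"]).dt.date

# Set all dates in the future to today's date
ride_sharing.loc[ride_sharing["ride_dt"] > today, "ride_dt"] = today

# Print maximum of ride_dt column
print(ride_sharing["ride_dt"].max())
```

Using .loc with a boolean mask and a column label updates only the rows that fall in the future, leaving valid dates untouched.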