Removing partial duplicates
Now that you've identified and removed the full duplicates, it's time to check for partial duplicates. Partial duplicates are a bit tricker to deal with than full duplicates. In this exercise, you'll first identify any partial duplicates and then practice the most common technique to deal with them, which involves dropping all partial duplicates, keeping only the first.
dplyr
is loaded and bike_share_rides
is available.
This exercise is part of the course
Cleaning Data in R
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Find duplicated ride_ids
bike_share_rides %>%
# Count the number of occurrences of each ride_id
___ %>%
# Filter for rows with a count > 1
___