Exercise

Resolving impartial duplicates

The parking_violation dataset has been modified to include a fee column indicating the fee for the violation. This column would be useful for keeping track of New York City parking ticket revenue. However, due to duplicated violation records, revenue calculations based on the dataset would not be accurate. These duplicate records only differ based on the value in the fee column. All other column values are shared in the duplicated records. A decision has been made to use the minimum fee to resolve the ambiguity created by these duplicates.

Identify the 3 duplicated parking_violation records and use the MIN() function to determine the fee that will be used after removing the duplicate records.

Instructions

100 XP
  • Return the summons_number and the minimum fee for duplicated records.
  • Group the results by summons_number.
  • Restrict the results to records having a summons_number count that is greater than 1.