Resolving impartial duplicates
The parking_violation dataset has been modified to include a fee column indicating the fee for the violation. This column would be useful for keeping track of New York City parking ticket revenue. However, due to duplicated violation records, revenue calculations based on the dataset would not be accurate. These duplicate records only differ based on the value in the fee column. All other column values are shared in the duplicated records. A decision has been made to use the minimum fee to resolve the ambiguity created by these duplicates.
Identify the 3 duplicated parking_violation records and use the MIN() function to determine the fee that will be used after removing the duplicate records.
Este exercício faz parte do curso
Cleaning Data in PostgreSQL Databases
Instruções do exercício
- Return the
summons_numberand the minimumfeefor duplicated records. - Group the results by
summons_number. - Restrict the results to records having a
summons_numbercount that is greater than 1.
Exercício interativo prático
Experimente este exercício completando este código de exemplo.
SELECT
-- Include SELECT list columns
___,
___(___) AS fee
FROM
parking_violation
GROUP BY
-- Define column for GROUP BY
___
HAVING
-- Restrict to summons numbers with count greater than 1
___(___) ___ ___;