Get startedGet started for free

Resolving impartial duplicates

The parking_violation dataset has been modified to include a fee column indicating the fee for the violation. This column would be useful for keeping track of New York City parking ticket revenue. However, due to duplicated violation records, revenue calculations based on the dataset would not be accurate. These duplicate records only differ based on the value in the fee column. All other column values are shared in the duplicated records. A decision has been made to use the minimum fee to resolve the ambiguity created by these duplicates.

Identify the 3 duplicated parking_violation records and use the MIN() function to determine the fee that will be used after removing the duplicate records.

This exercise is part of the course

Cleaning Data in PostgreSQL Databases

View Course

Exercise instructions

  • Return the summons_number and the minimum fee for duplicated records.
  • Group the results by summons_number.
  • Restrict the results to records having a summons_number count that is greater than 1.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

SELECT 
	-- Include SELECT list columns
	___, 
    ___(___) AS fee
FROM 
	parking_violation 
GROUP BY
	-- Define column for GROUP BY
	___ 
HAVING 
	-- Restrict to summons numbers with count greater than 1
	___(___) ___ ___;
Edit and Run Code