CommencerCommencer gratuitement

Resolving impartial duplicates

The parking_violation dataset has been modified to include a fee column indicating the fee for the violation. This column would be useful for keeping track of New York City parking ticket revenue. However, due to duplicated violation records, revenue calculations based on the dataset would not be accurate. These duplicate records only differ based on the value in the fee column. All other column values are shared in the duplicated records. A decision has been made to use the minimum fee to resolve the ambiguity created by these duplicates.

Identify the 3 duplicated parking_violation records and use the MIN() function to determine the fee that will be used after removing the duplicate records.

Cet exercice fait partie du cours

Cleaning Data in PostgreSQL Databases

Afficher le cours

Instructions

  • Return the summons_number and the minimum fee for duplicated records.
  • Group the results by summons_number.
  • Restrict the results to records having a summons_number count that is greater than 1.

Exercice interactif pratique

Essayez cet exercice en complétant cet exemple de code.

SELECT 
	-- Include SELECT list columns
	___, 
    ___(___) AS fee
FROM 
	parking_violation 
GROUP BY
	-- Define column for GROUP BY
	___ 
HAVING 
	-- Restrict to summons numbers with count greater than 1
	___(___) ___ ___;
Modifier et exécuter le code