Analyzing incomplete records
In an effort to reduce the number of missing vehicle_body_type
values going forward, your team has decided to embark on a campaign to educate issuing agencies on the need for complete data. However, each campaign will be customized for individual agencies.
In this exercise, your goal is to use the current missing data values to prioritize these campaigns. You will write a query which outputs the issuing agencies along with the number of records attributable to that agency with a NULL
vehicle_body_type
. These records will be listed in descending order to determine the order in which education campaigns should be developed.
This exercise is part of the course
Cleaning Data in PostgreSQL Databases
Exercise instructions
- Specify two columns for the query results:
issuing_agency
andnum_missing
(the number of missing vehicle body types for the issuing agency). - Restrict the results such that only
NULL
values forvehicle_body_type
are counted. - Group the results by
issuing_agency
. - Order the results by
num_missing
in descending order.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
SELECT
-- Define the SELECT list: issuing_agency and num_missing
___,
COUNT(*) AS ___
FROM
parking_violation
WHERE
-- Restrict the results to NULL vehicle_body_type values
___
-- Group results by issuing_agency
___
-- Order results by num_missing in descending order
___;