Applying functions for string cleaning
Throughout this course, we will be using a dataset with 5000 New York City parking violation records stored in the parking_violation table.
A service to provide parking violation recipients with a hard copy of the violation is being re-designed. For proper formatting of the output of the information on the report, some fields needs to be changed from the database representation. The changes are as follows:
- For proper text alignment on the form,
violation_locationvalues must be 4 characters in length. - All
P-U(pick-up truck) values in thevehicle_body_typecolumn should use a generalTRKvalue. - Only the first letter in each word in the
street_namecolumn should be capitalized.
The LPAD(), REPLACE(), and INITCAP() functions will be used to effect these changes.
Cet exercice fait partie du cours
Cleaning Data in PostgreSQL Databases
Instructions
- Add
'0'to the beginning of anyviolation_locationthat is less than 4 digits in length using theLPAD()function. - Replace
'P-U'with'TRK'in values within thevehicle_body_typecolumn using theREPLACE()function. - Ensure that only the first letter of words in the
street_namecolumn are capitalized using theINITCAP()function.
Exercice interactif pratique
Essayez cet exercice en complétant cet exemple de code.
SELECT
-- Add 0s to ensure violation_location is 4 characters in length
___(___, ___, ___) AS violation_location,
-- Replace 'P-U' with 'TRK' in vehicle_body_type column
___(___, ___, ___) AS vehicle_body_type,
-- Ensure only first letter capitalized in street_name
___(___) AS ___
FROM
parking_violation;