Get startedGet started for free

Applying functions for string cleaning

Throughout this course, we will be using a dataset with 5000 New York City parking violation records stored in the parking_violation table.

A service to provide parking violation recipients with a hard copy of the violation is being re-designed. For proper formatting of the output of the information on the report, some fields needs to be changed from the database representation. The changes are as follows:

  • For proper text alignment on the form, violation_location values must be 4 characters in length.
  • All P-U (pick-up truck) values in the vehicle_body_type column should use a general TRK value.
  • Only the first letter in each word in the street_name column should be capitalized.

The LPAD(), REPLACE(), and INITCAP() functions will be used to effect these changes.

This exercise is part of the course

Cleaning Data in PostgreSQL Databases

View Course

Exercise instructions

  • Add '0' to the beginning of any violation_location that is less than 4 digits in length using the LPAD() function.
  • Replace 'P-U' with 'TRK' in values within the vehicle_body_type column using the REPLACE() function.
  • Ensure that only the first letter of words in the street_name column are capitalized using the INITCAP() function.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

SELECT
  -- Add 0s to ensure violation_location is 4 characters in length
  ___(___, ___, ___) AS violation_location,
  -- Replace 'P-U' with 'TRK' in vehicle_body_type column
  ___(___, ___, ___) AS vehicle_body_type,
  -- Ensure only first letter capitalized in street_name
  ___(___) AS ___
FROM
  parking_violation;
Edit and Run Code