Masking identifying information with regular expressions
Regular expressions can also be used to replace patterns in strings using REGEXP_REPLACE()
. The function is similar to the REPLACE()
function. Its signature is REGEXP_REPLACE(source, pattern, replace, flags)
.
pattern
is the string pattern to match in thesource
string.replace
is the replacement string to use in place of the pattern.flags
is an optional string used to control matching.
For example, REGEXP_REPLACE(xyz, '\d', '_', 'g')
would replace any digit character (\d
) in the column xyz
with an underscore (_
). The g
("global") flag ensures every match is replaced.
To protect parking violation recipients' privacy in a new web report, all letters in the plate_id
column must be replaced with a dash (-
) to mask the true license plate number.
Este exercício faz parte do curso
Cleaning Data in PostgreSQL Databases
Instruções do exercício
- Use
REGEXP_REPLACE()
to replace all uppercase letters (A
toZ
) in theplate_id
column with a dash character (-
) so that masked license plate numbers can be used in the report.
Exercício interativo prático
Experimente este exercício completando este código de exemplo.
SELECT
summons_number,
-- Replace uppercase letters in plate_id with dash
___(___, ___, ___, 'g')
FROM
parking_violation;