Validating data quality
Data isn't always very clean. Both the composer
field in the track
table and the name
in the artist
table contain information about a song's author. You'd like to validate the data quality in these fields for each track. To do this, you'll use a CASE
statement and the columns from JOIN
'ed tables. Go get 'em.
This exercise is part of the course
Data Manipulation in Snowflake
Exercise instructions
- When the
track.composer
field isNULL
, then label as'Track Lacks Detail'
. - If the
track.composer
field matches theartist.name
, then return'Matching Artist'
. - Finally,
LEFT JOIN
theartist
table toalbum
using the fieldartist_id
; note thattrack
andalbum
have already been joined for you.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
SELECT
track.name,
track.composer,
artist.name,
CASE
-- A 'Track Lacks Detail' if the composer field is NULL
WHEN track.composer ___ ___ THEN 'Track Lacks Detail'
-- Use the composer and artist name to determine if a match exists
___ track.composer = ___.name ___ '___'
ELSE 'Inconsistent Data'
END AS data_quality
FROM store.track AS track
LEFT JOIN store.album AS album ON track.album_id = album.album_id
-- Join the album table to artist using the artist_id field
___ JOIN store.___ AS artist ON album.___ = ___.___;