The First Norm
1. The First Norm
Hello again. We will now focus on the data normalization process.2. The process of data normalization
Data normalization is the process of organizing data in a relational model to reduce redundancy and prevent anomalies during data insertion or updates. Normalization is a multi-step process that ensures data integrity across entity relationships.3. The process of data normalization(1)
Normalization involves three forms: the first normal form, or 1NF, targets eliminating multiple values in data.4. The process of data normalization (2)
The second normal form, or 2NF, focuses on removing functional dependencies.5. The process of data normalization (3)
And the third normal form, or 3NF, aims to eliminate transitive dependencies. We will detail each of these normal forms in the following videos. Let's start with the first normal form.6. The first normal form
The first normal form, or 1NF, ensures each table attribute contains unique, indivisible values, eliminating repeating groups. Take our allproducts entity; its category attribute has comma-separated values, this violates the first normal form because these aren't single categories.7. The first normal form (1)
In our current unnormalized 'allproducts' entity, updating a single 'L' category proves challenging since it's not isolated.8. The first normal form (2)
Querying the category attribute reveals multiple comma-separated values in single rows, indicating non-compliance with the first normal form. Our task is to separate these into distinct rows for each value.9. Snowflake functions for 1NF
To address category values that are not single in their cells, Snowflake has handy functions to rescue us: TRIM, LATERAL, FLATTEN, and SPLIT.10. Snowflake functions for 1NF
TRIM cleans the data by removing empty spaces from the start and end of values.11. Snowflake functions for 1NF
LATERAL, combined with FLATTEN, treats complex lists as tables, separating items into individual rows using SPLIT on a delimiter, which, here, is commas. f.value results from this process, where 'f' is the alias for our flattened data, and '.value' is each item. And finally, we specify 'allproducts'.category with LATERAL to ensure that Snowflake applies these functions to this specific attribute.12. Applying 1NF
To achieve the first normal form, we'll create a distinct categories entity using CREATE OR REPLACE TABLE. This new entity will ensure each category is alone per row. It includes two attributes: category_id as the primary key and category as the unique value. This setup marks the way for future relationships in our data model.13. Applying 1NF
Now, let's populate our new categories entity! We must split and insert the comma-separated values from the category attribute into the categories entity. First, we use INSERT INTO, a SQL command, to insert new rows into an entity. Specify the entity, categories, and the attributes to be populated: category_id and category.14. Applying 1NF
We'll then SELECT FROM the all products entity, extracting the unnormalized data.15. Applying 1NF
We use Snowflake's LATERAL FLATTEN INPUT SPLIT and TRIM functions to separate the category values and avoid anomalies.16. Applying 1NF
For assigning unique identifiers to category_id, we use ROW NUMBER. This function creates a sequential number to each trimmed value. This step is crucial for establishing a unique primary key.17. Applying 1NF
Finally, we use GROUP BY on the trimmed values to ensure uniqueness. We've successfully set up our new 'categories' entity with this. To align the entire model to the first normal form, we need to link these categories back to the 'allproducts', creating a relationship between them.18. Towards data normalization
The categories entity now reflects one step, the first normal form, of the multi-step data normalization process. We broke down the category attribute from multiple values locked in one attribute into individual values.19. Terminology and functions overview
Let's summarize the new functions we have learned in this cheatsheet.20. Functions overview
Take note of the implementations of the functions as well.21. Let's practice!
Ready to practice the first normal form? Let's go!Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.