Get startedGet started for free

Dimensional Modeling

1. Dimensional modeling

Welcome back to our data modeling journey in Snowflake! It is the turn of dimensional modeling, focusing on dimensions and facts to transform our data warehouse insights. Let's begin.

2. Introduction to the dimensional data model

Dimensional modeling is a technique that structures data for easy analysis and reporting. It contains two major components: dimensions and facts.

3. Introduction to the dimensional data model (1)

Imagine organizing a university database. Dimensions are categories, like students and schools. The fact table details the interactions between those categories. In our case, 'facts' are the records of enrollments, as in students signing up for classes. This table could include data like enrollment dates and class names, showing how students engage with the university. Dimensions frame the context, and facts fill it with details. Unlike the precise organization of normalized data, dimensional modeling can work well with broader, less detailed information, making analysis easier. Dimensional modeling helps us see the big ideas and main actions in our data, making it easier to understand the main trends.

4. Star dimensional model schema

Dimensional modeling uses two schemas: star and snowflake. A star schema has a central fact table connected directly to dimensions, simplifying the structure. Our university model will feature a star schema with straightforward dimensions for students, classes, and schools centered around our enrollments fact table.

5. Star dimensional model schema

Let's clarify two similar concepts: the Snowflake data warehouse and the snowflake schema. Snowflake Data Warehouse is the cloud-based storage and analytical service we have used to showcase and query the data during the videos and exercises. The snowflake schema organizes data in the dimensional model, which evolves from the star schema, including sub-dimensions.

6. Snowflake dimensional model schema

A snowflake schema expands on this by breaking down dimensions into more detailed sub-dimensions. Suppose our university model expands into a snowflake schema, for example. In that case, we might add more detail to the 'class details' sub-dimension, such as 'exam schedule,' detailing each class's exam dates and times. For the 'class instructor' sub-dimension, we could include 'instructor office hours,' providing students with times for additional support.

7. Defining dimensions

To organize our model, we're labeling tables with 'dim' for dimensions and 'fact' for facts, which is handy for learning. It's optional in real-life business scenarios but helps us understanding the purpose of each table. We use ALTER TABLE and RENAME TO for renaming. 'students' becomes 'dim_students,' and the same applies to 'classes' and 'schools'.

8. Defining date dimension

We'll also introduce a dimension called 'dim_date' to keep track of critical times, such as the start of a semester or a school year. Analyzing this data allows us to identify trends and patterns, such as the semester that enrolls the highest number of students. This insight is valuable for the school's planning purposes.

9. Defining enrollments fact

And finally, the fact that the enrollments entity contains the foreign key references all the model dimensions; dim_students, dim_classes and dim_date.

10. Retrieving data from the dimensions

We begin queries with the fact table in dimensional modeling because it holds the core data and related dimensions. In our case, the fact table is enrollments. The fact table is efficient for queries as it centralizes key metrics and connects to dimensions via foreign keys.

11. Retrieving data from the dimensions (1)

Let's find the students enrolled in Science classes for 2023. To do this, we should start with fact enrollments; we directly access the enrollment records and then join dimensions using their foreign keys relationship. This method streamlines the retrieval process, making it faster to get meaningful insights for analysis.

12. Terminology and functions overview

Here's a recap of the key concepts we've used in our dimensional model.

13. Functions overview

Take note of the implementations of the scripts as well.

14. Let's practice!

It is time to practice! Let's transform our entities into a star schema dimensional model.