Star and snowflake schemas

1. Star and snowflake schemas

Welcome back! Let's continue building on your knowledge from earlier. In this video, we'll take a look at snowflake schemas, an extension of the star schema.

2. Star schema

Remember from earlier that a star schema consists of one or more fact tables surrounded by dimension tables.

3. Snowflake schema

The snowflake schema is like a star schema, except it allows relationships between dimensions. In the example you can see that the Lender and Property dimensions each link to other dimension tables. Note that fact tables remain the same.

4. A closer look

The biggest difference between the two styles is how they handle hierarchical data. Star dimensions tend to have all levels of a hierarchy in the same table, whereas with snowflake dimensions, hierarchy levels are explicitly broken out into multiple tables. Here you can see a Product dimension table. Imagine that each product has a name. Each product also belongs to a product subcategory, which itself belongs to a product category. In a star schema, all levels of the hierarchy, as well as their attributes, show up on the same product dimension. With a snowflake schema, each level of the hierarchy becomes its own table and we join those tables, usually with keys.

5. Comparison

When it comes to dimensional modeling theory, we strongly prefer star schemas over snowflake schemas. The key reason is that star schemas are easier for business users to understand. They don't want to worry about keys or hierarchies. The other benefit is that quite a few business intelligence tools have been optimized for the star schemas. Although dimensional modeling theory leans heavily toward star schemas, we do see snowflake schemas in some data warehouses. The reason for this is that star schemas duplicate quite a bit of data, which leads to storage costs and can impact performance. Also, star schemas are not ideal for frequently-updated data, especially with large dimensions. Suppose you have millions of rows in a dimension containing a column for country. When a country name changes, you may have to update a large number of rows with the new country name. By contrast, with a snowflake schema this would be an update of a single row.

6. Stars and snowflakes in Power BI

When it comes to Power BI in particular, it is important to note that both schemas work. If you have a snowflake schema in an existing warehouse, you could import the data as-is, making migrations easier. That said, Power BI does prefer star schemas for the same reason as dimensional modeling theory. Because it is easier for users to understand. Furthermore, there are some optimizations in Power BI which make performance much less of a concern.

7. The performance analyzer

Power BI has a built-in performance analyzer. When enabled, it keeps track of at least three key measures on each visual. The first is how long it took to read the data from its internal database and then perform any DAX operations on the data. The second component is figuring out how long it took the visual to render. Finally, a third measure captures everything else, which is typically waiting time on other operations, including waiting for cross-filtering operations to complete.

8. Performance tuning advice

There are a number of ways you can improve performance. If the DAX query takes a long time to complete, you could tune your DAX operations or improve your data loading performance. This might include improving your data model! If the big problem is in visual display, use less complicated visuals and show less information on the screen. Power BI needs to render each data point, so plotting tens of thousands of points may take a while. If the Other value is the cause of your slowness, you might want to reduce the number of visuals on the page.

9. Let's practice!

Time to practice!

This exercise is part of the course

Data Modeling in Power BI

BeginnerSkill Level

4.7+

Start Course for Free

Proper data analysis relies on proper data modeling. This first chapter covers the basic concepts and teaches you how to set up and load data from multiple sources.

Exercise 1: Data modeling and table properties Exercise 2: Power Query vs. Power BI Exercise 3: Load and transform data Exercise 4: Loading a CSV Exercise 5: Rounding, replacement, and sorting Exercise 6: Data categorization and visibility Exercise 7: Working with string columns

In this chapter, we continue comparing Power Query and Power BI practices. Later, you will exercise combining tables and column extraction on Power BI.

Exercise 1: Shaping tables Exercise 2: Power Query vs. Power BI again Exercise 3: Merging and appending queries Exercise 4: Breaking a file into multiple tables Exercise 5: Appending files Exercise 6: Column extraction

In this chapter, you'll learn more about one of the most popular approaches to data modeling. You'll get familiar with the basic building blocks of the dimensional model; facts, dimensions, and star schemas.

Exercise 1: Dimensional modeling Exercise 2: Facts vs. dimensions Exercise 3: Creating a star schema Exercise 4: Splitting data into facts and dimensions Exercise 5: Load a new dimension Exercise 6: Create another dimension

In this final chapter, you'll continue with an extension of the star schema: the snowflake schema. Lastly, you will have your first practice with performance analyzer in Power BI.

Exercise 1: Star and snowflake schemas

Current Exercise

Exercise 2: Snowflake vs. star schema Exercise 3: Evaluating performance Exercise 4: Star schemas Exercise 5: Snowflake schemas Exercise 6: The performance analyzer Exercise 7: Congratulations!