1. Identifying performance problems
Welcome back! In this last lesson, we'll discuss ways to identify and solve performance issues.
2. Resolving performance problems
Power BI has the ability to manage huge databases, and is usually pretty fast. But sometimes your end user could be waiting minutes for the visuals to load. It's important to know the good practices that help lighten the work carried out by the machine to make your reports as efficient as possible. The first step in handling user-reported slowness issues is identifying where the slowness presents.
3. Performance problems
There are a few places where things can go wrong. You might see slowness when importing data or querying a database via DirectQuery. Rendering the visuals could take longer than it should. Performance can also be impacted by the use of calculated versus computed columns. Lastly, inefficient relationships can cause slow performance.
There are other performance issues, but we won't focus on those in this course.
4. Optimizing data import
There are multiple things you can do to optimize data import.
The biggest one is to remove unnecessary rows and columns. If you have data which you know will not be used, it's best to delete it.
Choosing the correct data type is also important. Storing numeric data as numbers takes up less space than storing it as strings. Additionally, having to cast the data back to numeric later on can slow down performance.
Grouping and summarizing data can also improve performance when importing data because it means less data stored on disk and less work to get to aggregate results.
5. Optimizing Direct Query
Throughout this course, we have used the Import model to bring in data from CSV or Excel files. When working with databases, however, there is a second model: DirectQuery. Import stores all of the data in Power BI, whereas DirectQuery allows the Power BI to query the remote data source when needed.
When working with DirectQuery, the most important thing to keep in mind is that you want the database to be as fast as possible. One piece of advice is to limit the number of queries running in parallel.
One of the most common data sources for Power BI is a relational database management system. When using DirectQuery there, be sure to write efficient SQL queries and use appropriate indexes. The goal is to obtain from the database exactly the data necessary to solve a problem and no more.
6. Calculated versus computed columns
Calculated columns are written in DAX, whereas computed columns come from Power Query, written in the M language. Both allow you to build custom columns from data, but they can differ in terms of performance.
In general, computed columns are only slightly faster than simple calculated columns. However, computed columns can be considerably faster than complicated DAX queries when working with large datasets.
This is because computed columns are generated at import time, meaning that they get built a single time. By contrast, calculated columns are generated per visual at runtime.
7. Removing bi-directional filtering using filter measures
Earlier, we looked at a use cases of bi-directional filtering; finding relevant slicer entries between dimensions.
Power BI offers an alternative since this type of relationship can be inefficient. We can create something called filter measures to avoid bi-directional relationships.
8. Removing bi-directional filtering using filter measures
Let's look at an example. We have two dimensions feeding into the summary statistics fact: Year and Geography. We want to display only the year and geography values which have appropriate values in the fact table. The filter measure, called Slicer_MyFactTable, will do that. Let's see how this works.
9. Removing bi-directional filtering using filter measures
We want to return the number 1 if conditions are met such that we have data in the fact table for a given value in a slicer.
To do this, we first create a simple DAX measure, which uses the ISEMPTY() function to determine if there are any values in the fact table. We want to get only cases in which the fact table is not empty for a given filter and then convert that to an integer using the INT() function. This will return 1 when we have at least one value and 0 when there are no values.
From there, we can add a visual filter to any dimensional slicer and show items only when the filter measure returns 1.
10. Displaying visuals
There are a few more tips which can help optimize performance for displaying visuals. First, use restrictive filters to minimize the number of data points which appear on visuals. Furthermore, limit the number of visuals which you see on a single page, as each of these visuals needs to load each time. Also, avoid slow custom visuals. Custom visual authors do the best they can to release high-quality code, but these visuals might not always be speedy.
11. Let's practice!
That's it! Make sure to keep these tips in mind if you're ever faced with slow reports. Time for practice!