1. Power Query Editor
Let’s load another CSV file called DimCustomers, which contains information about AdventureWorks customers who have made purchases.
In the past, we’ve gone straight to load. But in the preview, we can see something weird happening in the first rows. Let’s transform the data instead.
This is Power Query Editor, which allows us to transform the data before loading.
To remove this top row holding unwanted information, we can use the Remove Rows action above. It has several options, but we’ll use Remove Top Rows. We enter one because we only want to remove the first row from our dataset.
Note that the section here, Applied Steps, tracks our changes. I can go to an earlier version to see the deleted row. We can see that Power BI has already automatically applied these two steps when loading the data.
Looking at our column names, we still need to do some work. Incorrect header names are a common problem, and Power BI has an action called Use First Row as Headers, which is available to help solve this. Again, we see this change in the Applied Steps panel.
Looking through the data, there are a couple of columns we could delete, like Valid From and Valid To. We can select them and press Remove Columns.
Most importantly, we need to Close and Apply to save our changes. Now our changes are being applied.
In the Model view, we can see that Power BI automatically recognized a relationship with our new table, so there’s nothing to do here.
Back in the report view, we can pull up the Power Query Editor again by selecting Edit Query from the menu that appears when clicking on ellipsis next to a table. We can edit our changes; for example, we can undo the removed columns if we change our minds!
Again, we close and apply.
Let’s add a new page at the bottom.
Here, let’s quickly visualize the color of the products sold and their unit price using a clustered bar chart. Is it possible to tell if there is a relationship between color and price?
We can’t tell right now because our graph shows the sum of all the unit prices, which would take into account the number of each item sold. This is because Power BI automatically sums numerical fields in a visualization.
If we go to unit price, we can change the aggregation from sum to average. In general, there are many other possibilities. We could take the minimum, maximum, count, or any other aggregation function listed here.
On average, red products cost the most, but we would have to do some more digging to be sure! It's time for you to try out Power Query Editor.
2. Let's practice!