Get startedGet started for free

Data preview features in Power Query

1. Data preview features in Power Query

In this video, we will learn a useful feature of Power Query that can help you quickly summarize key characteristics of your dataset. We'll use this feature to help us identify and fix common errors that might appear in our data.

2. What is data preview?

Power Query, as we've mentioned previously, is a data preparation tool that helps us transform our data into a state that is more suitable for analysis. To do this, we should have a way of quickly investigating our dataset to allow us to diagnose what transformations are needed. This is where the data preview features of Power Query come in. To access the data preview features, you simply enable them in the view ribbon of Power Query. There are 3 types of data previews: Column Distribution, Column Quality, and Column Profile. We will learn the exact nature of these features as well as when to use them in the following slides. An important thing to note when working with large datasets is that, by default, data previews will only analyze the data in the first 1000 rows of your dataset. However, you can change that option by clicking on the "Column Profiling based on the top 1000" rows box present at the bottom-left of the Power Query editor.

3. Why use data preview?

The main reason to use data preview features is to help you find errors and inconsistencies in your dataset. As you apply your transformations step-by-step you can also see how the characteristics of your dataset change, so it is also a great way to see if the transformation you applied had the desired effect. Finally, they are a powerful tool to quickly summarize the data in your columns and help you take decisions about what transformations to apply next.

4. How to use data preview?

The various data preview features are all unique in their own way, so lets see what they are used for. Let's start with "Column Distribution". This allows us to preview the distribution of our columns by displaying a small histogram under each column name. This feature is particularly useful for when you are checking a column for duplicate values. It also shows us the count of unique and distinct values in this column, to see how many duplicates, if any, exist. The next feature is called "Column Quality". When you need to check if your data contains any missing values or error values, this is the feature to use. It will calculate the percentage for each of these and show them underneath all column names. Finally, we have the "Column Profile" feature, which combines aspects from both previous features. The key difference is that column profiling only works for one column at a time. This enables it to display much more detailed statistics, such as the average, minimum, maximum, and standard deviation. You can also find out the distribution and number of unique and distinct values for that column. This feature only works when you select a column.

5. Additional uses for data preview

Since data previews are most often used when you are looking for errors in your dataset, they also come with quick access to some of the most common transformations that you will use to deal with those errors. To view these transformations, simply hover over the output of the relevant data preview feature you are using and a small tooltip will appear. There you will find a suggested transformation marked with the "lightbulb" icon. You can also click on the 3 dots to expand a menu containing additional useful transformations.

6. Let's practice!

Now, let's put these features to good use by cleaning up our dataset.