Combining data in Power Query

1. Combining data in Power Query

Often times your data will be stored in several different tables. In order to analyze all of your data, you will need to combine them into one dataset. In this lesson, we will learn how to use Power Query to stitch together different queries in order to accomplish this.

2. Why combine data?

Have you ever encountered a situation where the data you wish to analyze is stored in 5 or 6 Excel sheets? This is quite a common practice in many companies as it eases the operational burden. It can also be a forced situation when there is simply too much data to be stored in one file. Luckily, Power Query has a handy feature that can combine your data as long as the structure of each file you are combining is similar. This feature is called "Append query". In other times, you will be working with data that has several tables, but there are one or more common columns in those tables. This is most often encountered when connecting to a data model with a star or snowflake schema. In some cases you will need to combine those tables together to make a "flatter" file structure. This is accomplished by using the "Merge queries" feature in Power Query.

3. Appending data

When you need to add rows to your data, you will use the append feature in power query. This can be thought of as vertically stacking the tables you wish to combine on top of each other. Using the append feature can combine all the files into one large master dataset containing sales data from all quarters. An important thing to note is that in order to combine the data successfully using append, you must make sure that the column names in the tables you are appending are exactly the same and that all tables have the same number of columns. Otherwise, the combined table will have null values, like the second illustration on this slide.

4. Merging data

The concept of merging queries will be quite familiar to you if you have used SQL joins or Vlookups before. When you perform a merge, you are choosing one or more columns from one table to compare to the columns in another table. You can link or join tables together through unique identifiers called keys.

5. Merging data: types of joins

There are different ways you can join tables. There's an inner join where only matched rows are returned. A left join returns all the matched rows and all the other rows in the left table that don't have a match. The right join is similar but with the matched rows and all the unmatched rows from the right table. And finally, there's a full outer join that returns all the matched rows and all unmatched rows in both tables.

6. Merging data: example of a left join

In the example of a left join you can see how we can connect a sales table to a product table by joining on the Product_id column. The value of $500 returns null values in the final table, because there is no information on Product_id 789 in the product table. The left join returns all rows of the left table or sales table, including the unmatched rows.

7. Merging data: example of an inner join

An inner join on the other hand only returns values of columns when there is a match in both tables. When we inner join the sales table to the product table, the final result is only two rows, because there is no match for Product_id 789.

8. Let's practice!

Let's learn how we can combine queries to analyze data from several tables on the Northwind Traders database.