Get startedGet started for free

Appending and merging queries

1. Appending and merging queries

Hello, my name is Lyndsay Girard, and I will be your instructor for this intermediate Power Query in Excel course. We'll dive deeper into Power Query to help take your skills to the next level.

2. Combining datasets in Power Query

In this video, we will cover how to combine queries, and the need to do so depends largely on the type of datasets that you have. For instance, you might need to consolidate data from multiple sources into one master table. By combining data in Power Query, we can avoid the manual manipulation you may typically do in Excel Workbooks, such as using a VLOOKUP to match values between tables. Ultimately, combining queries enriches datasets by adding information from other related tables. In Power Query, we rely on the Append and Merge operations to combine queries.

3. Principles of append vs. merge

The append operation involves stacking datasets of a shared column structure vertically, like creating a stack of building blocks. By comparison, the merge operation combines datasets horizontally based on one or more common columns. These purple rectangles represent datasets of similar structure, ready to be stacked. These blue rectangles represent datasets with a common (yellow) section, ready to be combined horizontally.

4. Principles of append vs. merge

We can see here how the Append operation stacks these datasets into a unified dataset of the same structure, just extended vertically. The Merge operation adds information horizontally based on matching common contents in both.

5. Append in practice

To carry out the append operation, queries must have the same column names and data types. Let's see an example with two tables of 3 rows each. Both have the same columns, though the green table has one additional column not seen in the blue table.

6. Append in practice

Appending these two reveals a new query comprising six rows, the combined total of both. The blue table shows "null" in the column where no data exists.

7. Diving deeper into the merge operation

Merging data is relative to the current selected query. Depending on the type of join used, unmatched records may be included or excluded in the output.

8. Join types

Multiple types of joins can be utilized when merging. In this example, the main dataset is A, and the one being merged is B. Let's look at a Left Outer join, one of the most common types of joins. The Left Outer join returns all records from dataset A and only those that match from dataset B. The remaining join types follow the same principle, specifying what records will be returned from either or both datasets.

9. Merge in practice: left outer join

To successfully merge two queries, you must ensure the shared columns have the same data type. It is also useful to preview results to ensure the merging logic is accurate. Let's assess a left outer join between these two queries based on patient ID. The green table is the main (or left) table; this table will return all rows, along with any matching from the second purple table.

10. Merge in practice: left outer join

Now you can see that we have three new columns added from the purple table, which match the same PatientID from the green table.

11. Merge in practice: full outer join

Now, let's see how a full outer join would look between these two tables. As we can see, now all rows from both tables are returned, whether or not any data is associated with the main columns matched on.

12. Special join conditions

It is typical to expect that columns being merged have identical structures to be successfully joined. This is known as an exact match. However, it is also possible to perform a fuzzy match, which allows for flexible and approximate matching based on similarity.

13. Special join conditions

For example, fuzzy matching allows users to ignore cases and match text on parts of a text string. This method allows the bolded entries of the second dataset to be joined to the first without manual effort.

14. Dataset

In this course, we will utilize real-world hospital data from the Yale New Haven Health System emergency department's electronic medical records. Captured data includes patient demographics, registration information, triage assessments, medications, and other related data.

15. Let's practice!

Let's practice!