Combining queries with Power Query
1. Combining queries with Power Query
In this video, we will show you how to combine queries with Power Query. Let’s start by opening an Excel Workbook and launching Power Query. We’re going to start by loading three CSV files. You will recall that you can load a new data source using the ‘New Source’ option under ‘New Query’ in the main Home ribbon. First, we are going to select dailycensus_7221_2020.csv and import it. This dataset includes daily statistics for a hospital facility with ID 7221 for 2020. Now, let’s load two more CSV files for hospital facilities in the years 2021 and 2022. This time, we will use the right-click shortcut under the Queries to import each. Let’s take a look at our three loaded files. We can see that each has identical column names, types, and even the order of the columns matches. These conditions make the queries ready to be stacked in the Append operation to create a 3-year span of data to be analyzed. In Power Query, the ‘Combine’ options in the Home tab of the ribbon show we can either ‘Append Queries’, which is the default, or ‘Append Queries as New’. Let’s use the default option. We can see that this opens a pop-up window that identifies the current selected query and a second query to be appended to it. Since we wish to combine three queries, select “Three or more tables” and add them to the window. Using the default append option effectively stacks the three queries together under the original query we had selected, in this case, the 2022 file. If we undo the applied step and try the other option to Append Queries as New, notice how this derives a fourth query, a combination of the selected three. Let’s rename this “dailycensus”; now we have data for all three years. Now, let’s load another dataset, climate_daily.csv, which logs daily weather conditions from the hospital's closest weather station. We can see that this dataset has a column "local date", which is in the same format as the "effective date" column from the appended query. Let’s say we wanted to merge columns from the climate daily query to the daily census query. As we saw with Append, Merge Queries defaults to merging on the selected query, where ‘Merge Queries as New’ would produce the same operation but onto a new query. Let’s select dailycensus and the default option. Power Query shows the selected query on top and allows for one query to be merged with it. To set up the merge, we must select columns to match. In this case, the “effective date” will be matched to the “local date”. Next, we select the type of Join. For our purposes, a left outer join will allow us to ensure that all daily census results from our first table will be included, so we will select this option. We can see the breakdown of options for fuzzy matching next. However, since Power Query has identified in this preview that all rows have matches, it assures us that there is good correspondence between the two queries and fuzzy matching will not be needed in this instance. Let’s click OK. We can see this new condensed column has appeared in our dailycensus query. We can preview the matched contents by clicking on the white spaces of the cells here. To expand and display the merged columns, we click expand at the top right expander button of the generated column. It gives a drop-down of all available columns. Let’s select a few and click OK. Notice how Power Query has added a prefix to these column names based on the name of their original query. If you don’t wish to include those expanded names, remove the text from the Expanded applied step by clicking the gear here. We have successfully merged four columns to our original query. Now it’s your turn!2. Let's practice!
Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.