Get startedGet started for free

Shaping and Grouping Data in Excel Power Query

1. Shaping and Grouping Data in Excel Power Query

Let’s get started with some data transformations. We will use queries already loaded and will take advantage of features allowing us to reference or duplicate them. Right-clicking on a query name reveals several options, including “Duplicate” and “Reference”. In brief, “Duplicate” creates an entirely separate copy of the original query. Let’s call this one dailycensus duplicate. We can see that it has copied all of our applied steps exactly as they existed in the original. Now, say we needed to transpose this table to meet the requested report specifications, where we must swap rows for columns so each date renders a unique column. Under Transform options, selecting Transpose lets us do just that. In this step, all columns have become rows, and each date is represented in a new column. We can clean this up by promoting the first row to headers. Next, we have been tasked with preparing a specific summarization from the drug_administrations dataset. Let’s use “Duplicate” again and document it accordingly. We want to derive the average administered dose per medication name by patient ID. Since this requires both a column-row swap and an aggregation, we will want to use the Pivot transformation. Let’s start by isolating our three columns of interest by holding CTRL (or Command in Mac OS) and selecting each of them. We will then right-click to reveal the ‘remove other columns’ option. To achieve our objective, we want to pivot medication_name so each medication will be represented in its own column. Selecting this column and then Pivot under the transform tab gives us a pop-up window of column Pivot options. Let’s set values to dose_administered. Since we want to calculate the average dose, we need to specify it here as well; otherwise, it will default to SUM. Then we just click “OK”. Our output is a summary table for all unique patients in the dataset with their average administered dose by medication name. We will see nulls wherever there was a medication that any particular patient did not receive. Now, let’s create another query from drug_administrations, this time using “reference”. Notice how instead of all the applied steps showing here, it’s just showing Source. This is because when we reference a query, it essentially synchronizes to all the steps in the original query that it is referencing rather than duplicating every applied step. Just need to be careful here because if you do change any steps in the original query, it will also change them in any query that is referencing it! Okay, let’s see what we can do with the Group By operation. Once more, we will assess the average administered dose by medication name for each patient ID. The difference is that we want to keep medication names in rows instead of swapping to columns, as we did with Pivot. Start by CTRL selecting patient_id and medication_name since these are the unique rows we want to keep and since each patient can receive multiple medications and for multiple days each. Under the transform tab, clicking “Group By” reveals the option selection menu, where we specify our aggregation of interest. In this case, we want to calculate the average administered dose, which requires the Average operation and is based on the dose_administered column. Click OK, and let’s see how our Group-By transformed data appears. Let’s do a quick sort by patient_id to have a better look. Now, patient_id and medication_name remain in their own columns, and we have a nice aggregation of dose_administered, showing the calculated average for each. Before we wrap up, let’s open up the Query Dependencies viewer to visually see how our queries reference each other (or not) based on whether we reference or duplicate an existing query. Notice how a query that was duplicated exists independently from the original query, whereas referenced queries are derived and, therefore, dependent on the query that they originated from. Okay, now it’s your turn!

2. Let's practice!