Numerical transformations
1. Numerical transformations
Now let’s explore some of the numerical transformations in Power Query. We will be working with stock price data of the S&P 500 Index. The S&P 500 is a collection of the 500 largest public companies in the United States. All the columns in this dataset are considered numerical columns, including the date column on the far left. This dataset contains the adjusted closing price for every day from the 22nd of February 2021 to the 18th of February 2022. We can verify this with the column profile feature: You can see the minimum and maximum correspond to the dates I mentioned previously. There are only 253 rows in our data, but this is to be expected since the stock market only trades on weekdays and non-holidays. The volume column contains the daily traded volume in units of stock. However, it can be a little hard to read. We can make it easier by dividing the column by one million and then renaming the column to specify that the units are millions of units of stock. We accomplish this by going to the standard transformations in the transform ribbon and clicking on “divide”. The column is much more readable now thanks to this transformation. The next column we can work to transform is the “Deviation to mean” column. This column contains the difference between each days’ closing price and the average closing price for the whole year. We can use it to calculate the variance of the S&P 500, first by squaring the column and then dividing by the number of samples minus 1. To square the column, we go to the “Scientific” transformations, then we select Power, then square. In order to divide by the number of samples minus 1, we select the divide transformation and input 252. Now we are almost at the sample variance. All that’s left is to sum the column, which we can do by using the “Sum” transformation under the “Statistics” icon. The transformations under the Statistics Icon will summarize your entire dataset into just one number, which you won’t be able to build measures with if you load into Power BI. We use them mainly for intermediate analysis in Power Query to understand our dataset more. If we take the square root of this number, we’ll get the standard deviation of the S&P 500 for the last year, which is a good measure for the average volatility of the US Stock market: We can also confirm that this result is equivalent to the statistic in the column profile feature: Let’s remove the last two steps here: Calculated Sum and Calculated Square root so we can work with the whole dataset again. When it comes to date columns, there are special transformations that can be applied. Let’s say we want to see which days of the week are trading days in the US stock market. We can use the “Name of Day” transformation located under the “Date” icon in the Transform ribbon: Note that this transformation will replace your column with text data, so it is generally best practice to duplicate your date column first then apply the transformation on the duplicated column to preserve the original dates. We can also do the same for the month names: This shows us that the month with the most trading days is March: While both Wednesday and Tuesday are tied for most trading days: There are a variety of other data transformations you can apply to derive some more insights from your time-series data. You will now apply these to clean and transform Adventure Works’ sales data.2. Let's practice!
Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.