Load and transform data
1. Load and transform data
Welcome to the first demo. This course will focus on how to model your data in Power BI. Throughout the demos, you'll explore a toolbox of various techniques from which you can choose the right tools to model your data. Let's start with loading the NAICS manufacturing dataset. Select Get data and then choose Text/CSV. This csv contains several metrics of manufacturers per state per year. Selecting Transform Data brings you into Power Query. Here, you can see that there are two distinct rows on the top: one is a header with short names, one is a header that has descriptive labels. I want to keep this second row for my headers. To remove the first row, select Remove Rows and then Remove Top Rows. You are asked how many rows you want to remove, so enter "1" in this case. Now, you can take this new first row and use it as the headers. To do that, simply choose Use First Row as Headers. Note that this changed data types automatically: industry group for example is now considered an integer. If you need to change that back to text you click the data types icon, and select Text from the list of data types. You're asked whether you want to replace the current data type conversion, or if you want to add a new conversion step. Replacing it is the default option, since it minimizes the amount of data type conversion steps. Note that there is no default undo action in Power Query. If you want to undo a certain data transformation step at any point, you can do so by clicking the x mark next to a step in the list on the right. Now suppose that your users are used to seeing the Meaning of NAICS in all lowercase. You can change this in two ways: by right clicking the column, selecting Transform, or by going to the Transform menu at the top, and select Format. Either way allows you to transform this column into lowercase. With text transforms, you can also for example add a prefix to the industry group. This can only be done in the Transform menu. Choose Format, and then Add Prefix. Doing this allows you to put in some hard coded value before each value in the industry group. In this example, you can see "Code: ", followed by the original number. You can also perform some arithmetics on your numerical values. Open the right-click menu or select Scientific in the Transform menu to see the different options. Since there are quite a few orders of magnitude between values, you could choose to transform your data using the natural logarithm, for example. Double click the header to change the name of the column accordingly. Note that we also could round the numbers here in Power Query, but this would actually remove the number of decimals during the data load, rather than hiding them. To apply all your changes and return to Power BI, click Home and then Close & Apply. This will load your data and perform all the steps we just defined. OK, let's look at some operations you could perform in Power BI. Go to the Data view, and select a column to reveal the Column tools menu. In here, you have different options that only alter the visual representation of the data, like changing the formatting, currency, rounding, or data category. Recall that Subsector was a numeric column, but it doesn't make sense to sum these categories. You can change the summarization here to Don't summarize, so Power BI will infer more appropriate visualizations. It's also convenient to let Power BI know which data category a certain column belongs to, especially in the case of geographic data. For example, under Data category, you can select State or Province here which will allow Power BI to draw maps correctly. You can also change the way a column is sorted by clicking the dropdown menu. You can even sort this column by another column and sort Geographic Area Name by the Geographic ID for example. This will help creating an appropriate visualization when creating reports, with ordering that makes sense to the end user. Lastly, let's hide a column that has no use for the end user, for example this Geo Footnote, which seems to be always null. Right click the column name and select Hide in report view. Your data is now properly cleaned and loaded, ready for you to start creating visualizations! Let's get to the exercises.2. Let's practice!
Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.