Tableau: adding lines and distribution bands
1. Tableau: adding lines and distribution bands
Tableau makes it straightforward to add measures of spread or confidence intervals to your visualizations of a distribution. Let's take, for example, the sum of profits, aggregated per month, and plot it against a continuous axis. Each point is now the total profit of all orders in that month. Now, we switch to the 'Analytics' pane. We haven't touched it yet, but it holds some powerful tools that we will discuss in the remainder of this course. It works on the principle of so-called drag-and-drop analytics, which you'll be using much more often from now on.
There are three main things you can add when you're plotting the distribution of a continuous variable: a reference line, a reference band, and a distribution band.
A reference line allows you to add a visual reference for a summary statistic, such as the average. You can choose to show the name of the summary statistic on the line, show the actual value, or write a custom label. When selecting the average or median, you also have the option to show the confidence interval with a confidence level of your choice. These two options on the left are essentially shortcuts to a reference line with a 95 percent confidence interval. In this case, it doesn't make sense to calculate a confidence interval, since we treat our data as the whole population, so we already know the true average and median. For demonstration purposes, however, we can simulate sampling by randomly selecting individual data points. Notice how the confidence interval adapts dynamically and becomes narrower as more observations are selected. To remove a reference line, right-click on a part of the line or confidence interval, and click 'Remove'. Alternatively, you can right-click on the axis and select 'Remove Reference Line'.
A reference band is essentially a colored area between two summary statistics. You can, for example, color the area between the highest and lowest observation, or between the average and the median.
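To see outside of Tableau why the confidence interval around an average reference line tends to get narrower as more points are selected, here is a minimal Python sketch. It is not how Tableau computes its interval: it assumes a normal approximation, and the helper `mean_confidence_interval` and the simulated `monthly_profit` values are hypothetical stand-ins for the course data.

```python
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Return (lower, upper) bounds for the mean of `values`.

    Assumption: a normal approximation is used here; Tableau's own
    calculation may differ.
    """
    n = len(values)
    # Two-sided critical value, e.g. about 1.96 for 95 percent
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # The margin shrinks with the square root of the number of observations
    margin = z * stdev(values) / n ** 0.5
    centre = mean(values)
    return centre - margin, centre + margin


random.seed(42)
# Hypothetical monthly profit totals, not taken from the course data set
monthly_profit = [random.gauss(5000, 1500) for _ in range(48)]

# "Select" more and more points, as in the demonstration
for n in (6, 12, 24, 48):
    selected = monthly_profit[:n]
    low, high = mean_confidence_interval(selected)
    print(f"n={n:2d}  mean={mean(selected):8.1f}  interval width={high - low:8.1f}")
```

Because the margin is divided by the square root of the number of selected points, the interval generally narrows as the selection grows, although the exact widths depend on the spread of the points that happen to be selected.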
Lastly, there is the option to add a distribution band. You use a distribution band when you want to divide your data by percentages, percentiles, or quantiles, or when you want to show the standard deviation. In the latter case, you can further define the number of standard deviations you want to show, and whether your data should be treated as a sample or as the population. There is also the option to create box plots directly from the Analytics pane, as an alternative to using the 'Show Me' button, but with more options to customize.
You may have noticed that a window pops up when I drag the line or band of interest onto the graph. It lets you choose between 'Table', 'Pane', and 'Cell'. The difference is only noticeable when you add more segmentation to your viz. If I split up the data by order year and product segment, I now have the option to add lines and bands at different levels. Let's add, for example, a default average reference line. 'Table' means that you're adding an average line that spans all observations in your table. The average is consequently calculated over all values that are displayed. 'Pane' refers to the first level of segmentation, in this case 'Year', and the average is now calculated for each year. If we were to switch 'Segment' and 'Year', the average would be calculated per 'Segment' instead of per year. Lastly, there is the option to calculate the average at the most segmented level, referred to as 'Cell'. This is the most detailed level at which you can add lines and bands.
OK, lots of cool new features to play with. You will get the chance in the exercises!
2. Let's practice!
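As a recap of the 'Table', 'Pane', and 'Cell' scopes, the following Python sketch computes an average at each level for a small, made-up set of marks. It mirrors the description in this lesson (pane means per year, cell means per year and segment) rather than Tableau's internals; the `marks` data and the `average_by` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical marks: (order year, product segment, monthly profit)
marks = [
    (2021, "Consumer", 100), (2021, "Consumer", 140),
    (2021, "Corporate", 80), (2021, "Corporate", 120),
    (2022, "Consumer", 200), (2022, "Consumer", 240),
    (2022, "Corporate", 160), (2022, "Corporate", 180),
]


def average_by(marks, key):
    """Average profit per group, where `key` maps (year, segment) to a group."""
    groups = defaultdict(list)
    for year, segment, profit in marks:
        groups[key(year, segment)].append(profit)
    return {group: mean(profits) for group, profits in groups.items()}


# 'Table': one average over every displayed value
table_avg = mean(profit for _, _, profit in marks)

# 'Pane': one average per first-level segment (here, the year)
pane_avg = average_by(marks, lambda year, segment: year)

# 'Cell': one average per most detailed combination (year and segment)
cell_avg = average_by(marks, lambda year, segment: (year, segment))

print("Table:", table_avg)
print("Pane: ", pane_avg)
print("Cell: ", cell_avg)
```

Swapping the grouping key for the pane level from the year to the segment corresponds to switching 'Segment' and 'Year' in the view.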