Tableau: summary cards and spread
1. Tableau: summary cards and spread
Tableau has a nice feature to show the statistics you've seen so far, called the summary card. Let's look at the Profit data from the Superstore dataset, keep only the furniture category, and split by sub-category. Let's add a filter for sub-category as well, and make it a drop-down. Now disaggregate the measures to show the distribution of all observations, and make the plot fit the width of the screen.

Clicking on the worksheet menu reveals some options to show on this particular worksheet. Show Summary will add a summary card on the right. You can drag this underneath the other cards if you want to preserve more screen real estate. By default, the summary card gives you the count, sum, average, minimum, maximum, and median values for the data on the worksheet. It is dynamic, meaning that if you add or remove filters, for example, the data on the summary card will update accordingly. The summary card always summarizes everything on the canvas combined, so if your data is split up by sub-category, for example, the summary card won't create a separate summary for each sub-category.

Let's focus on the chairs for a moment. Clicking on the dropdown menu from the summary card, you can add more summary statistics: standard deviation (which will be the sample standard deviation by default), first and third quartile, and finally quantitative measures of skewness and excess kurtosis. The values of skewness and kurtosis help you gauge how close the distribution is to normal; a normal distribution would have a skewness and excess kurtosis of zero. In this case, both the skewness and kurtosis are positive. Positive skewness means that the distribution of the data is right skewed. Positive excess kurtosis means the data is leptokurtic: it has heavier tails than a normal distribution, so extreme values are more likely. Combining both skewness and kurtosis tells you that in this case, the chairs profit data has many extreme values on the higher side of the profits.
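To make skewness and excess kurtosis a bit more concrete, here is a minimal Python sketch of the simple moment-based formulas. This is an illustration only: the function names and the profit values are made up, and Tableau's summary card may use a differently adjusted estimator, so don't expect the numbers to match Tableau exactly.

```python
from statistics import fmean


def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mean = fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: zero for a perfectly symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about zero."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical profit values (not taken from Superstore): mostly small
# profits plus one very large one on the high side.
profits = [-20.0, 5.0, 8.0, 10.0, 12.0, 15.0, 18.0, 25.0, 40.0, 300.0]

# A small symmetric sample for comparison.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]

print("profits:  ", skewness(profits), excess_kurtosis(profits))
print("symmetric:", skewness(symmetric), excess_kurtosis(symmetric))
```

The single large profit pulls both measures above zero, which is the same right-skewed, heavy-tailed pattern described for the chairs. The symmetric sample has zero skewness and a negative excess kurtosis.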
By contrast, negative skewness would mean that the distribution is left skewed, and negative excess kurtosis would mean lighter tails than a normal distribution, so fewer extreme values. We can quickly add a box plot to check the pattern we saw for the chairs. The box plot statistics in the tooltip match those of the summary card, and you see the same right-skewed and leptokurtic patterns.

What about variance? Variance isn't displayed on the summary card, but it can be calculated as an aggregate. When you open a new worksheet, measures will be aggregated again, but that is what we need for calculating the variance. Let's use the same category filter for profit, and split by sub-category. Convert it to a table, and show the variance instead of the sum. Furnishings has the lowest variance. When you compare that to the box plots, you can see that the furnishings profits indeed have the lowest spread around the mean. You can aggregate using the sample variance or the population variance. Since we have almost ten thousand observations, the difference between the sample and population calculations will be negligible. You'll dive deeper into this during the exercises.

2. Let's practice!
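If you want to see why the sample versus population choice barely matters for large data, here is a small Python sketch using only the standard library. The helper name `variance_gap` and the data are hypothetical; the point is just that the sample variance divides by n - 1 and the population variance divides by n, so their relative difference works out to 1/n.

```python
from statistics import pvariance, variance


def variance_gap(values):
    """Relative difference between sample and population variance.

    Sample variance divides the squared deviations by n - 1, population
    variance divides by n, so this ratio equals 1 / n.
    """
    sample = variance(values)       # divides by n - 1
    population = pvariance(values)  # divides by n
    return (sample - population) / sample


# Made-up data: a tiny sample and a large one of ten thousand values.
small = [1.0, 2.0, 3.0, 4.0]
large = [float(i % 17) for i in range(10_000)]

print("small:", variance(small), pvariance(small), variance_gap(small))
print("large:", variance(large), pvariance(large), variance_gap(large))
```

With four values the two variances differ by a quarter of the sample variance. With ten thousand values the gap shrinks to about one hundredth of a percent, which is why it can be ignored here.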