EDA in Tableau: histograms
1. EDA in Tableau: histograms
Let's now switch to looking at the structure of a continuous variable by plotting a histogram. Just like a bar plot, a histogram uses bars whose heights are the counts of observations. For a continuous variable, that means you first convert it to categories, or bins, with a bin size of your choice. Creating bins in Tableau is straightforward: you right-click the continuous measure of interest, then Create, then Bins. Tableau suggests a bin size based on the number of observations and the spread of your data, but it is worth experimenting with other values. We can take a value of one here to begin, since it makes sense to look at the number of orders where one item was ordered, two items were ordered, and so on.
This newly created bin field can be placed on the Columns shelf. Then, you drag the same measure of interest, in this example 'Quantity', to the Rows shelf, changing the viz to a bar plot. Instead of the default SUM, you change the aggregation to Count, since you want to count the number of orders per quantity bin. It looks like people mostly ordered two or three items per order.
By default, bins are treated as discrete fields. This means that only the starting value of each bin is displayed. You might want to change that, especially when you are working with larger bin sizes. To do this, right-click the bin field on the Columns shelf, and select Continuous. Now you see more clearly which Quantity values are grouped in each bin. This is the main strength of histograms: you can split your continuous variable using any bin size you want.
There is also a way of creating histograms in Tableau with just three clicks: first, you select the numerical measure of interest, then you click the 'Show Me' button, then the histogram view. Behind the scenes, Tableau repeats the steps we did manually: creating bins, aggregating by count, and displaying the result as a bar plot.
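Outside Tableau, the same three steps (create bins, count per bin, display the counts) can be sketched in a few lines of Python. This is only an illustration, not what Tableau runs internally: the function name `bin_counts` and the order quantities are invented for the example, and the sketch assumes each bin starts at a multiple of the bin size and is labeled by its starting value.

```python
from collections import Counter


def bin_counts(values, bin_size):
    """Count observations per bin; each bin is labeled by its starting value."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    # Assumption: bins start at multiples of bin_size, so a value v
    # falls into the bin that begins at floor(v / bin_size) * bin_size.
    counts = Counter((v // bin_size) * bin_size for v in values)
    return dict(sorted(counts.items()))


# Hypothetical order quantities, made up for illustration only
quantities = [1, 2, 2, 3, 3, 3, 4, 5, 2, 3, 7, 2]

# Bin size of one: one bar per distinct quantity
print(bin_counts(quantities, 1))

# Larger bin size: fewer, wider bars and less detail
print(bin_counts(quantities, 3))
```

With a bin size of one, each bar corresponds to a single quantity; with a bin size of three, quantities 3, 4, and 5 share the bar that starts at 3, which is why showing only the starting value becomes harder to read as bins get wider.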
Remember that a very small bin size can make your data look noisy, and that a bin size that is too large removes detail. It is, however, cumbersome to edit the bin size manually each time to see what the histogram looks like. A useful trick for EDA with histograms in Tableau is setting the bin size using a parameter. A Tableau parameter is a variable that is used in calculations or filters and that can easily be modified to see its effect. By choosing 'Create a New Parameter' instead of a constant value for the bin size, you'll be able to specify a range of values over which the bin size can be varied. This will create a parameter field at the bottom left, which you can show on the viz and use as a slider. Cycling through the bin size parameter now gives you an instant idea of how the histogram changes with different bin widths. Time to try it yourself!
2. Let's practice!