Formatting Histograms
1. Formatting Histograms
In this demo, we're going to create a histogram in Databricks and use it to uncover the story behind TPCH orders. A histogram helps you visualize how data is distributed, which makes key patterns easier to spot.

Again, in the SQL Editor, we enter a query to retrieve the total price for each order in 1995. Save this query as "2_2_demo". After running the query, we'll see the raw results, which include only one column: the total price of each order.

Next, we create a histogram. Set the visualization type to "Histogram" and choose "Total_Price" as the X column. We'll see a preview on the right displaying the distribution of order counts across different price ranges. Each bin groups a range of values, and the height of each bar shows how many data points fall within that range. If we hover over the highest bar, we can see that it covers the price range between 57,314.51 and 113,781.81.

In Databricks, we have limited control over histogram attributes; the number of bins is the key setting we can adjust. The number of bins directly impacts how the data is represented. Fewer bins group data into broad categories, simplifying the chart but possibly hiding important trends. More bins show finer detail by dividing the data into smaller ranges, which can clutter the chart or emphasize noise. Finding the right balance is essential to make the data clear and insightful.

Let's adjust the bins in our histogram. With 10 bins, the data is grouped into broad ranges, giving us a general view of the distribution. Now, increasing to 20 bins, we notice more detailed trends, particularly an even distribution in the price range between 29,080.86 and 198,482.76, across the second to sixth bars. This reveals previously hidden variations, helping us better understand the order data.

As you continue working with histograms and other visualizations, remember that small adjustments can greatly enhance your ability to tell a compelling data story.
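To make the effect of the bin count concrete, here is a minimal Python sketch of equal-width binning, run outside Databricks. The function name `equal_width_bins` and the price list are made up for illustration; the values are not TPCH data, and Databricks may compute its bin edges differently.

```python
def equal_width_bins(values, num_bins):
    """Return (edges, counts) for equal-width bins spanning min..max.

    Hypothetical helper for illustration; not a Databricks API.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # Clamp so the maximum value falls in the last bin.
            idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return edges, counts


# Made-up order totals, NOT real TPCH data.
prices = [1200.0, 15000.0, 32000.0, 48000.0, 61000.0, 75000.0,
          88000.0, 99000.0, 120000.0, 150000.0, 210000.0, 400000.0]

for num_bins in (2, 4):
    edges, counts = equal_width_bins(prices, num_bins)
    print(f"{num_bins} bins:")
    for i, count in enumerate(counts):
        # Each bar covers the range edges[i] to edges[i + 1].
        print(f"  {edges[i]:>10,.2f} to {edges[i + 1]:>10,.2f}: {'#' * count}")
```

With two bins, almost every value lands in the first bar; with four bins, the same data separates into a large low-price group and a thin tail. That is the same trade-off between broad and fine ranges described in the demo.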
In the exercises, let's dive into more ways to use visualizations to tell better stories with your data!
2. Let's practice!