
Interpreting visualizations

1. Interpreting kernel density plots, box plots & radar charts

This is the last section of chapter 3.

2. More visualizations

In this section you will make 4 visuals. The first is a kernel density plot, which is good for understanding distributions. Kernel density plots are like histograms, but you don't have to worry about picking a suitable bin size. Second, you will examine a box plot to compare multiple sentiment distributions at once. Third, you will use the JavaScript radarchart library to create a visual similar to Plutchik’s Wheel of Emotion. You will end this chapter by creating a treemap that visualizes multiple dimensions simultaneously.

3. Kernel density plots vs histogram

A kernel density plot is used for understanding the distribution of a set of values. You may be thinking to yourself… don’t we have histograms for that? Why bother? Well, a kernel density plot can be thought of as a smoothed histogram. Of course histograms are great, but their shape depends on how the values are binned into groups, so the choice of bins can bias the picture.

4. Kernel density plots vs histogram

Consider this distribution. A histogram with 10 bins looks right, but a histogram with only a few bins starts to distort what you know about the overall distribution. In contrast, the kernel density plot doesn’t have bins. This plot estimates the probability density function, which sounds intimidating but is straightforward. The total area under the curve is 1. The probability of a value being between x1 and x2 is the area under the curve between those two points. Since it models a continuous probability instead of discrete counts, the curve is smoothed, not binned. As a result, the visual isn’t subject to binning bias.
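To make the two area statements concrete, here is a minimal sketch in plain Python. It assumes a Gaussian kernel with a hand-picked bandwidth of 0.5, and the polarity scores are invented for illustration; neither comes from the course exercises. The helper names `gaussian_kde` and `area_under` are hypothetical.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x using Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # One small bell curve per sample, summed and scaled.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

def area_under(density, lo, hi, steps=2000):
    """Approximate the area under the curve on [lo, hi] with the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.5 * (density(lo) + density(hi))
    total += sum(density(lo + i * width) for i in range(1, steps))
    return total * width

# Hypothetical polarity scores, invented for illustration only.
scores = [-1.2, -0.4, -0.1, 0.0, 0.3, 0.5, 0.6, 0.9, 1.4, 2.0]
density = gaussian_kde(scores, bandwidth=0.5)  # bandwidth is an assumed smoothing value

total_area = area_under(density, -6.0, 7.0)  # range wide enough to cover both tails
prob_0_to_1 = area_under(density, 0.0, 1.0)  # estimated P(0 <= value <= 1)
print(round(total_area, 3), round(prob_0_to_1, 3))
```

The first number should come out very close to 1, and the second is the estimated chance that a score falls between 0 and 1. Note that the bandwidth plays a smoothing role here, so a different value would change the shape of the curve.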

5. Box plot

A challenge with density plots is that examining multiple distributions simultaneously can be difficult. Another way to examine a distribution is with a box plot, sometimes called a box and whisker plot. Comparing multiple distributions can be easier with box plots because they are compact. Think of a box plot as a summary of a distribution laid on its side. The dark line in the middle represents the distribution’s median value. The box extends to either side and ends at the first quartile below the median and the third quartile above it. Thus 50% of the data is captured within the box itself. Lines, or whiskers, extend from the box in each direction to cover the remaining lower and upper quarters of the data. In many box plots the whiskers are capped, commonly at 1.5 times the length of the box, and any dots beyond the whiskers represent outlier values.
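The pieces of a box plot can be computed by hand. The sketch below assumes the common convention of capping whiskers at 1.5 times the box length and uses the "inclusive" quantile method from Python's `statistics` module; plotting tools may use slightly different quartile rules. The two score lists are invented, and `box_plot_summary` is a hypothetical helper name.

```python
import statistics

def box_plot_summary(values):
    """Compute the median, box edges, whisker ends and outliers for one distribution."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    box_length = q3 - q1  # the interquartile range
    low_fence = q1 - 1.5 * box_length
    high_fence = q3 + 1.5 * box_length
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),    # most extreme value still inside the fence
        "whisker_high": max(inside),
        "outliers": outliers,          # drawn as dots beyond the whiskers
    }

# Hypothetical polarity scores for two documents, invented for illustration only.
polarity = {
    "doc_a": [-3.0, -0.5, -0.2, 0.0, 0.1, 0.3, 0.4, 0.6, 0.8, 4.5],
    "doc_b": [-1.0, -0.8, -0.6, -0.3, -0.1, 0.0, 0.2, 0.3, 0.5, 0.7],
}

summaries = {name: box_plot_summary(scores) for name, scores in polarity.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Because each document reduces to a handful of numbers, you can line up several summaries side by side, which is what makes box plots compact.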

6. Radar Wheel of Emotion

Looking back to Plutchik’s emotional framework, you can emulate something similar with a radar chart. A radar chart, sometimes called a spider chart, lets the audience compare multiple values succinctly. In this case, Plutchik has 8 primary emotions. Making a bar chart with 8 different bars could become cluttered, especially if you were comparing 2 documents… that would mean you have 16 bars! Instead, in a radar chart each document is a single line running around 8 axes. In this exercise you are comparing Huck Finn and Moby Dick across the 8 primary emotions. With a radar chart, you will be able to quickly see which one has more trust or anger words.
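If you are curious how a single line ends up running around 8 axes, the geometry can be sketched in a few lines of Python. This is not the JavaScript radarchart library, only an illustration of the idea. The word counts are invented placeholders for two books, and the helper names `emotion_shares` and `radar_vertices` are hypothetical.

```python
import math

# Plutchik's 8 primary emotions, one per axis.
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

def emotion_shares(counts):
    """Convert raw word counts to shares so books of different lengths are comparable."""
    total = sum(counts[e] for e in EMOTIONS)
    return {e: counts[e] / total for e in EMOTIONS}

def radar_vertices(shares):
    """Place each emotion's share on its own axis, spaced evenly around a circle."""
    points = []
    for i, emotion in enumerate(EMOTIONS):
        angle = 2.0 * math.pi * i / len(EMOTIONS)
        radius = shares[emotion]
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points  # connecting these points in order draws one document's line

# Hypothetical emotion word counts, invented for illustration only.
book_a = {"anger": 120, "anticipation": 200, "disgust": 80, "fear": 150,
          "joy": 180, "sadness": 130, "surprise": 90, "trust": 250}
book_b = {"anger": 300, "anticipation": 220, "disgust": 150, "fear": 330,
          "joy": 200, "sadness": 260, "surprise": 120, "trust": 420}

shares_a = emotion_shares(book_a)
shares_b = emotion_shares(book_b)
vertices_a = radar_vertices(shares_a)
vertices_b = radar_vertices(shares_b)

for emotion in ("trust", "anger"):
    leader = "book_a" if shares_a[emotion] > shares_b[emotion] else "book_b"
    print(emotion, "is proportionally higher in", leader)
```

Using shares instead of raw counts is a design choice in this sketch: it keeps a longer book from dominating every axis simply because it has more words.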

7. Treemaps

Lastly, let's close with a treemap. In a treemap, the audience can consume multiple dimensions easily. There are 3 adjustable dimensions in a treemap. One is the size of each box. I like to use the number of words in a document to denote size. This helps convey the author’s effort: the more words, the larger the effort. Another dimension, color, shows the polarity. Lastly, the boxes of the treemap are grouped by a third dimension. Suppose you have a collection of Facebook posts, tweets and Pinterest descriptions. A treemap can examine all these text sources while retaining the group. A long negative Facebook post would be in the Facebook grouping, be larger than other boxes in the group and be red since it is negative.
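The three dimensions can be sketched with a very simple layout. The example below assumes a "slice and dice" arrangement, where each group gets a vertical strip and each post gets a slice of that strip; real treemap tools usually use more refined layouts. The posts are invented, the red and green color rule is an assumption, and `treemap_layout` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical posts as (group, label, word_count, polarity), invented for illustration.
posts = [
    ("facebook", "fb_1", 400, -0.8),
    ("facebook", "fb_2", 100, 0.3),
    ("twitter", "tw_1", 30, 0.5),
    ("twitter", "tw_2", 20, -0.2),
    ("pinterest", "pin_1", 50, 0.9),
]

def treemap_layout(rows, width=1.0, height=1.0):
    """Lay out one box per post: grouped by source, sized by words, colored by polarity."""
    groups = defaultdict(list)
    for group, label, words, polarity in rows:
        groups[group].append((label, words, polarity))
    total_words = sum(words for _, _, words, _ in rows)

    boxes = []
    x = 0.0
    for group, members in groups.items():
        group_words = sum(words for _, words, _ in members)
        strip_width = width * group_words / total_words  # group strip sized by its words
        y = 0.0
        for label, words, polarity in members:
            box_height = height * words / group_words    # slice of the strip for this post
            color = "red" if polarity < 0 else "green"   # assumed rule: negative is red
            boxes.append({"group": group, "label": label, "x": x, "y": y,
                          "w": strip_width, "h": box_height, "color": color})
            y += box_height
        x += strip_width
    return boxes

boxes = treemap_layout(posts)
for box in boxes:
    print(box["group"], box["label"], round(box["w"] * box["h"], 3), box["color"])
```

Each printed area is the post's share of all words, so the long negative Facebook post ends up as the largest box, colored red, inside the Facebook strip.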

8. Let's practice!
