
Univariate visualizations

1. Univariate visualizations

Let's learn to create some basic univariate visualizations in Plotly with Python.

2. What are univariate plots?

Univariate plots display only one variable, for example, data on your height alone. They give insight into the distribution of that variable. Common univariate plots include bar charts, histograms, box plots, and density plots. We already plotted a bar chart in the previous video, so let's jump straight into histograms.

3. Histograms

Whilst a histogram may look like a bar chart, it has some key differences. Each column, called a bin, represents a range of values that samples could have for a particular variable. The height of each bin usually indicates the number of samples that fall within that range, though other aggregations are possible. You can choose the bins yourself or have Plotly choose them for you.
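To make the idea of binning concrete, here is a minimal sketch in plain Python that counts samples per bin by hand. The body-mass values and the 500 g bin width are made up for illustration; Plotly may choose different bin edges when it bins automatically.

```python
from collections import Counter

# Made-up body-mass values in grams, for illustration only.
body_mass_g = [3200, 3450, 3700, 3750, 3900, 4100, 4300, 4650, 5000, 5400]

# Each bin covers a 500 g range (an arbitrary choice for this sketch).
bin_width = 500

# Map each sample to the lower edge of its bin, then count samples per bin.
bin_counts = Counter((value // bin_width) * bin_width for value in body_mass_g)

# Print one line per bin: its range and how many samples fell inside it.
for lower_edge in sorted(bin_counts):
    upper_edge = lower_edge + bin_width
    print(f"[{lower_edge}, {upper_edge}) g: {bin_counts[lower_edge]} samples")
```

Each count here is what the height of the corresponding histogram column would show.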

4. Our dataset

For the following exercises, we will use a dataset from scientific research on penguins! It contains various body measurements from different types of penguins.

5. Histograms with plotly.express

Let's create a histogram of the penguins' body mass using plotly.express. Here, we use px.histogram. As with the bar chart, we first set the DataFrame containing our data. Then, we specify the column to aggregate into our bins. You can optionally specify the number of bins. Here is our graph. Notice that the hover automatically displays the bin range and the count of samples inside that bin.

6. Useful histogram arguments

There are a variety of helpful plotly.express arguments that you can use to enhance your histogram plots. The orientation argument sets the plot to be horizontal or vertical, and the histfunc argument changes the way values are aggregated within the bins. Check the documentation for many more options.

7. Box (and whisker) plots

A box and whisker plot summarizes a variable using quartile calculations. The colored area in the middle is the interquartile range. Its top line represents the third quartile, or 75th percentile: the value below which 75% of all data points fall. The line in the middle of the box represents the median. The bottom line of the box represents the first quartile, or 25th percentile. The whiskers extending above and below the box end at the maximum and minimum values, excluding outliers, which are identified according to a special definition. Finally, any extra dots beyond the whiskers are outliers.
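The numbers behind a box plot can be sketched with the standard library. This example assumes the common rule that outliers lie more than 1.5 times the interquartile range beyond the box; the flipper lengths are made up, and the quartile method used here (the default of statistics.quantiles) may give slightly different values from the ones Plotly draws.

```python
import statistics

# Made-up flipper lengths in millimetres, for illustration only.
flipper_mm = [172, 181, 186, 190, 193, 195, 197, 201, 210, 240]

# Split the data into four equal parts: first quartile, median, third quartile.
q1, median, q3 = statistics.quantiles(flipper_mm, n=4)

# The interquartile range is the height of the box.
iqr = q3 - q1

# Assumed outlier rule: anything further than 1.5 * IQR from the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in flipper_mm if v < lower_fence or v > upper_fence]

# The whiskers end at the most extreme values that are not outliers.
non_outliers = [v for v in flipper_mm if lower_fence <= v <= upper_fence]
whisker_low, whisker_high = min(non_outliers), max(non_outliers)

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"whiskers: {whisker_low} to {whisker_high}, outliers: {outliers}")
```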

8. Box plots with plotly.express

Let's create a box plot using plotly.express to visualize flipper lengths. For this, we use px.box and only need to specify the DataFrame and the variable to aggregate. For a box plot, the key argument is y. This is what is produced.

9. Useful box plot arguments

Here are some other useful arguments for a plotly.express box plot. Using the hover_data argument, you can set other variables to appear in the hover information. You can set the points argument to control which data points are drawn, which helps when analyzing outliers. Check out the docs for many more arguments.

10. Let's practice!

Let's practice creating some univariate plots using Plotly in Python!
