Bar plots
1. Bar plots
In the last video, we saw that histograms are a specialized version of bar plots, where we have binned a continuous X-axis.
2. Bar Plots, with a categorical X-axis
Classic bar plots have a categorical X-axis. Here we need to use either geom_bar or geom_col.
3. Bar Plots, with a categorical X-axis
geom_bar will count the number of cases in each category of the variable mapped to the x-axis, whereas geom_col will just plot the actual value it finds in the data set.
4. Bar Plots, with a categorical X-axis
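As a rough illustration of the difference between geom_bar and geom_col, here is a minimal sketch. The obs and counts data frames and the diet column are made up for this example and do not come from the video.

```r
library(ggplot2)

# Hypothetical toy data: one row per observed animal
obs <- data.frame(diet = c("herbi", "herbi", "carni", "omni", "omni", "omni"))

# geom_bar() counts the rows in each category for us
p_bar <- ggplot(obs, aes(x = diet)) +
  geom_bar()

# The same counts, calculated beforehand and stored in a column named n
counts <- as.data.frame(table(diet = obs$diet), responseName = "n")

# geom_col() plots the values in n exactly as it finds them
p_col <- ggplot(counts, aes(x = diet, y = n)) +
  geom_col()
```

Both plots should show the same bar heights. The only difference is whether ggplot2 does the counting or we do.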
All the positions we just looked at are available in bar plots. You will encounter two types of bar plots in widespread use, depicting either absolute counts or distributions. Let's take a look at them in turn.
5. Habits of mammals
We'll use a data set containing information on the REM sleep time and eating habits of a variety of mammals.
6. Bar plot
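The bar plot for this slide can be approximated with the sketch below. It assumes the mammal data is the msleep data set that ships with ggplot2, with eating behavior stored in its vore column. The video does not name the data set, so treat both as assumptions.

```r
library(ggplot2)

# Assumption: msleep (bundled with ggplot2) stands in for the video's data.
# Drop the mammals whose eating behavior is unknown.
sleep <- msleep[!is.na(msleep$vore), ]

# Only x is mapped; geom_bar() works out the bar heights by counting rows
p <- ggplot(sleep, aes(x = vore)) +
  geom_bar()

# Inspect the statistic that geom_bar() applies by default
default_stat <- class(geom_bar()$stat)[1]
```

Calling layer_data(p) returns the counts that ggplot2 calculated, which is a quick way to see what happened under the hood.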
In this bar plot, we've split our data set according to eating behavior and simply asked how many observations we have in each category. Notice that something very similar to what happened with geom_histogram has happened here. The data was counted and that count was plotted, so once again some statistics were calculated under the hood. In this case, the stat argument has a default value of "count". These kinds of plots are useful for getting a quick visual output, but we often see another type of bar plot, one which tries to depict the distribution of a data set. Let's consider a scenario similar to what we saw with the point geom: we have a data set with the summary values already calculated.
7. Plotting distributions instead of absolute counts
This is often the case: you will have descriptive statistics already calculated. But remember that we can also make ggplot do this on the fly.
8. Plotting distributions
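The plot described on this slide can be sketched roughly as follows. This assumes the built-in iris data, a summary column named stdev alongside avg, and error bars that span one standard deviation either side of the mean. The exact fill color and tip width are illustrative choices.

```r
library(ggplot2)

# Summary statistics per species, calculated before plotting
avg <- tapply(iris$Sepal.Width, iris$Species, mean)
stdev <- tapply(iris$Sepal.Width, iris$Species, sd)
iris_summ <- data.frame(
  Species = names(avg),
  avg = as.numeric(avg),
  stdev = as.numeric(stdev)  # assumed column name, not from the video
)

p <- ggplot(iris_summ, aes(x = Species, y = avg)) +
  # geom_col() because the bar heights are already in the data
  geom_col(fill = "grey50") +
  # ymin and ymax are aesthetics specific to geom_errorbar();
  # width narrows the tips of the error bars
  geom_errorbar(aes(ymin = avg - stdev, ymax = avg + stdev), width = 0.1)
```

Using geom_bar here instead of geom_col would fail with its default stat, because stat "count" does not accept a y aesthetic.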
If we want to plot the average sepal width for each species, we can map the avg column in our dataset onto the y aesthetic. In this case we need to use geom_col. If we want to add error bars, there is another geom for that, appropriately called geom_errorbar. Here we again need to specify some aesthetics specific to this geom, namely ymin and ymax. On top of that, I've set the width of the error bar tips to be narrow and I've made the fill of the bars themselves gray, so that we can see the error bars. This is the kind of plot that you'll typically see in scientific publications, but it's pretty terrible. There is a special name in the data vis community for these types of plots: they're called dynamite plots, as in Wile E. Coyote and the Road Runner and a giant stick of Acme dynamite. They are strongly discouraged for many reasons, which we'll explore in the data vis best practices chapter at the end of the second course.
9. Let's practice!
Let's head over to the exercises and take a look at bar plots in action.