Exercise

Useful options for the boxplot() function

The boxplot() function shows how the distribution of a numerical variable y differs across the unique levels of a second variable, x. To be effective, x should have only a few unique levels: 10 or fewer works well, while many more than this make the plot difficult to interpret.

The boxplot() function also has several optional parameters, and this exercise asks you to use three of them to obtain a more informative plot:

  • varwidth allows for variable-width boxplots that show the different sizes of the data subsets.
  • log allows for log-transformed y-values.
  • las allows for more readable axis labels.
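
As a rough illustration of these three options, the sketch below applies them to the built-in mtcars data frame rather than to the data used in this exercise. The choice of hp and cyl is an assumption made only for demonstration. With varwidth = TRUE, box widths are proportional to the square roots of the group sizes; log = "y" requires strictly positive y-values; las = 1 draws all tick labels horizontally.

```r
# Illustration on the built-in mtcars data (not the exercise data):
# horsepower (hp) across the distinct numbers of cylinders (cyl).
bp <- boxplot(hp ~ cyl, data = mtcars,
              varwidth = TRUE,  # box widths reflect group sizes
              log = "y",        # log-scaled y-axis; hp is strictly positive
              las = 1)          # horizontal tick labels on both axes

# boxplot() invisibly returns the statistics it plotted,
# including the group labels and the group sizes.
bp$names
bp$n
```

Inspecting bp$n is a quick way to check whether the differences in box width match the subset sizes you expect.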

This exercise also illustrates the use of the formula interface: y ~ x indicates that we want a boxplot of the y variable across the different levels of the x variable. See ?boxplot for more details.
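
To make the formula interface more concrete, the sketch below compares it with the equivalent call on a list of groups, using the built-in ToothGrowth data frame (an assumption chosen for illustration, not the exercise data). Both calls should summarize the same three groups.

```r
# Formula interface: tooth length (len) across the distinct dose levels.
by_formula <- boxplot(len ~ dose, data = ToothGrowth)

# Equivalent call without a formula: split len into one vector per dose,
# then pass the resulting list to boxplot().
by_list <- boxplot(split(ToothGrowth$len, ToothGrowth$dose))

# Both calls return the same five-number summaries, one column per group.
all.equal(by_formula$stats, by_list$stats)
```

The formula version is usually easier to read, and it labels the axes with the variable names from the formula.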

Instructions
  • Using the formula interface, create a boxplot showing the distribution of the numerical crim values across the distinct rad values in the Boston data frame. Set the varwidth parameter to obtain variable-width boxes, specify a log-transformed y-axis, and set the las parameter to 1 to obtain horizontal tick labels on both the x- and y-axes.
  • Use the title() function to add the title "Crime rate vs. radial highway index".