Exercise

Hexagonal binning of bivariate data

Scatter plots show the relationship between two continuous variables by plotting data values as points in a Cartesian coordinate system. However, for datasets with a large number of observations, these points may overlap and consequently obscure important patterns in the data. One common remedy is to plot some form of bivariate density estimate instead of the raw data, just as histograms and kernel density plots are used for univariate data.

You have already used panel.smoothScatter() to produce bivariate kernel density plots. A different graphical design, analogous to histograms, divides the plane into hexagonal bins and uses color or radius to indicate the count in each bin.

Hexagonal binning and plotting is implemented in the R package hexbin, which also includes the high-level function hexbinplot() for creating conditional hexbin plots using the lattice framework. Your task for this exercise is to use hexbinplot() to create a plot of death rates among males and females in the USCancerRates dataset.

The formula and data arguments in a hexbinplot() call are interpreted in the same way as in xyplot(). You will also use the following optional arguments:

  • The type argument can be set to "r" to add a regression line.

  • The trans argument can be a function that is applied to the observed counts before creating bands for different colors. By default, the range of counts is divided up evenly into bands, but taking the square root of the counts, for example, emphasizes differences in the lower range of counts more.

  • The inv argument gives the inverse function of trans, so that transformed counts can be converted back before being shown in the legend.
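
As a rough illustration of how these three arguments fit together, here is a minimal sketch on simulated data. The data frame d and its columns x and y are made up for this example and are not part of the exercise; the sketch assumes the hexbin package is installed.

```r
library(hexbin)

# Simulated data, used only for illustration (not the exercise dataset)
set.seed(1)
n <- 10000
d <- data.frame(x = rnorm(n))
d$y <- 0.8 * d$x + rnorm(n, sd = 0.5)

# Formula and data are given as in xyplot();
# type = "r" adds a regression line,
# trans is applied to the counts before color bands are formed,
# inv undoes trans so the legend can show counts on the original scale
p <- hexbinplot(y ~ x, data = d,
                type = "r",
                trans = sqrt,
                inv = function(x) x^2)
print(p)
```

The call returns a lattice (trellis) object, so it can be stored in a variable and printed later, just like the result of xyplot().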

Instructions

The USCancerRates dataset is pre-loaded.

  • Load the hexbin package.

  • Draw a hexbin plot of rate.female on the y-axis against rate.male on the x-axis.

  • Use the type argument to add a regression line.

The trans and inv functions are pre-specified, so that count bands are equispaced after the square-root transformation.
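
To see what "equispaced after the square-root transformation" means, the following base R sketch works through the arithmetic. It is not the internal code of hexbin; the values max_count and n_bands are hypothetical and chosen only for illustration.

```r
# Hypothetical settings, chosen only for illustration
max_count <- 100   # largest bin count
n_bands <- 5       # number of color bands

trans <- sqrt
inv <- function(x) x^2

# Breakpoints that are evenly spaced on the square-root scale
breaks_trans <- seq(trans(1), trans(max_count), length.out = n_bands + 1)

# The same breakpoints converted back to the count scale,
# which is the scale a legend would display
breaks_count <- inv(breaks_trans)

print(breaks_trans)
print(breaks_count)

# Band widths on the count scale grow with the counts
print(diff(breaks_count))
```

The bands have equal width on the transformed scale but are narrower at low counts on the original scale, which is why the square-root transformation brings out differences among sparsely populated bins.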