Exercise

Density plots

Histograms are often used to estimate the probability density function of the distribution underlying the observed data. One big problem with histograms is that their appearance can change considerably with the bin width, so you have to experiment with different widths. An alternative approach to the same problem is a kernel density estimate. Kernel density estimates are more computationally intensive, but they give smoother, more natural-looking estimates.
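To make the bin-width problem concrete, here is a minimal sketch. It assumes the built-in faithful dataset (not part of this exercise) and uses the nint argument of histogram() to control the number of bins; the kernel density estimate comes from base R's density(), which picks a bandwidth automatically.

```r
library(lattice)

# Same data, two bin counts: the apparent shape changes with the binning.
h_coarse <- histogram(~ eruptions, data = faithful, nint = 5)
h_fine   <- histogram(~ eruptions, data = faithful, nint = 30)

# Kernel density estimate from base R (stats::density);
# the bandwidth is chosen automatically by default.
kde <- density(faithful$eruptions)

print(h_coarse)
print(h_fine)
plot(kde, main = "Kernel density estimate of eruption durations")
```

Comparing the two histograms with the density curve shows how much of a histogram's shape can come from the binning rather than from the data.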

The lattice function densityplot() creates kernel density plots. Its formula interface is similar to that of histogram(); the formula should be written as ~ x to plot the values of the x column along the x-axis, and the estimated density on the y-axis.
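As an illustration of the formula interface, the sketch below plots the waiting column of the built-in faithful dataset. The dataset is an assumption chosen for illustration and is not the one used in this exercise.

```r
library(lattice)

# One-sided formula: the variable after ~ goes on the x-axis,
# and the estimated density is drawn on the y-axis.
p <- densityplot(~ waiting, data = faithful)
print(p)
```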

A useful optional argument for densityplot() is plot.points, which can take values

  • TRUE to plot the data points along the x-axis in addition to the density (recent versions of lattice document "jitter", not TRUE, as the default);

  • FALSE to suppress plotting the data points; and

  • "jitter", to plot the points along the x-axis but with some random jittering in the y-direction so that overlapping points are easier to see.
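The three settings listed above can be compared side by side. This is a sketch only, again assuming the built-in faithful dataset rather than the exercise data; the object names are arbitrary.

```r
library(lattice)

# Data points drawn along with the density curve
p_points <- densityplot(~ waiting, data = faithful, plot.points = TRUE)

# Density curve only, no data points
p_none <- densityplot(~ waiting, data = faithful, plot.points = FALSE)

# Data points drawn with random vertical jitter to reduce overlap
p_jitter <- densityplot(~ waiting, data = faithful, plot.points = "jitter")

print(p_points)
print(p_none)
print(p_jitter)
```

Because the jitter is random, the vertical positions of the points in the last plot will differ from run to run; the density curve itself is unaffected.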

Instructions
  • Use the densityplot() function to create a kernel density estimate of the distribution of ozone concentration in the airquality dataset.

  • Use the plot.points argument to show the ozone values with jittering.