Exercise

Constructing strip plots

Regressions are useful for understanding relationships between two continuous variables. Often, however, we want to explore how the distribution of a single continuous variable is affected by a second, categorical variable. Seaborn provides a variety of plot types for making these comparisons between univariate distributions.

The strip plot is one way of visualizing this kind of data. It plots the values of the continuous variable for each category as individual data points. For vertical strip plots (the default), the continuous values are laid out parallel to the y-axis and the distinct categories are spaced out along the x-axis.

  • For example, sns.stripplot(x='type', y='length', data=df) produces a sequence of vertical strip plots of length distributions grouped by type (assuming length is a continuous column and type is a categorical column of the DataFrame df).
  • Overlapping points can be difficult to distinguish in strip plots. The argument jitter=True helps spread out overlapping points.
  • Other styling arguments can be passed to sns.stripplot(), e.g., marker, color, size, etc.
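
The call described above can be tried on a small made-up dataset. This is a minimal sketch, not part of the exercise: the DataFrame df, its values, and the output file name are assumptions chosen purely for illustration, with type as the categorical column and length as the continuous one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: 'type' is categorical, 'length' is continuous.
# Repeated values are included on purpose so that some points overlap.
df = pd.DataFrame({
    "type": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "length": [1.0, 1.1, 1.1, 2.0, 2.4, 2.4, 3.2, 3.3, 3.9],
})

fig, ax = plt.subplots()

# One vertical strip per category; jitter spreads out overlapping points
sns.stripplot(x="type", y="length", data=df, jitter=True, ax=ax)

ax.set_title("length by type")
fig.savefig("strip_example.png")  # hypothetical output file name
```

Each category on the x-axis gets its own strip of points, and the repeated length values are nudged sideways by the jitter rather than drawn on top of each other.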
Instructions
  • In the first row of subplots, make a strip plot showing distribution of 'hp' values grouped horizontally by 'cyl'.
  • In the second row of subplots, make a second strip plot with improved readability. In particular, you'll call sns.stripplot() again, this time adding jitter=True and decreasing the point size to 3 using the size parameter.
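
One way the two-row layout might look is sketched below. The DataFrame cars and its values are hypothetical stand-ins for the exercise data, and the subplot arrangement created with plt.subplots() is an assumption, since the exercise's own setup code is not shown here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the exercise data: 'cyl' is the grouping
# column and 'hp' is the continuous column.
cars = pd.DataFrame({
    "cyl": [4, 4, 4, 4, 6, 6, 6, 8, 8, 8],
    "hp": [88, 90, 90, 95, 100, 105, 105, 150, 150, 165],
})

# Two rows of subplots, one strip plot per row
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, figsize=(6, 8))

# First row: plain strip plot of 'hp' grouped by 'cyl'
sns.stripplot(x="cyl", y="hp", data=cars, ax=ax_top)

# Second row: same data with jitter enabled and smaller points (size=3)
sns.stripplot(x="cyl", y="hp", data=cars, jitter=True, size=3, ax=ax_bottom)

fig.tight_layout()
```

Depending on the installed seaborn version, jitter may already be enabled by default, in which case the visible difference between the two rows comes mainly from the smaller point size.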