Exercise

Constructing swarm plots

As you have seen, a strip plot can be visually crowded even with jitter applied and smaller point sizes. An alternative is provided by the swarm plot (sns.swarmplot()), which is very similar but spreads out the points to avoid overlap and provides a better visual overview of the data.

  • The syntax for sns.swarmplot() is similar to that of sns.stripplot(), e.g., sns.swarmplot(x='type', y='length', data=df).
  • The orientation of a strip/swarm plot can usually be inferred from which columns of the DataFrame data are passed as x and y: the continuous variable is spread along the axis it is assigned to. The orientation can also be set explicitly using orient='h' (horizontal) or orient='v' (vertical).
  • A second grouping can be added with the hue keyword. For instance, sns.swarmplot(x='type', y='length', data=df, hue='build year') makes a swarm plot from the DataFrame df with the 'length' column values spread out vertically, grouped horizontally by the column 'type', and each point colored by the categorical column 'build year'.
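
The call described in the bullets above can be sketched as follows. This is a minimal illustration, not the course's own code: the DataFrame df is a small made-up stand-in, and only its column names ('type', 'length', 'build year') come from the text.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in data; only the column names come from the text above
df = pd.DataFrame({
    "type": ["sloop", "sloop", "sloop", "ketch", "ketch", "ketch"],
    "length": [8.2, 9.1, 10.4, 11.0, 12.3, 13.5],
    "build year": ["1990s", "2000s", "1990s", "2000s", "1990s", "2000s"],
})

fig, ax = plt.subplots()

# x is categorical and y is continuous, so the swarm is drawn vertically;
# hue colors each point by its 'build year' category and adds a legend
sns.swarmplot(x="type", y="length", data=df, hue="build year", ax=ax)

# In an interactive session, call plt.show() to display the figure
```

Because the continuous column is passed as y, no orient argument is needed here; swapping x and y would lay the same plot out horizontally.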

In this exercise, you'll use the auto DataFrame again to illustrate the use of sns.swarmplot() with grouping by hue and with explicit specification of the orientation using the keyword orient.

Instructions
  • In the first row of subplots, make a swarm plot showing the distribution of 'hp' values grouped horizontally by 'cyl'.

  • In the second row of subplots, make a second swarm plot with horizontal orientation (i.e., grouped vertically by 'cyl' with 'hp' values spread out horizontally).

    • In addition to reversing the columns for the x and y parameters, you will need to specify the orient parameter to explicitly set the horizontal orientation.
    • Color the points by 'origin' (refer to the text above if you don't know how to do this).
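
One way the two-row layout might look is sketched below. The auto DataFrame here is a tiny made-up stand-in (the real dataset is not part of this text), and the two-row figure created with plt.subplots() is an assumption about how the subplots are set up; only the column names 'hp', 'cyl', and 'origin' come from the instructions.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the auto DataFrame; the values are invented
auto = pd.DataFrame({
    "hp": [68, 75, 88, 95, 105, 110, 150, 165, 190],
    "cyl": [4, 4, 4, 6, 6, 6, 8, 8, 8],
    "origin": ["Asia", "Europe", "US", "Asia", "US", "US", "US", "US", "US"],
})

# Two rows, one column (assumed layout)
fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, ncols=1, figsize=(6, 8))

# First row: 'hp' spread vertically, grouped horizontally by 'cyl'
sns.swarmplot(x="cyl", y="hp", data=auto, ax=ax_top)

# Second row: columns swapped, orientation set explicitly because both
# columns are numeric; points colored by 'origin'
sns.swarmplot(x="hp", y="cyl", data=auto, hue="origin", orient="h", ax=ax_bottom)

fig.tight_layout()
# In an interactive session, call plt.show() to display the figure
```

The explicit orient='h' matters in the second call: both 'hp' and 'cyl' are numeric, so the column order alone does not tell seaborn which one to treat as the grouping variable.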