Exercise

EDA: Plot all your data

To get a graphical overview of a dataset, it is often useful to plot all of your data. In this exercise, plot all of the splits for all female swimmers in the 800 meter heats. The data are available in the NumPy arrays split_number and splits. The arrays are organized such that splits[i,j] is the split time of swimmer i at split number split_number[j].
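To make the array layout concrete, here is a minimal sketch using small made-up arrays. The values, the number of swimmers, and the split numbers are all hypothetical and stand in for the exercise's real data; only the names split_number and splits come from the exercise.

```python
import numpy as np

# Hypothetical stand-in data: 3 swimmers (rows) and 4 splits (columns).
# The times are invented for illustration, not taken from the real dataset.
split_number = np.array([3, 4, 5, 6])
splits = np.array([
    [31.2, 31.5, 31.4, 31.7],  # swimmer 0
    [32.0, 32.1, 32.4, 32.3],  # swimmer 1
    [30.8, 31.0, 31.1, 31.3],  # swimmer 2
])

# splits[i, j] is the time of swimmer i at split number split_number[j].
i, j = 1, 2
print(split_number[j], splits[i, j])

# One row per swimmer, one column per entry of split_number.
print(splits.shape)
```

Each row of splits is therefore one swimmer's full set of split times, which is what a loop over splits yields on each iteration.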

Instructions

  • Write a for loop over splits, so that each iteration handles the set of splits for one swimmer. In the loop:
    • Plot the split time versus split number. Use the linewidth=1 and color='lightgray' keyword arguments.
  • Compute the mean split time for each split number. You can do this using the np.mean() function with the axis=0 keyword argument. This tells np.mean() to average down the rows, that is, across swimmers, which gives one mean split time per split number.
  • Plot the mean split times (y-axis) versus split number (x-axis) using the marker='.', linewidth=3, and markersize=12 keyword arguments.
  • Label the axes and show the plot.
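
The instructions above can be sketched as follows. This is one possible approach, not the reference solution. Because the real arrays are not reproduced here, the sketch generates synthetic splits with a random number generator. The number of swimmers, the range of split numbers, the time values, and the axis label text (including seconds as the unit) are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs anywhere; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (hypothetical): 8 swimmers, split numbers 3 to 14.
rng = np.random.default_rng(0)
split_number = np.arange(3, 15)
splits = 32.0 + rng.normal(0.0, 0.5, size=(8, len(split_number)))

# Plot each swimmer's splits as a thin light-gray line.
for splitset in splits:
    _ = plt.plot(split_number, splitset, linewidth=1, color='lightgray')

# Average across swimmers (axis=0): one mean per split number.
mean_splits = np.mean(splits, axis=0)

# Overlay the mean split times as a thicker line with dot markers.
_ = plt.plot(split_number, mean_splits, marker='.', linewidth=3, markersize=12)

# Label the axes and show the plot.
_ = plt.xlabel('split number')
_ = plt.ylabel('split time (s)')
plt.show()
```

Drawing the individual swimmers in light gray first keeps them in the background, so the mean curve plotted afterwards stands out on top.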