Exercise

Sample size for a t-test

Now that we've seen the importance of sample size, let's take another look at the same athletes dataset and determine the sample size we would need to detect a statistically significant difference.

Boxplots of body weights of Olympic athletes from two sports

The boxplot shows the difference in body weight between the two sports, using all 2830 rows from the athletes dataset. The difference between the groups looks quite small. Determine the sample size we would need for an 80% chance of detecting a small effect size (0.4) between these two groups. statsmodels.stats.power and pandas have been loaded for you as pwr and pd.

Instructions
  • Set effect, power, and alpha to 0.4, 0.8 and 0.05, respectively.
  • Calculate the ratio using the relative lengths of the series for swimming (swimmercount) compared to athletics (athletecount) competitors.
  • Initialize the analysis, solve the equation for sample size, and print the output.
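
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution: the two group sizes are hypothetical placeholders (in the exercise they would come from the lengths of the swimming and athletics series), and nobs2 and achieved are names introduced here only for illustration. The sketch assumes a two-sided independent-samples t-test via TTestIndPower, where ratio is the size of sample 2 relative to sample 1, so the solved value is the number of observations needed in sample 1 (athletics, under this choice of ratio).

```python
import math

import statsmodels.stats.power as pwr

# Step 1: effect size, desired power, and significance level.
effect = 0.4
power = 0.8
alpha = 0.05

# Step 2: ratio of group sizes.
# HYPOTHETICAL counts, used only so the sketch runs on its own;
# in the exercise these come from the athletes dataset.
swimmercount = 1000
athletecount = 1830
ratio = swimmercount / athletecount  # sample 2 (swimming) relative to sample 1 (athletics)

# Step 3: initialize the analysis and solve for the size of sample 1.
# Leaving nobs1 as None tells solve_power which quantity to solve for.
analysis = pwr.TTestIndPower()
nobs1 = float(
    analysis.solve_power(
        effect_size=effect,
        nobs1=None,
        alpha=alpha,
        power=power,
        ratio=ratio,
    )
)

# The implied size of sample 2 follows from the ratio.
nobs2 = nobs1 * ratio

print("Sample 1 (athletics) size needed:", math.ceil(nobs1))
print("Sample 2 (swimming) size implied:", math.ceil(nobs2))

# Sanity check: plugging the solved size back in should give roughly the target power.
achieved = analysis.power(effect_size=effect, nobs1=nobs1, alpha=alpha, ratio=ratio)
print("Achieved power:", round(achieved, 3))
```

Rounding up with math.ceil is a common convention, since a fractional observation cannot be collected and rounding down would leave the design slightly underpowered.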