Exercise

# Sample size for a t-test

Now that we've seen the importance of sample size, let's have another look at the same `athletes` dataset and see if we can determine the sample size we would need to get a significant result.

The boxplot shows the difference in body weight between sports, using all 2830 rows from the `athletes` dataset. The difference between the groups looks quite small. Determine the sample size we would need to have an 80% chance of detecting a small (0.4) difference between these two samples. `statsmodels.stats.power` and `pandas` have been loaded for you as `pwr` and `pd`.

Instructions


- Set `effect`, `power`, and `alpha` to 0.4, 0.8, and 0.05, respectively.
- Calculate the `ratio` using the relative lengths of the series for swimming (`swimmercount`) compared to athletics (`athletecount`) competitors.
- Initialize the analysis, solve the equation for sample size, and print the output.
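
The three steps above could be sketched as follows. This is one possible approach, not the exercise's official solution. The `athletes` data is not included here, so the block builds a tiny stand-in DataFrame: the column names `Sport` and `Weight` and every value in it are invented for illustration. It also assumes that an independent two-sample t-test (`TTestIndPower`) is the intended analysis.

```python
import pandas as pd
import statsmodels.stats.power as pwr

# Hypothetical stand-in for the `athletes` dataset.
# Column names and values are assumptions, not the real data.
athletes = pd.DataFrame({
    "Sport": ["Swimming"] * 3 + ["Athletics"] * 6,
    "Weight": [78, 81, 74, 70, 66, 72, 69, 75, 68],
})

# Step 1: effect size, desired power, and significance level
effect = 0.4
power = 0.8
alpha = 0.05

# Step 2: ratio of group sizes (swimming relative to athletics)
swimmercount = float(len(athletes[athletes["Sport"] == "Swimming"]))
athletecount = float(len(athletes[athletes["Sport"] == "Athletics"]))
ratio = swimmercount / athletecount

# Step 3: initialize the analysis and solve for the sample size.
# Passing nobs1=None tells solve_power to solve for that argument.
# statsmodels defines ratio as nobs2 / nobs1, so the result is the size
# of the first group (athletics here); the second group is nobs1 * ratio.
analysis = pwr.TTestIndPower()
nobs1 = analysis.solve_power(
    effect_size=effect, nobs1=None, alpha=alpha, power=power, ratio=ratio
)
print(nobs1)
```

With the real dataset, only the construction of `athletes` would change; the ratio would then reflect the actual number of swimming and athletics rows.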