Exercise

# Random sampling

In this exercise, we're going to look at random sampling. You have been provided with a large dataset (`athletes`) containing the details of a large number of American athletes. For the purposes of this exercise, we are interested in differences between the body `Weight` of competitors in swimming and athletics. In order to test this, you'll be using a two-sample t-test. However, you will be performing this test on a random sample of the data. By playing with the random subset chosen, you'll see how randomness affects the results. You will need to extract a random subset of athletes from each event in order to run your test. `pandas`, `scipy.stats`, `plotnine`, and `random` have been loaded into the workspace as `pd`, `stats`, `p9`, and `ran`, respectively.

Instructions 1/2

- Set seed to 0000.
- Create two subset DataFrames (`subsetathl` and `subsetswim`) from `athletes`, with 30 random samples in each.
- Perform a two-sample t-test on the `Weight` column of each subset DataFrame, save it to `t_result`, then print it.
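
The steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution. It makes several assumptions that are not stated in the exercise: the real `athletes` data is replaced with a small synthetic DataFrame, the event is assumed to live in a column named `Sport` with the values `"Athletics"` and `"Swimming"`, and `sample_and_test` is a hypothetical helper. The seed is passed through the `random_state` argument of `DataFrame.sample`, because pandas draws its samples from NumPy's generator rather than Python's `random` module.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for `athletes` (made-up numbers, NOT the real dataset).
# The "Sport" column name and its values are assumptions for illustration.
rng = np.random.default_rng(0)
athletes = pd.DataFrame({
    "Sport": ["Athletics"] * 500 + ["Swimming"] * 500,
    "Weight": np.concatenate([
        rng.normal(70, 10, 500),  # synthetic athletics weights
        rng.normal(72, 10, 500),  # synthetic swimming weights
    ]),
})


def sample_and_test(df, seed, n=30):
    """Hypothetical helper: draw n random rows per sport, then t-test Weight."""
    subsetathl = df[df["Sport"] == "Athletics"].sample(n=n, random_state=seed)
    subsetswim = df[df["Sport"] == "Swimming"].sample(n=n, random_state=seed + 1)
    # Two-sample (independent) t-test on the Weight column of each subset
    t_result = stats.ttest_ind(subsetathl["Weight"], subsetswim["Weight"])
    return subsetathl, subsetswim, t_result


subsetathl, subsetswim, t_result = sample_and_test(athletes, seed=0)
print(t_result)
```

Calling `sample_and_test` again with a different `seed` selects different rows, so the test statistic and p-value will generally change. That variation is the effect of randomness this exercise sets out to explore.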