Exercise

Non-standard estimators

In the last exercise, you ran a simple bootstrap that we will now modify for more complicated estimators.

Suppose you are studying the health of students. You are given the heights and weights of 1000 students and want to estimate two quantities, the median height and the correlation between height and weight, along with a 95% CI for each. Let's use bootstrapping.

Examine the pandas DataFrame df, which holds the heights and weights of 1000 students. Use it to calculate the 95% CI for both the median height and the correlation between height and weight.

Instructions

100 XP
  • Use the .sample() method on df to generate a sample of the data with replacement and assign it to tmp_df.
  • For each resampled dataset tmp_df, calculate the median height and the correlation between heights and weights using .median() and .corr().
  • Append the median height to height_medians and the correlation to hw_corr.
  • Finally, calculate the 95% confidence interval for each of the above quantities by passing the percentiles [2.5, 97.5] to np.percentile().
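
The steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution. The exercise's df is not shown here, so the code builds a synthetic stand-in. The column names heights and weights, the number of resamples (1000), and the use of frac=1 to draw a resample the same size as df are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the exercise's df (the real data is not shown).
# Column names "heights" and "weights" are assumed.
rng = np.random.default_rng(0)
heights = rng.normal(170, 10, size=1000)
weights = 0.5 * heights + rng.normal(0, 5, size=1000)
df = pd.DataFrame({"heights": heights, "weights": weights})

n_boot = 1000  # assumed number of bootstrap resamples
height_medians, hw_corr = [], []

for i in range(n_boot):
    # Resample all rows with replacement; rows stay intact, so each
    # height remains paired with its own weight.
    tmp_df = df.sample(frac=1, replace=True, random_state=i)
    height_medians.append(tmp_df["heights"].median())
    hw_corr.append(tmp_df["heights"].corr(tmp_df["weights"]))

# Percentile bootstrap: the 2.5th and 97.5th percentiles bound the 95% CI.
median_ci = np.percentile(height_medians, [2.5, 97.5])
corr_ci = np.percentile(hw_corr, [2.5, 97.5])

print("Median height 95% CI:", median_ci)
print("Height-weight correlation 95% CI:", corr_ci)
```

Resampling whole rows rather than each column separately matters for the correlation: shuffling heights and weights independently would break the pairing and push the estimated correlation toward zero.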