A two-sample bootstrap hypothesis test for difference of means.

You previously performed a one-sample bootstrap hypothesis test, which cannot be done with a permutation test. The hypothesis that two samples have the same distribution can be tested with a bootstrap test, but a permutation test is preferred because it is more accurate (exact, in fact). Therein lies the limit of a permutation test, however: it is not very versatile. We now want to test the hypothesis that Frog A and Frog B have the same mean impact force, but not necessarily the same distribution. This, too, is impossible with a permutation test.

To do the two-sample bootstrap test, we shift both arrays to have the same mean, since we are simulating the hypothesis that their means are, in fact, equal. We then draw bootstrap samples out of the shifted arrays and compute the difference in means. This constitutes a bootstrap replicate, and we generate many of them. The p-value is the fraction of replicates with a difference in means greater than or equal to what was observed.
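The shifting step can be illustrated with a minimal sketch. The force values below are made up for illustration and are not the course data. The point is that after shifting, both arrays share the pooled mean while each keeps its own spread, which is what "same mean, but not necessarily the same distribution" amounts to.

```python
import numpy as np

# Hypothetical impact forces for illustration only (not the course data)
force_a = np.array([1.6, 0.6, 0.5, 0.9, 1.1, 0.4])
force_b = np.array([0.2, 0.5, 0.3, 0.6, 0.2, 0.4])

# Mean of the pooled data: the common mean assumed under the null hypothesis
mean_force = np.mean(np.concatenate((force_a, force_b)))

# Shift each array so that its mean equals the pooled mean
force_a_shifted = force_a - np.mean(force_a) + mean_force
force_b_shifted = force_b - np.mean(force_b) + mean_force

# The means now agree, but each array keeps its own standard deviation
print('means:', np.mean(force_a_shifted), np.mean(force_b_shifted))
print('stds: ', np.std(force_a_shifted), np.std(force_b_shifted))
```

Subtracting each array's own mean and adding the pooled mean only translates the data, so the shape of each sample is preserved.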

The objects forces_concat and empirical_diff_means are already in your namespace.

This exercise is part of the course

Statistical Thinking in Python (Part 2)


Exercise instructions

  • Compute the mean of all forces (from forces_concat).
  • Generate shifted data sets for both force_a and force_b such that the mean of each is the mean of the concatenated array of impact forces.
  • Generate 10,000 bootstrap replicates of the mean for each of the two shifted arrays. Use the draw_bs_reps() function you wrote.
  • Compute the bootstrap replicates of the difference of means by subtracting the replicates of the shifted impact force of Frog B from those of Frog A.
  • Compute and print the p-value from your bootstrap replicates.
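The steps above could be sketched end to end as follows. This is one possible version under stated assumptions, not the course's reference solution. The body of draw_bs_reps() is an assumed implementation of the helper the exercise refers to, and force_a, force_b, forces_concat, and empirical_diff_means are built here from randomly generated data because the real arrays are not shown.

```python
import numpy as np

def draw_bs_reps(data, func, size=1):
    """Assumed helper: draw `size` bootstrap replicates of func(data)."""
    reps = np.empty(size)
    for i in range(size):
        # Resample with replacement, keeping the original sample size
        reps[i] = func(np.random.choice(data, size=len(data)))
    return reps

# Hypothetical stand-ins for the course data (assumption, for illustration)
np.random.seed(42)
force_a = np.random.normal(0.70, 0.40, size=20)
force_b = np.random.normal(0.42, 0.25, size=20)
forces_concat = np.concatenate((force_a, force_b))
empirical_diff_means = np.mean(force_a) - np.mean(force_b)

# Compute mean of all forces
mean_force = np.mean(forces_concat)

# Generate shifted arrays that share the pooled mean
force_a_shifted = force_a - np.mean(force_a) + mean_force
force_b_shifted = force_b - np.mean(force_b) + mean_force

# Compute 10,000 bootstrap replicates of the mean from each shifted array
bs_replicates_a = draw_bs_reps(force_a_shifted, np.mean, size=10000)
bs_replicates_b = draw_bs_reps(force_b_shifted, np.mean, size=10000)

# Replicates of the difference of means (Frog A minus Frog B)
bs_replicates = bs_replicates_a - bs_replicates_b

# p-value: fraction of replicates at least as large as the observed difference
p = np.sum(bs_replicates >= empirical_diff_means) / len(bs_replicates)
print('p-value =', p)
```

Because both shifted arrays have the same mean, the replicates of the difference are centered near zero, and the p-value measures how often resampling alone produces a difference as large as the observed one.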

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Compute mean of all forces: mean_force
mean_force = ____

# Generate shifted arrays
force_a_shifted = ____
force_b_shifted = ____

# Compute 10,000 bootstrap replicates from shifted arrays
bs_replicates_a = ____
bs_replicates_b = ____

# Get replicates of difference of means: bs_replicates
bs_replicates = ____

# Compute and print p-value: p
p = ____ / ____
print('p-value =', p)