Bootstrap hypothesis test

The permutation test rests on a fairly restrictive null hypothesis: that the heterozygote and wild type bout lengths are identically distributed. Now use a bootstrap hypothesis test to test the weaker hypothesis that the two means are equal, making no other assumptions about the distributions.

This exercise is part of the course

Case Studies in Statistical Thinking

Exercise instructions

  • Make an array, bout_lengths_concat, that contains all of the bout lengths for both wild type (bout_lengths_wt) and heterozygote (bout_lengths_het) using np.concatenate().
  • Compute the mean of all bout lengths from this concatenated array (bout_lengths_concat), storing the result in the variable mean_bout_length.
  • Shift both datasets such that they both have the same mean, namely mean_bout_length. Store the shifted arrays in variables wt_shifted and het_shifted.
  • Use dcst.draw_bs_reps() to draw 10,000 bootstrap replicates of the mean for each of the shifted datasets. Store the respective replicates in bs_reps_wt and bs_reps_het.
  • Subtract bs_reps_wt from bs_reps_het to get the bootstrap replicates of the difference of means. Store the results in the variable bs_reps.
  • Compute the p-value, defining "at least as extreme as" to be that the difference of means under the null hypothesis is greater than or equal to that which was observed experimentally. The variable diff_means_exp from the last exercise is already in your namespace.
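
The shifting and p-value steps above can be written compactly. The notation here is introduced only for illustration and is not from the course: x_i is a bout length in one dataset, \bar{x} is that dataset's own mean, \bar{z} is mean_bout_length, d^*_k is the k-th entry of bs_reps, d_exp is diff_means_exp, and N is the number of replicates (10,000).

```latex
% Shift each dataset so that its mean equals the pooled mean \bar{z}
x_i' = x_i - \bar{x} + \bar{z}

% One-sided p-value: fraction of bootstrap differences of means
% at least as large as the experimentally observed difference
p = \frac{1}{N} \sum_{k=1}^{N} \mathbf{1}\!\left[ d^*_k \ge d_{\mathrm{exp}} \right]
```

Shifting changes only the location of each dataset, so both shifted arrays satisfy the null hypothesis of equal means while keeping their own spread and shape.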

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Concatenate arrays: bout_lengths_concat
bout_lengths_concat = ____((____, ____))

# Compute mean of all bout_lengths: mean_bout_length
mean_bout_length = ____

# Generate shifted arrays
wt_shifted = ____ - np.mean(____) + ____
het_shifted = ____ - ____ + ____

# Compute 10,000 bootstrap replicates from shifted arrays
bs_reps_wt = ____
bs_reps_het = ____

# Get replicates of difference of means: bs_reps
bs_reps = ____ - ____

# Compute and print p-value: p
p = ____(____ >= ____) / len(____)
print('p-value =', p)
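
For reference, here is a minimal sketch of one way the template might be completed. It rests on several assumptions: the two arrays are filled with synthetic placeholder values rather than the course's measurements, diff_means_exp is taken to be the heterozygote mean minus the wild type mean (matching the subtraction order in the instructions), and a small local draw_bs_reps() helper stands in for dcst.draw_bs_reps() so that the block runs with NumPy alone.

```python
import numpy as np

rng = np.random.default_rng(42)


def draw_bs_reps(data, func, size=1):
    """Local stand-in for dcst.draw_bs_reps (an assumption, not the
    course library): apply func to `size` resamples of data, each
    drawn with replacement and the same length as data."""
    n = len(data)
    return np.array([func(rng.choice(data, size=n)) for _ in range(size)])


# Synthetic placeholder data (hypothetical, NOT the course's measurements)
bout_lengths_wt = rng.exponential(scale=2.0, size=200)
bout_lengths_het = rng.exponential(scale=2.3, size=150)

# Observed difference of means; assumed here to be het minus wt
diff_means_exp = np.mean(bout_lengths_het) - np.mean(bout_lengths_wt)

# Concatenate arrays: bout_lengths_concat
bout_lengths_concat = np.concatenate((bout_lengths_wt, bout_lengths_het))

# Compute mean of all bout_lengths: mean_bout_length
mean_bout_length = np.mean(bout_lengths_concat)

# Generate shifted arrays so both have mean equal to mean_bout_length
wt_shifted = bout_lengths_wt - np.mean(bout_lengths_wt) + mean_bout_length
het_shifted = bout_lengths_het - np.mean(bout_lengths_het) + mean_bout_length

# Compute 10,000 bootstrap replicates of the mean from shifted arrays
bs_reps_wt = draw_bs_reps(wt_shifted, np.mean, size=10000)
bs_reps_het = draw_bs_reps(het_shifted, np.mean, size=10000)

# Get replicates of difference of means: bs_reps
bs_reps = bs_reps_het - bs_reps_wt

# Compute and print p-value: p
p = np.sum(bs_reps >= diff_means_exp) / len(bs_reps)
print('p-value =', p)
```

With the real data already in the namespace, the placeholder arrays, the diff_means_exp line, and the local helper would be dropped in favor of the provided variables and dcst.draw_bs_reps().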