Generating a single permutation
In the next few exercises, we will run a significance test using permutation testing. As discussed in the video, we want to see whether there is any difference in the donations generated by the two designs, A and B. Suppose that you have been running both versions for a few days and have generated 500 donations on A and 700 donations on B, stored in the variables donations_A and donations_B.
We first need to generate a null distribution for the difference in means. We will achieve this by generating multiple permutations of the dataset and calculating the difference in means for each case.
First, let's generate one permutation and calculate the difference in means for the permuted dataset.
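The overall plan described above (many permutations, one difference in means per permutation) can be sketched as follows. This is a minimal illustration, not the course's reference solution: the donation arrays are synthetic stand-ins drawn from an exponential distribution, the helper permuted_diff_in_means and the count n_permutations are hypothetical names chosen here, and the A-minus-B sign convention is an assumption.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-ins (assumption): the real donations_A / donations_B are not shown here
donations_A = np.random.exponential(scale=6.0, size=500)
donations_B = np.random.exponential(scale=6.5, size=700)


def permuted_diff_in_means(a, b):
    """Shuffle the pooled data once and return mean(first len(a) values) - mean(rest)."""
    data = np.concatenate([a, b])
    perm = np.random.permutation(len(data))  # random ordering of the indices
    return data[perm[:len(a)]].mean() - data[perm[len(a):]].mean()


n_permutations = 1000  # hypothetical count, chosen only for illustration
null_diffs = np.array([
    permuted_diff_in_means(donations_A, donations_B)
    for _ in range(n_permutations)
])

# Observed difference, using the same A-minus-B convention (assumption)
observed_diff = donations_A.mean() - donations_B.mean()
print("Observed difference in means:", observed_diff)
print("Mean of the null distribution:", null_diffs.mean())
```

Because shuffling breaks any link between a donation and its design, the values in null_diffs should scatter around zero; the observed difference can then be compared against that spread.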
This exercise is part of the course Statistical Simulation in Python.
Exercise instructions
- Concatenate the two arrays donations_A and donations_B using np.concatenate() and assign the result to data.
- Get a single permutation using np.random.permutation() and assign it to perm.
- Calculate the difference in the mean values of permuted_A and permuted_B as diff_in_means.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Store the group sizes, then concatenate donations_A and donations_B into data
len_A, len_B = len(donations_A), len(donations_B)
data = ____([donations_A, donations_B])
# Get a single permutation of the indices 0 .. len_A + len_B - 1
perm = ____(len_A + len_B)
# Split the permuted data into two groups and calculate the difference in means
permuted_A = data[perm[:len_A]]
permuted_B = data[perm[len_A:]]
diff_in_means = ____
print("Difference in the permuted mean values = {}.".format(diff_in_means))
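One possible way to complete the blanks is sketched below, following the functions named in the instructions. Since the real donations_A and donations_B are not shown here, the sketch runs on synthetic exponential data (an assumption), and it takes the difference as A minus B, which the exercise text does not specify.

```python
import numpy as np

np.random.seed(42)  # fixed seed so the sketch is repeatable

# Synthetic stand-ins (assumption): 500 donations on A, 700 on B
donations_A = np.random.exponential(scale=6.0, size=500)
donations_B = np.random.exponential(scale=6.5, size=700)

# Store the group sizes, then concatenate the two arrays into data
len_A, len_B = len(donations_A), len(donations_B)
data = np.concatenate([donations_A, donations_B])

# Get a single permutation of the indices 0 .. len_A + len_B - 1
perm = np.random.permutation(len_A + len_B)

# Split the permuted data into two groups of the original sizes
permuted_A = data[perm[:len_A]]
permuted_B = data[perm[len_A:]]

# Difference in means, taken as A minus B (assumed sign convention)
diff_in_means = np.mean(permuted_A) - np.mean(permuted_B)
print("Difference in the permuted mean values = {}.".format(diff_in_means))
```

Indexing data with perm reorders the pooled donations without dropping or duplicating any value, so the two permuted groups keep the original sizes of 500 and 700.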