Visualizing permutation sampling
To help see how permutation sampling works, in this exercise you will generate permutation samples and look at them graphically.
We will use the Sheffield Weather Station data again, this time considering the monthly rainfall in June (a dry month) and November (a wet month). We expect these might be differently distributed, so we will take permutation samples to see how their ECDFs would look if they were identically distributed.
The data are stored in the NumPy arrays rain_june and rain_november.
As a reminder, permutation_sample() has a function signature of permutation_sample(data_1, data_2) with a return value of permuted_data[:len(data_1)], permuted_data[len(data_1):], where permuted_data = np.random.permutation(np.concatenate((data_1, data_2))).
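The reminder above can be written out as a small function. This is a minimal sketch built directly from the signature and return value quoted in the text; the toy arrays a and b are hypothetical stand-ins, not the Sheffield rainfall data.

```python
import numpy as np


def permutation_sample(data_1, data_2):
    """Generate a permutation sample from two data sets."""
    # Pool both data sets into a single array
    data = np.concatenate((data_1, data_2))

    # Randomly reorder the pooled values
    permuted_data = np.random.permutation(data)

    # Split back into two samples with the original sizes
    return permuted_data[:len(data_1)], permuted_data[len(data_1):]


# Hypothetical toy arrays, used only to illustrate the call
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0, 40.0])
perm_a, perm_b = permutation_sample(a, b)
```

Each returned sample keeps the length of the corresponding input, but its values are drawn from the pooled data, which is what simulates the hypothesis that both months share one distribution.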
This exercise is part of the course Statistical Thinking in Python (Part 2).
Exercise instructions
- Write a for loop to generate 50 permutation samples, compute their ECDFs, and plot them.
  - Generate a permutation sample pair from rain_june and rain_november using your permutation_sample() function.
  - Generate the x and y values for an ECDF for each of the two permutation samples using your ecdf() function.
  - Plot the ECDF of the first permutation sample (x_1 and y_1) as dots. Do the same for the second permutation sample (x_2 and y_2).
- Generate x and y values for ECDFs for the rain_june and rain_november data and plot the ECDFs using respectively the keyword arguments color='red' and color='blue'.
- Label your axes, set a 2% margin, and show your plot. This has been done for you, so just hit submit to view the plot!
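The instructions rely on an ecdf() function that this page does not show. The sketch below is one common way to write it and is an assumption about its form: x holds the sorted data, and y holds the fraction of points less than or equal to each x.

```python
import numpy as np


def ecdf(data):
    """Compute x and y values for a one-dimensional ECDF (assumed form)."""
    # Number of data points
    n = len(data)

    # x values: the data sorted in ascending order
    x = np.sort(data)

    # y values: evenly spaced steps from 1/n up to 1
    y = np.arange(1, n + 1) / n

    return x, y


# Hypothetical toy input, not the rainfall data
x, y = ecdf(np.array([3.0, 1.0, 2.0, 4.0]))
```

For the four-point toy input, x is the sorted array and y steps through 0.25, 0.5, 0.75, and 1.0.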
Hands-on interactive exercise
Try this exercise by completing this sample code.
for _ in ____:
    # Generate permutation samples
    perm_sample_1, perm_sample_2 = ____

    # Compute ECDFs
    x_1, y_1 = ____
    x_2, y_2 = ____

    # Plot ECDFs of permutation sample
    _ = plt.plot(____, ____, marker='.', linestyle='none',
                 color='red', alpha=0.02)
    _ = plt.plot(____, ____, marker='.', linestyle='none',
                 color='blue', alpha=0.02)

# Create and plot ECDFs from original data
x_1, y_1 = ____
x_2, y_2 = ____
_ = plt.plot(x_1, y_1, marker='.', linestyle='none', color='red')
_ = plt.plot(x_2, y_2, marker='.', linestyle='none', color='blue')

# Label axes, set margin, and show plot
plt.margins(0.02)
_ = plt.xlabel('monthly rainfall (mm)')
_ = plt.ylabel('ECDF')
plt.show()
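One possible way to fill in the blanks is sketched below. It is not the official solution. The two helper functions are assumed forms of the ones named in the exercise, and the rain_june and rain_november arrays here are synthetic gamma-distributed stand-ins, because the real Sheffield data are not included on this page.

```python
import numpy as np
import matplotlib.pyplot as plt


def ecdf(data):
    """Assumed ECDF helper: sorted data and cumulative fractions."""
    n = len(data)
    return np.sort(data), np.arange(1, n + 1) / n


def permutation_sample(data_1, data_2):
    """Permutation sample, following the definition quoted in the exercise."""
    permuted_data = np.random.permutation(np.concatenate((data_1, data_2)))
    return permuted_data[:len(data_1)], permuted_data[len(data_1):]


# SYNTHETIC stand-in data (assumption): not the Sheffield measurements
np.random.seed(42)
rain_june = np.random.gamma(2.0, 25.0, size=130)
rain_november = np.random.gamma(4.0, 20.0, size=130)

for _ in range(50):
    # Generate permutation samples
    perm_sample_1, perm_sample_2 = permutation_sample(rain_june, rain_november)

    # Compute ECDFs
    x_1, y_1 = ecdf(perm_sample_1)
    x_2, y_2 = ecdf(perm_sample_2)

    # Plot ECDFs of permutation sample as faint dots
    _ = plt.plot(x_1, y_1, marker='.', linestyle='none',
                 color='red', alpha=0.02)
    _ = plt.plot(x_2, y_2, marker='.', linestyle='none',
                 color='blue', alpha=0.02)

# Create and plot ECDFs from original data
x_1, y_1 = ecdf(rain_june)
x_2, y_2 = ecdf(rain_november)
_ = plt.plot(x_1, y_1, marker='.', linestyle='none', color='red')
_ = plt.plot(x_2, y_2, marker='.', linestyle='none', color='blue')

# Label axes and set margin
plt.margins(0.02)
_ = plt.xlabel('monthly rainfall (mm)')
_ = plt.ylabel('ECDF')
```

Calling plt.show() afterwards displays the figure. The 50 faint permutation ECDFs overlap into a band showing what the ECDFs could look like if both months were identically distributed. If the solid red and blue ECDFs of the original data fall outside that band, that suggests the two distributions differ.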