Visualizing the P-Value
In this exercise, you will visualize the p-value: the chance that the effect (or "speed") we estimated was the result of random variation in the sample. Your goal is to visualize this as the fraction of points in the shuffled test statistic distribution that fall to the right of the mean of the unshuffled test statistic (the "effect size").
To get you started, we've preloaded group_duration_short and group_duration_long, along with the functions compute_test_statistic(), shuffle_and_split(), and plot_test_statistic_effect().
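To make the "fraction of points to the right" idea concrete, here is a minimal sketch on simulated numbers. The array shuffled_stats and the value effect_size below are hypothetical stand-ins, not the course data; the sketch only shows how a boolean mask turns into a proportion.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical stand-ins (not the course data): a shuffled ("null")
# distribution centered at zero and an assumed observed effect size
shuffled_stats = rng.normal(loc=0.0, scale=1.0, size=10_000)
effect_size = 2.0

# Boolean mask: True where a shuffled value is at least as large as the effect
condition = shuffled_stats >= effect_size

# The fraction of True entries is the estimated p-value
p_value = len(shuffled_stats[condition]) / len(shuffled_stats)
print("The p-value is = {}".format(p_value))
```

Taking condition.mean() would give the same number, since the mean of a boolean array is the fraction of True values.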
This exercise is part of the course
Introduction to Linear Modeling in Python
Exercise instructions
- Use compute_test_statistic() to get test_statistic_unshuffled from group_duration_short and group_duration_long; then use np.mean() to compute the effect size.
- Use shuffle_and_split() to create shuffled_half1 and shuffled_half2, and use compute_test_statistic() to compute test_statistic_shuffled.
- Create a boolean mask condition that is True where test_statistic_shuffled values are greater than or equal to effect_size, then use this mask to compute the p_value.
- Print the p_value and plot both test statistics using plot_test_statistic_effect().
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Compute the test stat distribution and effect size for two population groups
test_statistic_unshuffled = compute_test_statistic(____, ____)
effect_size = np.mean(____)
# Randomize the two populations, and recompute the test stat distribution
shuffled_half1, ____ = shuffle_and_split(group_duration_short, ____)
test_statistic_shuffled = compute_test_statistic(shuffled_half1, ____)
# Compute the p-value as the proportion of shuffled test stat values >= the effect size
condition = ____ >= ____
p_value = len(test_statistic_shuffled[____]) / len(test_statistic_shuffled)
# Print p-value and overplot the shuffled and unshuffled test statistic distributions
print("The p-value is = {}".format(____))
fig = plot_test_stats_and_pvalue(test_statistic_unshuffled, test_statistic_shuffled)
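The preloaded helpers are not shown in this exercise, so the sketch below uses hypothetical stand-ins to illustrate the overall shuffle-and-compare workflow outside the course environment. The bodies of compute_test_statistic() and shuffle_and_split() here are assumptions (resampled pairwise differences, and a pooled shuffle split in half), and the two groups are simulated; the course's own implementations and data may differ.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def compute_test_statistic(sample1, sample2, num_resamples=1000):
    """Assumed stand-in: differences between resampled values of two groups."""
    resample1 = rng.choice(sample1, size=num_resamples, replace=True)
    resample2 = rng.choice(sample2, size=num_resamples, replace=True)
    return resample2 - resample1

def shuffle_and_split(sample1, sample2):
    """Assumed stand-in: pool both groups, shuffle, and split into two halves."""
    pooled = np.concatenate([sample1, sample2])
    shuffled = rng.permutation(pooled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Simulated groups (hypothetical, not the course data)
group_a = rng.normal(loc=5.0, scale=1.0, size=500)
group_b = rng.normal(loc=9.0, scale=1.0, size=500)

# Test statistic distribution and effect size for the original grouping
stat_unshuffled = compute_test_statistic(group_a, group_b)
observed_effect = np.mean(stat_unshuffled)

# Shuffling destroys the group labels, giving a distribution under "no effect"
half1, half2 = shuffle_and_split(group_a, group_b)
stat_shuffled = compute_test_statistic(half1, half2)

# Proportion of shuffled values at least as large as the observed effect
mask = stat_shuffled >= observed_effect
estimated_p = len(stat_shuffled[mask]) / len(stat_shuffled)
print("Estimated p-value: {}".format(estimated_p))
```

The value printed depends heavily on how the test statistic is defined and on the spread of the data, so it should not be read as a prediction of the result the exercise produces.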