Visualizing the P-Value
In this exercise, you will visualize the p-value: the chance that the effect (or "speed") we estimated was the result of random variation in the sample. Your goal is to visualize this as the fraction of points in the shuffled test statistic distribution that fall to the right of the mean of the test statistic ("effect size") computed from the unshuffled samples.
To get you started, we've preloaded the samples group_duration_short and group_duration_long, and the functions compute_test_statistic(), shuffle_and_split(), and plot_test_statistic_effect().
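The "fraction of points to the right" idea can be illustrated with a few hand-picked numbers. This is only a minimal sketch: the values and the names shuffled_values, observed_effect, and at_least_as_large are made up for illustration and are not part of the preloaded exercise data.

```python
import numpy as np

# Hypothetical values standing in for a shuffled test statistic distribution
shuffled_values = np.array([-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 0.9, 1.2])
# Hypothetical effect size computed from the unshuffled samples
observed_effect = 0.5

# Boolean mask: True where a shuffled value is at least as large as the effect
at_least_as_large = shuffled_values >= observed_effect

# The mean of a boolean mask is the fraction of True entries
p_value_example = at_least_as_large.mean()
print(p_value_example)  # 3 of 8 values qualify, so this prints 0.375
```

Taking the mean of the mask gives the same number as dividing the count of selected values by the total count, which is the form used in the exercise template.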
This exercise is part of the course
Introduction to Linear Modeling in Python
Instructions
- Use compute_test_statistic() to get test_statistic_unshuffled from the group_duration_short and group_duration_long; then use np.mean() to compute effect size.
- Use shuffle_and_split() to create shuffle_half1 and shuffle_half2, and use compute_test_statistic() to compute the test_statistic_shuffled.
- Create a boolean mask condition where test_statistic_shuffled values are greater than or equal to effect_size, then use this mask to compute the p_value.
- Print the p_value and plot both test statistics using plot_test_statistic_effect().
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Compute the test stat distribution and effect size for two population groups
test_statistic_unshuffled = compute_test_statistic(____, ____)
effect_size = np.mean(____)
# Randomize the two populations, and recompute the test stat distribution
shuffled_half1, ____ = shuffle_and_split(group_duration_short, ____)
test_statistic_shuffled = compute_test_statistic(shuffled_half1, ____)
# Compute the p-value as the proportion of shuffled test stat values >= the effect size
condition = ____ >= ____
p_value = len(test_statistic_shuffled[____]) / len(test_statistic_shuffled)
# Print p-value and overplot the shuffled and unshuffled test statistic distributions
print("The p-value is = {}".format(____))
fig = plot_test_stats_and_pvalue(test_statistic_unshuffled, test_statistic_shuffled)
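For readers working outside the exercise environment, the same sequence of steps can be tried on synthetic data. The sketch below is an assumption-laden stand-in, not the course's code: compute_test_statistic_sketch() and shuffle_and_split_sketch() are hypothetical guesses at what the preloaded helpers might do (resample and difference; pool, shuffle, and split), the two groups are randomly generated, and the plotting step is omitted.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def compute_test_statistic_sketch(sample1, sample2, num_draws=500):
    """Hypothetical helper: resample each group with replacement and
    return the element-wise differences (sample2 minus sample1)."""
    resample1 = rng.choice(sample1, size=num_draws, replace=True)
    resample2 = rng.choice(sample2, size=num_draws, replace=True)
    return resample2 - resample1

def shuffle_and_split_sketch(sample1, sample2):
    """Hypothetical helper: pool both groups, shuffle the pool,
    and split it back into two halves."""
    pooled = rng.permutation(np.concatenate([sample1, sample2]))
    half = len(pooled) // 2
    return pooled[:half], pooled[half:]

# Synthetic stand-in data: the "long" group is shifted upward on purpose
short_group = rng.normal(loc=5.0, scale=2.0, size=1000)
long_group = rng.normal(loc=8.0, scale=2.0, size=1000)

# Test statistic distribution and effect size from the unshuffled groups
unshuffled = compute_test_statistic_sketch(short_group, long_group)
effect = np.mean(unshuffled)

# Shuffle away the group labels, then recompute the test statistic
half1, half2 = shuffle_and_split_sketch(short_group, long_group)
shuffled = compute_test_statistic_sketch(half1, half2)

# Fraction of shuffled values at least as large as the effect size
mask = shuffled >= effect
p_val = len(shuffled[mask]) / len(shuffled)
print("The p-value is = {}".format(p_val))
```

Because the shuffled halves mix both groups, their differences should center near zero, so the fraction printed here measures how often label-free data produces a difference as large as the observed mean difference.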