Test Statistics and Effect Size

How can we explore linear relationships with bootstrap resampling? Back to the trail! With each hike plotted as one point, we can see a linear relationship between total distance traveled and time elapsed. If we treat the distance traveled as an "effect" of time elapsed, then we can explore the underlying connection between linear regression and statistical inference.

In this exercise, you will separate the data into two populations, or "categories": early times and late times. Then you will look at the differences between the total distances traveled in the two populations. This difference will serve as a "test statistic", and its distribution will test the effect of separating distances by times.

Use NumPy "logical indexing", e.g. sample_distances[sample_times < 5], to separate the sample distances into early and late time populations.
Use np.random.choice() with replace=True to create a resample for each of the two time bins.
Compute the test_statistic array as resample_long - resample_short, then find and print the effect size and its uncertainty with np.mean() and np.std().
Plot the test_statistic distribution, using the predefined fig = plot_test_statistic().
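
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the exercise preloads sample_times and sample_distances, so the arrays below are synthetic stand-ins (the slope, noise level, cutoff of 5, and resample size of 500 are all made up for the example), and the plotting step is left out because plot_test_statistic() is defined only in the exercise environment.

```python
import numpy as np

np.random.seed(42)  # fixed seed so the sketch is repeatable

# ASSUMPTION: synthetic stand-in data; the real exercise preloads these arrays.
# Times in hours, distances roughly linear in time plus noise.
sample_times = np.random.uniform(0, 10, size=1000)
sample_distances = 2.0 * sample_times + np.random.normal(0, 1.0, size=1000)

# Logical indexing: split distances into early-time and late-time populations
early_distances = sample_distances[sample_times < 5]
late_distances = sample_distances[sample_times >= 5]

# Resample each population with replacement, using equal sizes
# so the two resamples can be subtracted element by element
num_resamples = 500  # ASSUMPTION: illustrative size
resample_short = np.random.choice(early_distances, size=num_resamples, replace=True)
resample_long = np.random.choice(late_distances, size=num_resamples, replace=True)

# Test statistic: difference in distance between late and early resamples
test_statistic = resample_long - resample_short

# Effect size is the mean difference; its spread gives the uncertainty
effect_size = np.mean(test_statistic)
uncertainty = np.std(test_statistic)
print("Effect size: {:.2f} +/- {:.2f}".format(effect_size, uncertainty))
```

If time had no effect on distance, the test statistic would be centered near zero; a mean difference that is large compared with its spread suggests the split by time matters.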
