From sample mean to population mean

1. From sample mean to population mean

Now we're going to study some patterns that we can observe in the sample mean when the sample size becomes larger. These patterns form the basis of the law of large numbers.

2. Sample mean review

Jakob Bernoulli developed the law of large numbers in his book Ars Conjectandi (1713). The law states that the sample mean tends to the expected value as the sample grows larger.

3. Sample mean review (Cont.)

For example, we calculate the sample mean of two values by adding the values and dividing by two.

4. Sample mean review (Cont.)

For three values, we add up the values and divide by three.

5. Sample mean review (Cont.)

If we have a sample of n values, we add the n values and divide by n.
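As a quick illustration of that rule, here is a minimal sketch in plain Python. The helper name sample_mean and the example values are made up for this illustration; they are not from the course.

```python
def sample_mean(values):
    """Add up the values and divide by how many there are."""
    return sum(values) / len(values)

# Hypothetical coin-flip outcomes (1 = heads, 0 = tails)
two_values = [1, 0]
three_values = [1, 0, 1]

print(sample_mean(two_values))    # (1 + 0) / 2
print(sample_mean(three_values))  # (1 + 0 + 1) / 3
```

The same function works for any n, because len(values) supplies the divisor.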

6. Sample mean review (Cont.)

As the sample becomes larger, the sample mean gets nearer to the population mean. Let's code a bit.

7. Generating the sample

To generate a sample of coin flips, we will use the binomial distribution. First we import the binom object and the describe function from scipy dot stats, then we generate the sample using binom dot rvs. We set n to 1, so each draw is a single coin flip, and p to the probability of success (0.5 for a fair coin). Then we set size to 250, the number of flips in the sample, and set random_state so we can reproduce our results. After that, we print the first 100 values of the sample.
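The step just described might look like the sketch below, assuming SciPy is installed. The seed value 42 is an assumption; the transcript only says that random_state is set.

```python
from scipy.stats import binom

# n=1: each draw is one coin flip; p=0.5: fair coin; size=250: number of flips.
# random_state=42 is an assumed seed, chosen only for reproducibility.
samples = binom.rvs(n=1, p=0.5, size=250, random_state=42)

# Show the first 100 flips (1 = success, 0 = failure)
print(samples[0:100])
```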

8. Calculating the sample mean

To calculate the sample mean, we pass the sample to describe and read the mean attribute of the result. We slice samples from 0 to 10, and we see that for the first 10 values the sample mean is 0.6. Now let's see what this process looks like with an animation.
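A minimal sketch of that calculation follows. The sample is regenerated here so the snippet runs on its own, and the seed 42 is again an assumption, so the printed value depends on that choice.

```python
from scipy.stats import binom, describe

# Regenerate the coin-flip sample (seed 42 is an assumption)
samples = binom.rvs(n=1, p=0.5, size=250, random_state=42)

# describe returns a result object; its mean attribute holds the sample mean
first_ten_mean = describe(samples[0:10]).mean
print(first_ten_mean)
```

The slice samples[0:10] covers indexes 0 through 9, that is, the first 10 flips.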

9. Sample mean of coin flips (Cont.)

In this animation you see how we take the sample mean of the first 2 values, then the first 3, and so on up to all 250, using the describe function. The red line represents the population mean, in this case 0.5, and the blue line is the sample mean. As you'll notice, due to the randomness of the data, the sample mean fluctuates around the population mean, but as more data becomes available, the sample mean approaches the population mean. Let's see another example with the normal distribution.

10. Sample mean of normal distribution

Now we have three animated plots. At the top left we have our sample data from a normal distribution. We use one dot for each sample. At the top right we've plotted a histogram of the sample data, and at the bottom we've plotted the sample mean. In all the plots the population mean is represented with a black line and the sample mean is drawn using a red line. You can see how the red line moves and gets nearer to the population mean as more data becomes available. Enjoy the animations for a bit, and get some perspective. Now let's move on and learn how to plot the sample mean with Python.

11. Plotting the sample mean

First we import the binom object and describe from scipy dot stats, along with matplotlib dot pyplot as plt. Then we initialize the variables, setting coin_flips to 1, p to 0.5, sample_size to 1000, and averages to an empty list.

12. Plotting the sample mean (Cont.)

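Next, for each index i, we use describe to calculate the sample mean of the first i values, that is, the slice of the sample from 0 to i. The index i runs from 2 up to sample_size; the upper bound of the range is sample_size plus 1, which is excluded. We store each result in the averages list using append, then we print the first 10 values.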
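Here is one way the loop could be written, as a sketch. The transcript does not show how the sample is generated on this slide, so the binom.rvs call and the seed 42 are assumptions carried over from the earlier coin-flip example.

```python
from scipy.stats import binom, describe

# Initialize the variables named in the transcript
coin_flips, p, sample_size, averages = 1, 0.5, 1000, []

# Assumed step: generate the sample as in the earlier coin-flip example
samples = binom.rvs(n=coin_flips, p=p, size=sample_size, random_state=42)

# For i = 2, 3, ..., sample_size, store the mean of the first i values
for i in range(2, sample_size + 1):
    averages.append(describe(samples[0:i]).mean)

print(averages[0:10])
```

Starting at 2 rather than 1 means the list holds sample_size minus 1 averages, and the last one is the mean of the whole sample.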

13. Plotting the sample mean (Cont.)

We add a red line with plt dot axhline at the population mean and plot the averages. Then we add a legend in the upper-right corner and show our plot.
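The plotting step could be sketched as follows. To keep the snippet short and self-contained, the running averages are computed here with NumPy's mean on each slice instead of describe; the seed 42, the use of binom.mean for the population mean, and the label keyword arguments are likewise assumptions made for this illustration.

```python
import matplotlib.pyplot as plt
from scipy.stats import binom

coin_flips, p, sample_size = 1, 0.5, 1000

# Assumed sample and running averages (slice mean used in place of describe)
samples = binom.rvs(n=coin_flips, p=p, size=sample_size, random_state=42)
averages = [samples[0:i].mean() for i in range(2, sample_size + 1)]

# Population mean of the binomial distribution: n * p
population_mean = binom.mean(n=coin_flips, p=p)

# Red horizontal line at the population mean, then the sample means
plt.axhline(population_mean, color="red", label="Population mean")
plt.plot(averages, "-", label="Sample mean")

# Legend in the upper-right corner
plt.legend(loc="upper right")
plt.show()
```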

14. Sample mean plot

The result is this beautiful plot that shows the law of large numbers in action.

15. Let's practice!

Let's get some hands-on practice with the law of large numbers.