Creating a sampling distribution
1. Creating a sampling distribution
You just saw how point estimates like the sample mean vary depending on which rows end up in the sample.
2. Same code, different answer
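A minimal sketch of this idea in base R. The `coffee` data frame here is a simulated stand-in (an assumption, not the course dataset), and the names `total_cup_points` and `sample_mean_30` are illustrative only.

```r
# Simulated stand-in for the coffee data (assumption: 1000 coffees with
# cup points centered near 82; these numbers are arbitrary, not real data).
set.seed(2024)
coffee <- data.frame(total_cup_points = rnorm(1000, mean = 82, sd = 2.7))

# Draw a simple random sample of 30 rows and return the mean cup points.
sample_mean_30 <- function() {
  rows <- sample(nrow(coffee), size = 30)
  mean(coffee$total_cup_points[rows])
}

# Calling the same code twice draws two different samples.
first_mean <- sample_mean_30()
second_mean <- sample_mean_30()
print(c(first_mean, second_mean))
```

Because each call draws a fresh random sample, the two printed means should be close to each other but not identical.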
For example, the same code that calculates the mean cup points from a simple random sample of thirty coffees gives a slightly different answer each time it runs. Let's try to visualize and quantify this variation.
3. Same code, 1000 times
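As a sketch of repeating the calculation a thousand times, the block below uses base R's `replicate()`. The `cup_points` vector is simulated for illustration, and `mean_cup_points_1000` is a hypothetical name.

```r
# Simulated stand-in for the cup points column (assumption, not real data).
set.seed(2024)
cup_points <- rnorm(1000, mean = 82, sd = 2.7)

# n is the number of repetitions; expr is the code to run each time.
# Each run samples 30 values without replacement and takes their mean.
mean_cup_points_1000 <- replicate(
  n = 1000,
  expr = mean(sample(cup_points, size = 30))
)

# The result is a numeric vector with one sample mean per repetition.
length(mean_cup_points_1000)
head(mean_cup_points_1000)
```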
Base R's replicate function lets you run the same code multiple times. It's especially useful in situations like this, where the result contains some randomness. The first argument, n, is the number of times to run the code, and the second argument, expr, is the code to run. Each time the code is run, we get one sample mean, so running the code a thousand times gives us a vector of a thousand sample means.
4. Preparing for plotting
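A sketch of this preparation step, again on simulated data. The column name `sample_mean` is an assumption chosen for illustration.

```r
# Simulated stand-in for the cup points column (assumption, not real data).
set.seed(2024)
cup_points <- rnorm(1000, mean = 82, sd = 2.7)

# A vector of 1000 sample means, each from a sample of size 30.
mean_cup_points_1000 <- replicate(
  n = 1000,
  expr = mean(sample(cup_points, size = 30))
)

# Wrap the vector in a data frame with one column so ggplot can use it.
sample_means <- data.frame(sample_mean = mean_cup_points_1000)
str(sample_means)
```

If the tibble package is loaded, `tibble(sample_mean = mean_cup_points_1000)` would serve the same purpose.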
To use our results with ggplot, we need to put them inside a data frame or tibble.
5. Distribution of sample means for size 30
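One way such a histogram might be drawn is sketched below. It assumes the ggplot2 package is installed; the simulated data, the object names, and the bin width of 0.1 are all illustrative choices rather than the course's settings.

```r
library(ggplot2)

# Simulated stand-in for the cup points column (assumption, not real data).
set.seed(2024)
cup_points <- rnorm(1000, mean = 82, sd = 2.7)

# 1000 sample means from samples of size 30, stored in a data frame.
sample_means <- data.frame(
  sample_mean = replicate(n = 1000, expr = mean(sample(cup_points, size = 30)))
)

# Histogram of the sample means; binwidth is an arbitrary choice.
p <- ggplot(sample_means, aes(x = sample_mean)) +
  geom_histogram(binwidth = 0.1)
print(p)
```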
Our thousand sample means form a distribution of sample means. To visualize a distribution, the best plot is often a histogram. Here you can see that most of the results lie between eighty-one and eighty-three, and they follow roughly a bell curve shape, like a normal distribution. There's an important piece of jargon you need to know here. The distribution of a point estimate, such as the sample mean, across many replicated samples is known as a sampling distribution.
6. Different sample sizes
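To put a number on how sample size affects the spread, the sketch below repeats the replication for sizes 6, 30, and 150 and reports the standard deviation of each set of sample means. The data are simulated, and `spread_by_size` is a hypothetical name.

```r
# Simulated stand-in for the cup points column (assumption, not real data).
set.seed(2024)
cup_points <- rnorm(1000, mean = 82, sd = 2.7)

sample_sizes <- c(6, 30, 150)

# For each sample size, build 1000 sample means and measure their spread.
spread_by_size <- sapply(sample_sizes, function(size) {
  means <- replicate(n = 1000, expr = mean(sample(cup_points, size = size)))
  sd(means)
})
names(spread_by_size) <- sample_sizes

# Smaller standard deviations indicate a narrower sampling distribution.
print(spread_by_size)
```

The standard deviation of the sample means should shrink as the sample size grows, which matches the narrowing histograms.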
Here are histograms from running the same code again with different sample sizes. When we decrease the sample size by a factor of five, to six, you can see that the range of the results is broader. The bulk of the results now lie between eighty and eighty-four. On the other hand, increasing the sample size by a factor of five, to one hundred and fifty, results in a narrower range. Now most of the results are between eighty-one point seven and eighty-two point six. As you saw previously, bigger sample sizes give you more accurate results. By replicating the sampling many times, as we have done here, you can quantify that accuracy.
7. Let's practice!
Are you ready to replicate?