Sampling in Google Sheets

Thus far, you've calculated statistics using all of the data presented. Usually, it's more practical to draw a sample from a population.

  • Population: An entire collection of observations or events
  • Sample: A subset from a population meant to represent the population

There are many ways to sample data. Sampling choices affect the statistics and, as a result, what you learn about the population. A popular method is to sample randomly.

This data has an extra column of randomly generated numbers created with the RAND() function, which takes no arguments and returns a number between 0 (inclusive) and 1 (exclusive). Once this second column has been created, sort the data by the random column to shuffle the rows. Then, when building visuals or statistics from the sample, you control the sample through the cell range you select. In this exercise, you will randomly sample 20 observations of monthly US airline passenger miles and then create a histogram of the sample to understand its distribution.
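
For readers who want to see the same idea outside a spreadsheet, here is a minimal Python sketch of the random-key approach described above: attach a random number to each observation, sort by it, and keep the first 20 rows. The function name `random_sample`, the seed, and the stand-in `population` values are assumptions made for illustration; they are not the course's data or code.

```python
import random
import statistics


def random_sample(values, n, seed=None):
    """Pair each value with a random key, sort by the key, keep the first n."""
    rng = random.Random(seed)
    # Like RAND(): each key is a float in [0, 1)
    keyed = [(rng.random(), value) for value in values]
    # Sorting by the random key shuffles the rows
    keyed.sort(key=lambda pair: pair[0])
    # Taking the first n rows is the equivalent of selecting A2:A21
    return [value for _, value in keyed[:n]]


# Hypothetical stand-in data, not the airline passenger miles from the exercise
population = [float(x) for x in range(100, 200)]

sample = random_sample(population, 20, seed=42)
sample_mean = statistics.mean(sample)   # mean of the 20 sampled values
sample_sd = statistics.stdev(sample)    # sample standard deviation

print(f"mean = {sample_mean:.2f}, standard deviation = {sample_sd:.2f}")
```

Because every row receives an independent random key, each observation has the same chance of landing in the first 20 rows, which is what makes this a simple random sample without replacement.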

This exercise is part of the course

Introduction to Statistics in Google Sheets

Exercise instructions

  • In cell E2, calculate the mean of the first 20 observations.
  • In cell F2, calculate the standard deviation of the same 20 observations.
  • Insert a histogram of A2:A21 in the highlighted region. Examine how "normal" this visual looks.
