Train variation
Time to work with US monthly train ridership data! Let's begin by exploring the variation in monthly train ridership. To understand a dataset beyond averages, the variance is an extremely useful statistic. As the name suggests, it gives a sense of the variation that exists in the data. That is, how far the data points typically lie from the mean, measured as the average squared distance.
As a reminder, to calculate the variance of a population:
- Calculate the mean of the entire dataset.
- Subtract the mean from each value.
- Square the differences to ensure positive and negative values don't cancel each other out.
- Take the average of the squared differences.
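Written as a formula, the steps above amount to the standard population variance. The symbols are notation chosen here for illustration, not taken from the exercise: x_i is each monthly ridership value, N is the number of values, and mu is their mean.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2
```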
To fully understand variance, in this exercise you will first follow the above steps to calculate the variance manually and then use the VARP() function to calculate it automatically.
This exercise is part of the course Introduction to Statistics in Google Sheets.
Exercise instructions
- In cell D2, calculate the difference between B2 and the mean train ridership (AVERAGE($B$2:$B$160)). Do the same for the rest of the column.
- In column E, square each of the differences in column D using ^2.
- Calculate the variance by calculating the mean of E2:E160 in F2.
- Use VARP() on B2:B160 to concisely calculate the variance.
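For readers who want to check the logic outside the spreadsheet, the same column-by-column calculation can be sketched in Python. This is only an illustration under stated assumptions: the five ridership values are made up (they are not the course data), and the standard library's statistics.pvariance stands in for VARP(), since both compute the population variance.

```python
import statistics

# Hypothetical ridership values standing in for B2:B160 (not the course data)
ridership = [120.0, 135.0, 128.0, 150.0, 142.0]

# Equivalent of AVERAGE($B$2:$B$160)
mean = statistics.fmean(ridership)

# Column D: difference between each value and the mean
differences = [value - mean for value in ridership]

# Column E: squared differences (the ^2 step)
squared_differences = [d ** 2 for d in differences]

# Cell F2: mean of the squared differences, i.e. the manual variance
manual_variance = statistics.fmean(squared_differences)

# Counterpart of VARP(): population variance computed in one call
builtin_variance = statistics.pvariance(ridership)

print(manual_variance, builtin_variance)
```

Both results should agree up to floating-point rounding, which is the same check the exercise asks for when comparing F2 with the VARP() result.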
