Calculate the variance manually
As a reminder, we use the following process to calculate the sample variance:
- Calculate the sample mean
- Calculate the squared difference between each data point and the sample mean
- Sum these squared differences (i.e. compute the sum of squares)
- Divide the sum of squares by \(N-1\) (i.e. the sample size minus 1)
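The steps above can be written as one formula. The notation is assumed here and does not come from the exercise: \(x_i\) is the value of data point \(i\), \(\bar{x}\) is the sample mean, and \(s^2\) is the sample variance.

\[
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
s^2 = \frac{1}{N-1}\sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2
\]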
Let's calculate the sample variance of Michael Jordan's points per game!
This exercise is part of the course Intro to Statistics with R: Introduction.
Exercise instructions
The dataset `data_jordan` is loaded into your workspace.
- Calculate the mean points per game and save the result to `mean_ppg`.
- Subtract the mean points per game from the vector of points scored in each game and assign the result to `diff`.
- Square this vector of differences and save to `squared_diff`.
- Calculate the sample variance by summing the values in `squared_diff` with `sum()` and dividing by the sample size minus 1, using `length()` to count the number of games in the sample. Just print the result without saving it.
- Check your result by calculating the variance with R's built-in `var()` function.
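Before working with `data_jordan`, it may help to try the same steps on a small made-up vector. The sketch below is only an illustration: the vector `ppg` and its values are hypothetical and are not taken from the dataset, and the result is stored in `manual_var` (a name assumed here) purely so that it can be compared with `var()`.

```r
# Hypothetical points-per-game values (NOT from data_jordan)
ppg <- c(20, 30, 25, 35, 40)

# Step 1: sample mean
mean_ppg <- mean(ppg)

# Step 2: deviation of each game from the mean
diff <- ppg - mean_ppg

# Step 3: squared deviations
squared_diff <- diff^2

# Step 4: sum of squares divided by the sample size minus 1
manual_var <- sum(squared_diff) / (length(ppg) - 1)

# Print the manual result and the built-in result side by side
manual_var
var(ppg)

# Confirm that the two agree (up to floating-point tolerance)
all.equal(manual_var, var(ppg))
```

For these five values the mean is 30, the squared deviations sum to 250, and dividing by 4 gives a sample variance of 62.5, which is what `var()` returns as well.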
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
## The dataset `data_jordan` is already loaded
# Calculate mean points per game
mean_ppg <- ___
# Calculate deviations from mean
diff <- ___
# Calculate squared deviations
squared_diff <- ___
# Combine everything to compute sample variance
# Compare with the result of var()