Standardizing Your Data
In the lecture, we saw that you can learn a lot about a dataset by creating the matrix \(A^TA\) from it. In this exercise, you'll do that with athletic data for players entering the National Football League college draft. The dataset combine is loaded for you.
This exercise is part of the course
Linear Algebra for Data Science in R
Exercise instructions
- Extract only the numerical elements of the data frame by taking only the 5th through 12th columns. Call this A (we cannot do math on the non-numerical components in columns 1 through 4).
- Turn this data frame into a matrix by using the as.matrix() command.
- Subtract the mean of each of the columns of the matrix.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Extract columns 5-12 of combine
A <- combine[, ___:___]
# Make A into a matrix
A <- ___(A)
# Subtract the mean of each column
A[, ___] <- A[, 1] - mean(A[, 1])
A[, 2] <- A[, 2] - ___(A[, 2])
A[, ___] <- A[, 3] - mean(A[, 3])
A[, ___] <- A[, ___] - mean(A[, 4])
A[, 5] <- A[, 5] - mean(A[, 5])
A[, ___] <- A[, 6] - mean(A[, ___])
A[, 7] <- A[, ___] - mean(A[, 7])
A[, ___] <- A[, 8] - mean(A[, 8])
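
For comparison, here is a minimal sketch of the same column-centering idea on a small made-up data frame. It is not the exercise's intended answer. The data frame toy, its column names, and its values are all hypothetical stand-ins for combine, and sweep() is used here in place of the column-by-column subtraction above. The last lines illustrate one reason centering is useful before forming \(A^TA\): for a centered matrix, \(A^TA / (n - 1)\) matches the sample covariance matrix returned by cov().

```r
# Hypothetical toy data frame standing in for combine (all values are made up)
toy <- data.frame(
  player   = c("a", "b", "c", "d"),
  position = c("QB", "WR", "RB", "TE"),
  height   = c(74, 71, 69, 77),
  weight   = c(220, 195, 210, 250),
  forty    = c(4.8, 4.4, 4.5, 4.7)
)

# Keep only the numeric columns (3 through 5 here) and convert to a matrix
B <- as.matrix(toy[, 3:5])

# Subtract each column's mean from every entry in that column
B_centered <- sweep(B, 2, colMeans(B))

# Each centered column should now average to zero, up to rounding error
print(round(colMeans(B_centered), 10))

# scale() with scale = FALSE performs the same centering
same_as_scale <- all.equal(
  B_centered,
  scale(B, center = TRUE, scale = FALSE),
  check.attributes = FALSE
)

# For centered data, t(B) %*% B / (n - 1) is the sample covariance matrix
cov_from_crossprod <- crossprod(B_centered) / (nrow(B_centered) - 1)
same_as_cov <- all.equal(cov_from_crossprod, cov(B))
```

Both same_as_scale and same_as_cov should evaluate to TRUE. Writing out one subtraction per column, as the exercise does, makes each step explicit; sweep() or scale() is simply a more compact way to do the same thing when there are many columns.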