Motivations

1. Motivation

Hi, I'm Eric Eager. I am an applied mathematician and data scientist. In this course, you will dive into one of the more important collections of mathematical tools for data science - linear algebra. Linear algebra is the study of linear operations on mathematical objects such as vectors, matrices, and tensors, where data generally lives.

2. Data - The Atom of Data Science

Data needs to be stored somewhere, and many of the objects in which we store data (spreadsheets, tables, data frames) can be broken down into the main objects studied in linear algebra - vectors and matrices. Here I'm looking at the first few rows of a dataset that has athletic data for players entering the National Football League draft. This dataset has one case in each row (a player, with name omitted) and one feature (or variable, like height) in each column.

3. Vectors - Storing Univariate Data

The most basic nontrivial object of linear algebra is a vector, which is an n-dimensional collection of elements (usually numbers). Examples of both column and row vectors can be found above in x and y, respectively. The arrows above x and y signify that they are vectors. The transpose of a column vector is simply the same vector made into a row vector, and vice versa, with a superscript T denoting transposition.
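As a minimal sketch of transposition in R: a plain R vector has no row or column orientation, so this example assumes the vectors are stored as one-column and one-row matrices. The values are made up for illustration and are not taken from the slide.

```r
# Column vector stored as a 3 x 1 matrix (illustrative values)
x <- matrix(c(1, 2, 3), ncol = 1)

# t() transposes: the column vector becomes a 1 x 3 row vector
x_t <- t(x)

dim(x)    # 3 1
dim(x_t)  # 1 3

# Transposing twice returns a vector with the original shape and entries
x_back <- t(x_t)
```

Calling t() directly on a plain vector such as c(1, 2, 3) also works, and R then returns a 1 by 3 matrix.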

4. Vectors - Storing Univariate Data

Aside from reading in lists of data from external sources, you can create vectors by hand in a few different ways. One way is to repeat the same element a number of times using the "rep" function. Here, the vector x is assigned four 1's by the rep(1, 4) command. Another way is to create a vector that has a pattern to it. Here, the vector y is assigned the numbers 2, 4, 6, and 8 by the seq(2, 8, by = 2) command. You can also simply put all of the elements you want into a vector using the "c" command (for concatenate). This is not practical for large problems, but will be useful as you learn linear algebra here. Here, z is given the numbers 1, 5, -2, and 4 using the c(1, 5, -2, 4) command. Lastly, elements of a vector can be changed individually by selecting an index and using the assignment arrow. For example, to change the third element of z to 7, simply write z[3] <- 7.
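The commands described above can be collected into one short R session. This is just a sketch that repeats the calls from the narration, with comments showing the resulting values.

```r
# Repeat the same element: four 1's
x <- rep(1, 4)          # 1 1 1 1

# Patterned sequence from 2 to 8 in steps of 2
y <- seq(2, 8, by = 2)  # 2 4 6 8

# Concatenate individual elements
z <- c(1, 5, -2, 4)     # 1 5 -2 4

# Change the third element of z (R indexes from 1)
z[3] <- 7               # z is now 1 5 7 4
```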

5. Matrices - Storing Tables of Data

Matrices are simply the superimposition of vectors with other vectors. An m by n matrix can be thought of as the superimposition of n m-dimensional column vectors or m n-dimensional row vectors. For example, the matrix A above is a 5 by 2 matrix.
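One way to see this in R is to bind vectors together with cbind (column-wise) or rbind (row-wise). The sketch below builds a 5 by 2 matrix both ways. The names v1, v2, and A_rows and all of the values are assumptions for illustration, not the matrix A from the slide.

```r
# Two 5-dimensional column vectors (illustrative values)
v1 <- c(1, 2, 3, 4, 5)
v2 <- c(6, 7, 8, 9, 10)

# Place them side by side as columns: a 5 x 2 matrix
A <- cbind(v1, v2)
dim(A)  # 5 2

# The same entries built from five 2-dimensional row vectors
A_rows <- rbind(c(1, 6), c(2, 7), c(3, 8), c(4, 9), c(5, 10))
dim(A_rows)  # 5 2

# Both constructions hold the same entries
same_entries <- all(A == A_rows)
```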

6. Matrices - Storing Tables of Data

Many data frames can be viewed as matrices, with cases acting as rows and features acting as columns. In this dataset you can explicitly see cases A through F living in the first five rows, while variables 1 and 2 live in the second and third columns.
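A small sketch of moving from a data frame to a matrix in R, assuming a hypothetical data frame with a case label in the first column and two numeric variables in the second and third columns. The column names and values are invented for illustration and do not come from the dataset on the slide.

```r
# Hypothetical data frame: one case per row, one variable per column
df <- data.frame(
  case       = c("A", "B", "C"),
  variable_1 = c(1.5, 2.0, 0.5),
  variable_2 = c(10, 20, 30)
)

# Keep only the numeric columns and convert them to a matrix
M <- as.matrix(df[, c("variable_1", "variable_2")])

# Use the case labels as row names so each row stays identifiable
rownames(M) <- df$case

dim(M)  # 3 2
```

Dropping the label column before converting matters: as.matrix applied to a data frame that mixes character and numeric columns returns a character matrix.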

7. Matrices - Storing Tables of Data

You can create matrices by hand in a few different ways. One way is to repeat the same element for a certain number of rows and columns using the "matrix" function. In this case, the first argument is the element to repeat, the second is the number of rows, and the third is the number of columns. Here, the 3 by 2 matrix of all 2's is made via the matrix(2, 3, 2) command. Also, as with vectors, you can create matrices element-by-element using the c command. It is best to specify the number of rows and columns in each case using the "nrow" and "ncol" arguments. Here a matrix is made using matrix(c(1, -1, 2, 3, 2, -2), nrow = 2, ncol = 3, byrow = TRUE). The "byrow" argument tells R to fill the matrix one row at a time. The default for the "byrow" argument is FALSE, which fills the matrix one column at a time. Lastly, as with vectors, you can manually change any element of a matrix using the assignment arrow. Here the row index is followed by a comma and the column index. For example, changing the 2, 1 element of a matrix A can be done with the command A[2, 1] <- 100.
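The matrix commands from this slide can be tried together as follows. This sketch reuses the calls from the narration. The names B and A_by_col are assumptions added here, with A_by_col included to contrast the default column-wise fill.

```r
# 3 x 2 matrix filled with 2's: matrix(data, nrow, ncol)
B <- matrix(2, 3, 2)

# Fill row by row: rows are (1, -1, 2) and (3, 2, -2)
A <- matrix(c(1, -1, 2, 3, 2, -2), nrow = 2, ncol = 3, byrow = TRUE)

# Same data with the default byrow = FALSE: filled column by column,
# so the rows become (1, 2, 2) and (-1, 3, -2)
A_by_col <- matrix(c(1, -1, 2, 3, 2, -2), nrow = 2, ncol = 3)

# Change the element in row 2, column 1
A[2, 1] <- 100
```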

8. Let's practice!

Time to put this into practice.
