1. What makes a model linear
We have explored linear relationships through motivating examples, data visualization, and descriptive statistics.
In this chapter, we will build our own linear models.
We'll start by asking, "What does 'linear' mean?"
We'll build models from linear components and then see how to find the optimal parameters for those components, so that our model best fits our data.
2. Taylor Series
To "see" the model in terms of components, we start with the concept of a Taylor Series.
Taylor Series are so incredibly useful that they seem to pop up everywhere in math, science, and engineering.
For this course, there are 3 things you need to know about them:
First, they can be used to approximate almost any smooth curve.
Second, they are expressed as polynomials, meaning a series of terms summed together, with each term being the product of a coefficient and a power of x.
The terms are referred to by "order": for example, a0 is the "zero-th order" term, a1*x is the "first order" term, and a2 times x-squared is the "second order" term.
Third, in many applications, you only need the zero-th and first order terms to obtain a very good approximation, that is, y = a0 + a1*x.
Linear models can be thought of as "First Order" models, where we have ignored the "nonlinear terms", meaning the Second Order terms and higher.
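As a minimal sketch of that third point, the snippet below compares a first order and a second order approximation of exp(x) near x=0. The choice of exp(x) and the function names `first_order` and `second_order` are illustrative assumptions, not something taken from this course; a0=1, a1=1, a2=0.5 are the standard Taylor coefficients of exp(x) about zero.

```python
import numpy as np

# Illustrative assumption: approximate exp(x) near x = 0.
# Its Taylor coefficients about zero are a0 = 1, a1 = 1, a2 = 1/2.
a0, a1, a2 = 1.0, 1.0, 0.5

def first_order(x):
    """Zeroth plus first order terms only: a linear model."""
    return a0 + a1 * x

def second_order(x):
    """Adds the second order (nonlinear) term."""
    return a0 + a1 * x + a2 * x**2

x = 0.1
exact = np.exp(x)
error_first = abs(exact - first_order(x))
error_second = abs(exact - second_order(x))
print(f"first order error:  {error_first:.5f}")
print(f"second order error: {error_second:.5f}")
```

Close to x=0 the first order error is already small. It grows as x moves away from zero, which is where the higher order terms start to matter.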
3. Series Terms: a0=1
Let's see some examples of terms in a Taylor Series.
Case Zero: a0=1 and other terms are zero. The plot is a flat line `y=1`.
a0 is the intercept. At x=0, the line intercepts the y-axis at y=1.
4. Series Terms: a1=1
Case One: a0=0 and a1=1. The plot of "y = 0 + 1*x" is a slanted line with constant slope.
Here, a1 is the "slope", or "rise-over-run".
5. Series Terms: a2=1
Case Two: a0=0 and a1=0, but a2=1.
The plot is of "y = 0 + 0*x + 1*x**2".
This case is nonlinear because a2 is NOT zero.
6. Combining all Terms
Now imagine summing all 3 terms in the series. The result is the wide black line.
That line is y = 1 + 1*x + 1*x**2, where a0=1, a1=1, and a2=1.
By selecting the right combination of parameter values a0,a1,a2... we can build up almost any shape.
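The three cases and their sum can be sketched in a few lines. This is only an illustration: the function name `model` and the sample x values are assumptions made here, not part of the course material.

```python
import numpy as np

def model(x, a0, a1, a2):
    """Sum of the zeroth, first, and second order terms."""
    return a0 + a1 * x + a2 * x**2

# Hypothetical sample points, chosen only for illustration
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

case_zero = model(x, a0=1, a1=0, a2=0)  # flat line y = 1
case_one = model(x, a0=0, a1=1, a2=0)   # slanted line y = x
case_two = model(x, a0=0, a1=0, a2=1)   # nonlinear curve y = x**2
combined = model(x, a0=1, a1=1, a2=1)   # y = 1 + 1*x + 1*x**2

# The combined curve is the point-by-point sum of the three cases
print(np.allclose(combined, case_zero + case_one + case_two))
```

Changing a0, a1, and a2 in the last call is how you would explore other shapes.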
7. Real Data
Let's look at real data, in this case global average sea level data, and see if we can pick parameter values for a model to fit the data.
8. Zeroth Order
If we only use the zero-th order term, the line passes through the data, but for most years there is a large difference between model and data.
9. First Order
What if we use a First Order model, where both a0 and a1 are not zero?
This is a much better fit. Notice that we tried many values and found that a0=5 and a1=0.08 made the model fit the data well.
This is the challenge: finding the best parameter values to make the model fit the data.
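Trying many values by hand can be sketched as a simple grid search. Everything here is an assumption for illustration: the data is synthetic (generated around a0=5 and a1=0.08 with random noise, NOT the real sea level measurements), the candidate grids are arbitrary, and the sum of squared residuals is used as one common way to score how well a model fits.

```python
import numpy as np

def model(x, a0, a1):
    """First order model: y = a0 + a1*x."""
    return a0 + a1 * x

def rss(x, y, a0, a1):
    """Sum of squared differences between data and model."""
    residuals = y - model(x, a0, a1)
    return float(np.sum(residuals**2))

# Synthetic stand-in data, NOT the real sea level data
rng = np.random.default_rng(seed=0)
x = np.arange(0.0, 50.0)
y = 5.0 + 0.08 * x + rng.normal(scale=0.5, size=x.size)

# Hypothetical candidate values to try for each parameter
candidates_a0 = np.linspace(0.0, 10.0, 41)
candidates_a1 = np.linspace(0.0, 0.2, 41)

# Score every (a0, a1) pair and keep the one with the smallest RSS
best_rss, best_a0, best_a1 = min(
    (rss(x, y, a0, a1), a0, a1)
    for a0 in candidates_a0
    for a1 in candidates_a1
)
print(f"best a0={best_a0:.2f}, a1={best_a1:.3f}, RSS={best_rss:.2f}")
```

A grid search like this is easy to read but slow and only as precise as the grid spacing, which is one reason to look for a more systematic way to find the optimal parameters.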
10. Higher Order
Should we keep adding terms? If we add a second order term, the results are exciting. But there's a problem.
11. Over-fitting
You may get a better fit, but the trade-off is that as you add more terms, you risk what is called "over-fitting".
The symptom of over-fitting is that your model may NOT fit NEW data that was not used to "train" your model!
In this course, we'll stick with First Order, linear models.
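The over-fitting symptom described above can be explored with a small experiment. This is a sketch under stated assumptions: the data is synthetic (a straight line plus random noise), the polynomial degrees 1 and 5 are arbitrary choices, and `np.polyfit` is used here simply as a convenient least-squares fitter, not as the method this course prescribes.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def make_data(n):
    """Synthetic data: a straight line plus random noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n)
    return x, y

x_train, y_train = make_data(12)   # data used to "train" the model
x_new, y_new = make_data(200)      # NEW data the model has not seen

def errors_for_degree(degree):
    """Fit a polynomial of the given order; return (train, new) mean squared error."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = float(np.mean((y_train - np.polyval(coeffs, x_train))**2))
    new_mse = float(np.mean((y_new - np.polyval(coeffs, x_new))**2))
    return train_mse, new_mse

train_1, new_1 = errors_for_degree(1)
train_5, new_5 = errors_for_degree(5)
print(f"order 1: train={train_1:.4f}, new={new_1:.4f}")
print(f"order 5: train={train_5:.4f}, new={new_5:.4f}")
```

The higher order fit can never do worse on the training data, because it contains the first order model as a special case. Whether it does worse on the new data depends on the noise and the sample, so compare the two "new" columns rather than assuming the answer.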
12. Let's practice!
Now, let's apply these ideas.