Least-Squares Optimization

1. Least-Squares Optimization

You've seen how to visually estimate the slope and intercept values that yield the "best" fit by minimizing RSS. Minimization is a type of "optimization" problem. For linear models, there is an "analytic" formula that provides the exact solution to this optimization problem. By "analytic" we mean a formula that yields an exact numerical value by direct substitution, requiring no approximation. For more complex models, no such "analytic" solution exists, and the problem must be solved by numerical approximation or numerical integration. In this lesson, we'll solve this optimization problem in code using several tools from the Python data science ecosystem. These tools work for linear models but can also be adapted to the more complex models you'll meet beyond this course.

2. Minima of RSS

Let's start again with the plot from a previous exercise. The goal is to find, out of many possible parameter values, the ones that minimize RSS and so give us the best model. It turns out that a nice algebraic expression can be found with a little bit of calculus. We won't do the derivation here, but we will implement the resulting expressions in numpy and use them to build models. The result is that you can compute the optimized slope as the ratio of the covariance of x and y to the variance of x, and the optimized intercept as the mean of y minus the slope times the mean of x.
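Written out as a sketch of that standard least-squares result (using a1 for the slope and a0 for the intercept, matching the names on the next slide, with x̄ and ȳ the sample means):

```latex
a_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
a_0 = \bar{y} - a_1\,\bar{x}
```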

3. Optimized by Numpy

To implement these formulae in numpy, let's walk through them step by step. First, compute the means of x and y. Second, compute the deviations by subtracting the corresponding mean from every value in each array. Third, compute the slope a1 as the covariance (the sum of the products of the x and y deviations) divided by the variance of x (the sum of the squared x deviations). Finally, use the slope a1 and the means of x and y to compute the intercept a0.
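As a concrete sketch of those four steps (the data arrays here are made up purely for illustration):

```python
import numpy as np

# Made-up sample data for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.2])

# Step 1: means of x and y.
x_mean = np.mean(x)
y_mean = np.mean(y)

# Step 2: deviations from the means.
x_dev = x - x_mean
y_dev = y - y_mean

# Step 3: slope a1 = covariance of x and y / variance of x.
a1 = np.sum(x_dev * y_dev) / np.sum(x_dev * x_dev)

# Step 4: intercept a0 from the slope and the means.
a0 = y_mean - a1 * x_mean

print(a0, a1)  # intercept, slope
```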

4. Optimized by Scipy

Linear models are a special case. Algebraic formulae like those shown do NOT exist for more complex models, so we need a few other modeling tools. The `scipy.optimize` module can solve more general optimization problems, not just least-squares. To use it, we load our data and define the model form as a function. We then call `curve_fit()`, passing in the model function and the data. The returned `param_opt` output contains the model parameter values that minimize RSS and can be indexed like a list.
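A minimal sketch of that workflow (the data and the linear model function are illustrative; only `curve_fit()` and the `param_opt` name come from the slide):

```python
import numpy as np
from scipy.optimize import curve_fit

# Made-up sample data for illustration.
x_data = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y_data = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.2])

# Define the model form as a function: the first argument is the
# independent variable, the remaining arguments are the parameters.
def model_func(x, a0, a1):
    return a0 + a1 * x

# curve_fit() returns the RSS-minimizing parameter values (and a
# covariance matrix, which we do not use here).
param_opt, param_cov = curve_fit(model_func, x_data, y_data)
a0 = param_opt[0]  # intercept
a1 = param_opt[1]  # slope
print(a0, a1)
```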

5. Optimized by Statsmodels

The statsmodels module also includes an "ordinary least-squares" method, called `ols()`, that solves the same optimization problem. With `statsmodels`, it is easiest to work with a pandas DataFrame, so here we repack our data into a DataFrame before passing that container to `ols()`. A distinctive feature of `ols()` is that it takes a formula string of the form 'y ~ x', built from the column names of the DataFrame, stating that y is modeled as a function of x. On the same line, we call the `.fit()` method of the object created by `ols()`. We can use this fitted model to make predictions, or simply extract the optimal parameter values as shown.
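A minimal sketch, with made-up data and placeholder column names (`x_column`, `y_column` are illustrative):

```python
import pandas as pd
from statsmodels.formula.api import ols

# Made-up sample data, repacked into a DataFrame.
df = pd.DataFrame({
    'x_column': [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    'y_column': [1.1, 2.9, 5.2, 7.1, 8.8, 11.2],
})

# Formula string built from the column names; fit on the same line.
model_fit = ols(formula="y_column ~ x_column", data=df).fit()

# Extract the optimal parameter values from the fitted model.
a0 = model_fit.params['Intercept']  # intercept
a1 = model_fit.params['x_column']   # slope
print(a0, a1)
```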

6. Let's practice!

Now it's your turn to practice using numpy, scipy, and statsmodels to solve optimization problems like least-squares.
