Least-Squares with `numpy`
The formulae below are the result of working through the calculus discussed in the introduction. In this exercise, we'll trust that the calculus is correct, and implement these formulae in code using `numpy`.
$$ a_{1} = \frac{ \mathrm{covariance}(x, y) }{ \mathrm{variance}(x) } $$

$$ a_{0} = \mathrm{mean}(y) - a_{1}\,\mathrm{mean}(x) $$
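Equivalently, writing the covariance and variance as sums of deviations from the means (the common factor of $1/n$ cancels), the slope takes the form implemented with `np.sum()` below:

$$ a_{1} = \frac{ \sum_{i} (x_i - \bar{x})(y_i - \bar{y}) }{ \sum_{i} (x_i - \bar{x})^2 } $$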
Exercise instructions
- Compute the means and deviations of the two variables `x`, `y` from the preloaded data.
- Use `np.sum()` to complete the least-squares formulae, and use them to compute the optimal values for `a0` and `a1`.
- Use `model()` to build the model values `y_model` from those optimal slope `a1` and intercept `a0` values.
- Use the pre-defined `compute_rss_and_plot_fit()` to visually confirm that this optimal model fits the data.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
```python
# Prepare the means and deviations of the two variables
x_mean = np.____(x)
y_mean = np.____(y)
x_dev = x - ____
y_dev = y - ____

# Complete least-squares formulae to find the optimal a0, a1
a1 = np.sum(____ * ____) / np.sum( np.square(____) )
a0 = ____ - (a1 * ____)

# Use those optimal model parameters a0, a1 to build a model
y_model = model(x, ____, ____)

# Plot to verify that the resulting y_model best fits the data y
fig, rss = compute_rss_and_plot_fit(a0, a1)
```
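For reference, one possible completion of the blanks is sketched below. In the exercise environment `x`, `y`, `model()`, and `compute_rss_and_plot_fit()` are preloaded; the stand-in definitions and the synthetic data here are only assumptions so the sketch runs on its own, not the course's actual helpers.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the preloaded linear model helper (assumed signature)
def model(x, a0, a1):
    """Linear model: y = a0 + a1 * x."""
    return a0 + a1 * x

# Stand-in for the preloaded plotting/RSS helper (assumed signature)
def compute_rss_and_plot_fit(a0, a1):
    """Plot the data and the fitted line; return the figure and the RSS."""
    y_fit = model(x, a0, a1)
    rss = np.sum(np.square(y - y_fit))
    fig, ax = plt.subplots()
    ax.plot(x, y, 'o', label='data')
    ax.plot(x, y_fit, '-', label='model')
    ax.legend()
    return fig, rss

# Illustrative data (the exercise preloads its own x, y arrays)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 21)
y = 3.0 + 0.5 * x + rng.normal(0, 0.5, x.size)

# Prepare the means and deviations of the two variables
x_mean = np.mean(x)
y_mean = np.mean(y)
x_dev = x - x_mean
y_dev = y - y_mean

# Complete least-squares formulae to find the optimal a0, a1
a1 = np.sum(x_dev * y_dev) / np.sum(np.square(x_dev))
a0 = y_mean - (a1 * x_mean)

# Use those optimal model parameters a0, a1 to build a model
y_model = model(x, a0, a1)

# Plot to verify that the resulting y_model best fits the data y
fig, rss = compute_rss_and_plot_fit(a0, a1)
```

If the data really is close to linear, the RSS returned here should be small and the plotted line should pass through the middle of the point cloud.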