Linear regression algorithm
To truly understand linear regression, it is helpful to know how the algorithm works. The code for ols() is hundreds of lines because it has to work with any formula and any dataset. However, in the case of simple linear regression for a single dataset, you can implement a linear regression algorithm in just a few lines of code.
The workflow is:
- First, write a function to calculate the sum of squares using this general syntax:
  def function_name(args):
      # some calculations with the args
      return outcome
- Second, use scipy's minimize() function to find the coefficients that minimize this function.
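To make the second step concrete, here is a minimal sketch of how scipy.optimize.minimize() is typically called. The function bowl() is a hypothetical stand-in objective, not part of the exercise; it takes a list of two parameters and returns a single number, which is the same shape a sum-of-squares function would have.

```python
from scipy.optimize import minimize

def bowl(params):
    # Hypothetical objective (an assumption for illustration):
    # a simple bowl whose lowest point is at a = 3, b = -1
    a, b = params
    return (a - 3) ** 2 + (b + 1) ** 2

# x0 is the starting guess; minimize() searches from there
# for the parameter values that give the smallest return value
result = minimize(fun=bowl, x0=[0, 0])

# result.x holds the parameter values found, close to [3, -1]
print(result.x)
```

The returned object also carries fields such as result.fun (the objective value at the minimum) and result.success, which are worth checking before trusting the coefficients.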
The explanatory values (the n_convenience column of taiwan_real_estate) are available as x_actual.
The response values (the price_twd_msq column of taiwan_real_estate) are available as y_actual.
minimize() is also loaded.
This exercise is part of the course Intermediate Regression with statsmodels in Python.
Have a go at this exercise by completing this sample code.
# Complete the function
def calc_sum_of_squares(coeffs):
    # Unpack coeffs
    ____, ____ = ____
    # Calculate predicted y-values
    y_pred = ____ + ____ * ____
    # Calculate differences between y_pred and y_actual
    y_diff = ____ - ____
    # Calculate sum of squares
    sum_sq = ____
    # Return sum of squares
    return sum_sq

# Test the function with intercept 10 and slope 1
print(calc_sum_of_squares([10, 1]))
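For reference, one possible completion of the template is sketched below. The taiwan_real_estate data is not available here, so x_actual and y_actual are filled with synthetic values (an assumption made purely for illustration), and the variable names intercept_fit, slope_fit, intercept_ref, and slope_ref are hypothetical. The result is cross-checked against NumPy's least-squares line fit, which should agree closely if the function is correct.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic stand-ins for x_actual and y_actual (NOT the real Taiwan data)
rng = np.random.default_rng(0)
x_actual = rng.uniform(0, 10, size=50)
y_actual = 8.0 + 0.8 * x_actual + rng.normal(0, 1, size=50)

def calc_sum_of_squares(coeffs):
    # Unpack coeffs into intercept and slope
    intercept, slope = coeffs
    # Predicted y-values on the straight line
    y_pred = intercept + slope * x_actual
    # Differences between predicted and actual values
    y_diff = y_pred - y_actual
    # Sum of the squared differences
    return np.sum(y_diff ** 2)

# Search for the coefficients that minimize the sum of squares
result = minimize(fun=calc_sum_of_squares, x0=[0, 0])
intercept_fit, slope_fit = result.x

# Cross-check: np.polyfit with degree 1 returns [slope, intercept]
slope_ref, intercept_ref = np.polyfit(x_actual, y_actual, 1)

print("minimize():", intercept_fit, slope_fit)
print("polyfit(): ", intercept_ref, slope_ref)
```

The two printed lines should match to several decimal places. With the real course data loaded, the same function body would apply unchanged, since it only relies on x_actual and y_actual being array-like.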