Exercise

Least-Squares with `statsmodels`

Several Python libraries provide convenient, higher-level interfaces, so you need not always handle the machinery of model optimization explicitly.

In this exercise, you will use the statsmodels library in a higher-level, more general workflow to build a model using least-squares optimization, that is, minimization of the residual sum of squares (RSS).
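To make "minimization of RSS" concrete, here is a minimal sketch of the quantity being minimized for a straight-line model y = a0 + a1*x. The function name rss and the four data points are illustrative assumptions, not part of the exercise.

```python
import numpy as np


def rss(x, y, a0, a1):
    """Residual sum of squares for the line y = a0 + a1 * x."""
    residuals = y - (a0 + a1 * x)
    return float(np.sum(residuals ** 2))


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

print(rss(x, y, a0=1.0, a1=2.0))  # 0.0: the line passes through every point
print(rss(x, y, a0=0.0, a1=2.0))  # 4.0: each of the four residuals is 1
```

Least-squares fitting searches for the a0 and a1 that make this number as small as possible.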

To help you get started, we've pre-loaded the data with x_data, y_data = load_data() and stored it in a pandas DataFrame with column names x_column and y_column, using df = pd.DataFrame(dict(x_column=x_data, y_column=y_data)).

Instructions

  • Construct a model with ols(), passing formula="y_column ~ x_column" and data=df, then call .fit() and store the result as model_fit.
  • Use model_fit.predict() to get y_model values.
  • Using the provided function plot_data_with_model(), over-plot the y_data with y_model.
  • Extract the model parameter values a0 (intercept) and a1 (slope) from model_fit.params.
  • Use compute_rss_and_plot_fit() to confirm these results are consistent with the analytic formulae implemented with numpy.
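
The steps above can be sketched roughly as follows. This is one possible approach, not the official solution. Several things here are assumptions: the data is synthetic and stands in for load_data(); the course helpers plot_data_with_model() and compute_rss_and_plot_fit() are not reproduced, so the plot is skipped and the consistency check is done directly with numpy; and a0 is taken to be the intercept and a1 the slope.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Assumption: synthetic data standing in for the course's load_data().
rng = np.random.default_rng(42)
x_data = np.linspace(0.0, 10.0, 51)
y_data = 5.0 + 2.0 * x_data + rng.normal(0.0, 1.0, size=x_data.size)
df = pd.DataFrame(dict(x_column=x_data, y_column=y_data))

# Build the model from a formula string and fit it by ordinary least squares.
model_fit = ols(formula="y_column ~ x_column", data=df).fit()

# With no arguments, predict() evaluates the model on the data used for fitting.
y_model = np.asarray(model_fit.predict())

# params is a pandas Series indexed by term name.
a0 = model_fit.params["Intercept"]
a1 = model_fit.params["x_column"]

# Analytic least-squares formulae for a straight line, written with numpy.
x_mean, y_mean = x_data.mean(), y_data.mean()
a1_np = np.sum((x_data - x_mean) * (y_data - y_mean)) / np.sum((x_data - x_mean) ** 2)
a0_np = y_mean - a1_np * x_mean

# Residual sum of squares of the fitted model.
rss = np.sum((y_data - y_model) ** 2)

print("statsmodels:", a0, a1)
print("numpy:      ", a0_np, a1_np)
print("RSS:", rss, "vs model_fit.ssr:", model_fit.ssr)
```

If the two approaches agree, the statsmodels and numpy parameter values should match to within floating-point precision, and the hand-computed RSS should match the ssr attribute of the fit result.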