Least-Squares with `statsmodels`
Several python libraries provide convenient abstracted interfaces so that you need not always be so explicit in handling the machinery of optimization of the model.
As an example, in this exercise, you will use the statsmodels library in a more high-level, generalized work-flow for building a model using least-squares optimization (minimization of RSS).
To help get you started, we've pre-loaded the data from x_data, y_data = load_data() and stored it in a pandas DataFrame with column names x_column and y_column using df = pd.DataFrame(dict(x_column=x_data, y_column=y_data))
Cet exercice fait partie du cours
Introduction to Linear Modeling in Python
Instructions
- Construct a model
ols()with formulaformula="y_column ~ x_column"and datadata=df, and then.fit()it to the data. - Use
model_fit.predict()to gety_modelvalues. - Using the provided function
plot_data_with_model(), over-plot they_datawithy_model. - Extract the model parameter values
a0anda1frommodel_fit.params. - Use
compute_rss_and_plot_fit()to confirm these results are consistent with the analytic formulae implemented withnumpy.
Exercice interactif pratique
Essayez cet exercice en complétant cet exemple de code.
# Pass data and `formula` into ols(), use and `.fit()` the model to the data
model_fit = ols(____="y_column ~ x_column", ____=df).____()
# Use .predict(df) to get y_model values, then over-plot y_data with y_model
y_model = model_fit.____(df)
fig = plot_data_with_model(x_data, ____, ____)
# Extract the a0, a1 values from model_fit.params
a0 = model_fit.____['Intercept']
a1 = model_fit.____['x_column']
# Visually verify that these parameters a0, a1 give the minimum RSS
fig, rss = compute_rss_and_plot_fit(a0, a1)