
Exercise

Fitting a parallel slopes linear regression

In Introduction to Regression with statsmodels in Python, you learned to fit linear regression models with a single explanatory variable. In many cases, using only one explanatory variable limits the accuracy of predictions. To truly master linear regression, you need to be able to fit regression models with multiple explanatory variables.

A model with one numeric explanatory variable and one categorical explanatory variable is sometimes called a "parallel slopes" linear regression because of the shape of its predictions; more on that in the next exercise.
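As a rough sketch of where the name comes from, the model can be written as below. The symbols are illustrative notation rather than the course's own: x is the numeric variable, g is the level of the categorical variable, and the betas are the fitted coefficients.

```latex
\hat{y} = \beta_{g} + \beta_{x}\, x
```

Each category g gets its own intercept, but every category shares the single slope \beta_{x}, so the prediction lines for the categories are parallel.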

Here, you'll revisit the Taiwan real estate dataset. Recall the meaning of each variable.

Variable                Meaning
dist_to_mrt_station_m   Distance to the nearest MRT metro station, in meters.
n_convenience           Number of convenience stores within walking distance.
house_age_years         Age of the house, in years, binned into 3 groups.
price_twd_msq           House price per unit area, in New Taiwan dollars per square meter.

taiwan_real_estate is available.

Instructions 1/3

  • 1
    • Import ols() from statsmodels.formula.api.
    • Using the taiwan_real_estate dataset, model and fit the house price (in TWD per square meter) versus the number of nearby convenience stores.
    • Print the coefficients of the model.
  • 2
    • Model the house price (in TWD per square meter) versus the house age (in years). Don't include an intercept term.
    • Print the coefficients of the model.
  • 3
    • Model the house price versus the number of nearby convenience stores plus the house age. Don't include an intercept term.
    • Print the coefficients of the model.
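
The three steps above could be sketched roughly as follows. Because the real taiwan_real_estate data is not reproduced here, the sketch uses a small made-up DataFrame named toy_real_estate (a hypothetical stand-in) with the same column names. Its prices are built without noise as a group intercept (10, 7 and 8, chosen arbitrarily) plus 0.8 times n_convenience. The age-group labels and model variable names are also assumptions made for illustration.

```python
import pandas as pd
from statsmodels.formula.api import ols

# Hypothetical toy data standing in for taiwan_real_estate (all values are made up).
# Prices equal group_intercept + 0.8 * n_convenience with no noise,
# using intercepts 10, 7 and 8 for the three age groups.
toy_real_estate = pd.DataFrame({
    "n_convenience": [0, 2, 4, 1, 3, 5, 0, 3, 6],
    "house_age_years": pd.Categorical(
        ["0 to 15"] * 3 + ["15 to 30"] * 3 + ["30 to 45"] * 3
    ),
    "price_twd_msq": [10.0, 11.6, 13.2, 7.8, 9.4, 11.0, 8.0, 10.4, 12.8],
})

# Step 1: price versus number of convenience stores.
# The formula interface adds an intercept by default.
mdl_price_vs_conv = ols("price_twd_msq ~ n_convenience", data=toy_real_estate).fit()
print(mdl_price_vs_conv.params)

# Step 2: price versus house age group.
# "+ 0" removes the intercept, so each coefficient is that group's mean price.
mdl_price_vs_age = ols("price_twd_msq ~ house_age_years + 0", data=toy_real_estate).fit()
print(mdl_price_vs_age.params)

# Step 3: both explanatory variables, again without a global intercept.
# The result is one intercept per age group plus a single shared slope.
mdl_price_vs_both = ols(
    "price_twd_msq ~ n_convenience + house_age_years + 0", data=toy_real_estate
).fit()
print(mdl_price_vs_both.params)
```

Since the toy prices contain no noise, the third model should recover the slope and the three group intercepts used to build the data, up to floating-point error. On the real dataset the coefficients will differ.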