Exercise

Model fitting step-by-step

In the video lecture, you learned the key components for fitting a GLM in Python using the statsmodels package. In this exercise, you will define the components of the GLM step by step and then fit the model by calling the .fit() method.

The dataset you will use covers arsenic contamination of groundwater in Bangladesh. The goal is to model a household's decision to switch from its current well.
The columns in the dataset are:

  • switch: 1 if the household switched from its current well; 0 otherwise
  • arsenic: The level of arsenic contamination in the well
  • distance: Distance to the closest known safe well
  • education: Years of education of the head of the household

The dataset wells has been preloaded in the workspace.

Instructions 1/3

  • Create a regression formula where switch is predicted by distance100.