Analyze the summary of a linear model
Analyzing the performance of models fitted on differently imputed data is one of the most important tasks when dealing with missing data. It determines which imputed DataFrame you can rely on. For the analysis, you can fit a linear regression model on each imputed DataFrame and check the summary statistics that inform the choice of imputation technique.
In this exercise, the DataFrame diabetes_cc has already been loaded for you. It is the complete case of the diabetes DataFrame, that is, only the rows with no missing values. The complete case acts as a baseline for comparison against the imputed DataFrames. You will use the package statsmodels.api, loaded as sm, to create a linear regression model and generate summaries.
This exercise is part of the course
Dealing with Missing Data in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Add constant to X and set X & y values to fit linear model
X = sm.add_constant(___)
y = ___
lm = sm.OLS(y, X).fit()