Studying residuals
To implement a linear model you must study the residuals, which are the distances between the predicted outcomes and the data.
Three conditions must be met:
- The mean should be 0.
- The variance must be constant.
- The distribution must be normal.
We will work with data of test scores for two schools, A and B, on the same subject. model_A
and model_B
were fitted with hours_of_study_A
and test_scores_A
and hours_of_study_B
and test_scores_B
, respectively.
matplotlib.pyplot
has been imported as plt
, numpy
as np
and LinearRegression
from sklearn.linear_model
.
This exercise is part of the course
Foundations of Probability in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Scatterplot of hours of study and test scores
plt.scatter(____, ____)
# Plot of hours_of_study_values_A and predicted values
plt.plot(____, model_A.____(hours_of_study_values_A))
plt.title("Model A", fontsize=25)
plt.show()