Influence
Influence measures how much a model would change if each observation were left out of the model calculations, one at a time. That is, it measures how different the prediction line would look if you ran a linear regression on all data points except that one, compared to running a linear regression on the whole dataset.
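The leave-one-out idea can be sketched directly in R. This is only an illustration on made-up data (the data frame `df`, its columns `x` and `y`, and the planted unusual point are assumptions, not part of the Taiwan real estate dataset): it refits the model once per observation and records how far the slope moves.

```r
# Synthetic data (an assumption for illustration, not the course dataset):
# a roughly linear trend plus one planted unusual point
set.seed(1)
df <- data.frame(x = c(1:10, 25))
df$y <- 2 + 0.5 * df$x + rnorm(11, sd = 0.3)
df$y[11] <- 5  # far below the trend, at an extreme x value

# Model fitted on the whole dataset
full_fit <- lm(y ~ x, data = df)

# Refit once per observation, leaving that observation out,
# and record the change in slope relative to the full-data fit
slope_change <- sapply(seq_len(nrow(df)), function(i) {
  loo_fit <- lm(y ~ x, data = df[-i, ])
  unname(coef(loo_fit)["x"] - coef(full_fit)["x"])
})

# Index of the observation whose removal moves the slope the most
most_influential <- which.max(abs(slope_change))
```

Looking only at the slope is a simplification; a full influence measure accounts for the change in all fitted values.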
The standard metric for influence is Cook's distance, which combines the size of a point's residual with its leverage: a point is highly influential when it sits far from the prediction line and also has an extreme explanatory value.
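As a rough check of how residual size and leverage combine, the sketch below computes Cook's distance by hand with the usual textbook formula, D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2, and compares it with R's built-in `cooks.distance()`. The data are synthetic and the variable names are assumptions chosen for illustration.

```r
# Synthetic data (an assumption for illustration): 20 ordinary points
# plus one point with an extreme x value and a shifted response
set.seed(42)
x <- c(runif(20, 0, 10), 30)
y <- 1 + 2 * x + rnorm(21)
y[21] <- y[21] - 15

fit <- lm(y ~ x)

e  <- residuals(fit)               # residuals e_i
h  <- hatvalues(fit)               # leverage h_i
p  <- length(coef(fit))            # number of model coefficients
s2 <- sum(e^2) / df.residual(fit)  # estimated residual variance

# Cook's distance from residual size and leverage
cooks_manual <- (e^2 / (p * s2)) * h / (1 - h)^2

# Compare with the built-in calculation
matches_builtin <- isTRUE(all.equal(cooks_manual, cooks.distance(fit)))
```

In practice you would call `cooks.distance()` on the fitted model directly; the manual version is here only to show which quantities drive the result.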
Here you can see the same model as last time: house price versus the square root of distance from the nearest MRT station in the Taiwan real estate dataset.
Guess which observations will have high influence, then move the slider to find out.
Which statement is true?
This exercise is part of the course
Introduction to Regression in R
