When you train and test a model, you use data with values for the explanatory variables as well as the response variable. Training effectively creates a function that takes as input values for the explanatory variables and produces as output a predicted value for the response variable.

If the model is good, then when it is given the inputs from the testing data, the function's outputs will be "close" to the corresponding response values in the testing data. How to measure "close"? The first step is to subtract the function's output from the actual response value in the testing data. The result is called the *prediction error*, and there will be one such error for every case in the testing data. You then summarize that set of prediction errors.
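The error calculation can be sketched in a few lines of Python. The numbers here are hypothetical, standing in for actual response values from the testing data and the trained function's outputs:

```python
import numpy as np

# Hypothetical testing data: the actual response values, and the
# trained function's outputs for the corresponding inputs.
actual = np.array([3.1, 4.7, 2.0, 5.5, 3.9])
predicted = np.array([2.8, 5.0, 2.4, 5.1, 4.2])

# Prediction error: actual response minus function output,
# giving one error per case in the testing data.
errors = actual - predicted
print(errors)
```

Note the sign convention: a positive error means the function's output was too low for that case; a negative error means it was too high.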

What is a good way to summarize the set of prediction errors?
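To make the question concrete, here is a hedged sketch of a few summaries one might try, applied to a hypothetical set of prediction errors. The plain mean is a tempting but flawed candidate, since positive and negative errors cancel each other out:

```python
import numpy as np

# Hypothetical prediction errors, one per testing case.
errors = np.array([0.3, -0.3, -0.4, 0.4, -0.3])

# The plain mean: positive and negative errors cancel,
# so it can be near zero even when individual errors are large.
mean_error = errors.mean()

# Mean absolute error: average size of the errors, ignoring sign.
mae = np.abs(errors).mean()

# Root mean square error: square the errors (so sign cannot
# cancel), average, then take the square root to restore units.
rmse = np.sqrt((errors ** 2).mean())

print(mean_error, mae, rmse)
```

Which of these is "good" depends on the purpose; the point of the sketch is only that sign cancellation rules out the plain mean as a summary of error *magnitude*.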