
R-Squared

Previously, we expressed another measure of goodness-of-fit, R-squared, in terms of the ratio of RSS to VAR. Multiplying the top and bottom of that ratio by 1/n gives a numerically equivalent form: the variance of the residuals divided by the variance of the deviations of the data from its mean. R-squared is one minus this ratio, so it can be interpreted as the fraction of the variance in your data that is "explained" by your model, in contrast to the spread, or variance, of the residuals that remains after the linear trend has been removed.
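In symbols, this relationship is (a sketch using the RSS and VAR notation from the previous exercise, assumed here to mean the sum of squared residuals and the sum of squared deviations of the data from its mean, respectively):

\[
R^2 \;=\; 1 - \frac{\mathrm{RSS}}{\mathrm{VAR}}
\;=\; 1 - \frac{\tfrac{1}{n}\sum_{i}\left(\hat{y}_i - y_i\right)^2}{\tfrac{1}{n}\sum_{i}\left(\bar{y} - y_i\right)^2}
\;=\; 1 - \frac{\mathrm{Var}(\text{residuals})}{\mathrm{Var}(\text{deviations})}
\]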

Here we have pre-loaded the data x_data and y_data, and the model predictions y_model for the best-fit model; your goal is to compute the R-squared measure to quantify how much of the variation in the data this linear model accounts for.

This exercise is part of the course Introduction to Linear Modeling in Python.

Exercise instructions

  • Compute the residuals by subtracting y_data from y_model, and the deviations by subtracting y_data from np.mean() of y_data.
  • Compute the variance of the residuals and the variance of the deviations by applying np.mean() and np.square() to each (a quick check of this pattern appears after this list).
  • Compute r_squared as 1 minus the ratio var_residuals / var_deviations, and print the result.
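As a quick sanity check (a minimal sketch using an arbitrary array, not the exercise data), the np.mean()-of-np.square() pattern applied to deviations from the mean reproduces NumPy's built-in population variance, np.var():

import numpy as np

y = np.array([1.0, 2.0, 4.0, 7.0])      # arbitrary example values, not y_data
deviations = np.mean(y) - y             # deviations of each value from the mean
print(np.mean(np.square(deviations)))   # 5.25
print(np.var(y))                        # 5.25, the same population variance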

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Compute the residuals and the deviations
residuals = ____ - y_data
deviations = np.____(____) - y_data

# Compute the variance of the residuals and deviations
var_residuals = np.____(np.____(____))
var_deviations = np.____(np.____(____))

# Compute r_squared as 1 minus the ratio of the variances
r_squared = 1 - (____ / ____)
print('R-squared is {:0.2f}'.format(____))
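For reference, here is a minimal, self-contained sketch of a completed solution. Since x_data, y_data, and y_model are pre-loaded on the exercise platform, the synthetic data and the np.polyfit fit below are stand-ins added only so the sketch runs on its own; they are assumptions, not part of the course code.

import numpy as np

# Stand-ins for the pre-loaded exercise variables (assumed, not from the course)
np.random.seed(42)
x_data = np.linspace(0, 10, 21)
y_data = 1.5 * x_data + 0.5 + np.random.normal(0, 1.0, size=x_data.size)

# Best-fit linear model predictions, here obtained with np.polyfit
a1, a0 = np.polyfit(x_data, y_data, deg=1)
y_model = a1 * x_data + a0

# Compute the residuals and the deviations
residuals = y_model - y_data
deviations = np.mean(y_data) - y_data

# Compute the variance of the residuals and the variance of the deviations
var_residuals = np.mean(np.square(residuals))
var_deviations = np.mean(np.square(deviations))

# Compute r_squared as 1 minus the ratio of the variances
r_squared = 1 - (var_residuals / var_deviations)
print('R-squared is {:0.2f}'.format(r_squared))

With data this close to a straight line, the printed R-squared will be near 1; a value near 0 would indicate that the linear trend explains little of the variation in y_data.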