
How is it optimal?

The function np.polyfit() that you used to get your regression parameters finds the optimal slope and intercept: the values that minimize the sum of the squares of the residuals, also known as the RSS (residual sum of squares). In this exercise, you will plot the function being minimized, the RSS, versus the slope parameter a. To do this, fix the intercept at the value you found in the optimization, then plot the RSS versus the slope. Where is it minimal?
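To make "optimal" concrete, here is a minimal sketch that checks numerically that nudging the fitted slope in either direction, with the intercept held fixed, increases the RSS. The data are synthetic and purely an assumption for illustration (the course's illiteracy and fertility data are not reproduced here), and rss_at is a hypothetical helper name.

```python
import numpy as np

# Synthetic stand-in data (an assumption; not the course data set)
rng = np.random.default_rng(0)
x_data = rng.uniform(0, 100, size=50)
y_data = 0.05 * x_data + 2.0 + rng.normal(0, 0.5, size=50)

# Least-squares fit of a degree-1 polynomial: returns slope, then intercept
a_opt, b_opt = np.polyfit(x_data, y_data, 1)


def rss_at(a, b):
    """Residual sum of squares for the line y = a * x + b (hypothetical helper)."""
    return np.sum((y_data - a * x_data - b) ** 2)


# RSS at the fitted slope and at slightly steeper and shallower slopes
rss_opt = rss_at(a_opt, b_opt)
rss_steeper = rss_at(a_opt + 0.01, b_opt)
rss_shallower = rss_at(a_opt - 0.01, b_opt)

print(rss_opt, rss_steeper, rss_shallower)
```

With the intercept fixed, the RSS is a quadratic function of the slope, so both perturbed values should come out larger than the value at the fitted slope.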

This exercise is part of the course

Statistical Thinking in Python (Part 2)


Exercise instructions

  • Specify the values of the slope for which to compute the RSS. Use np.linspace() to get 200 points in the range between 0 and 0.1.
  • Initialize an array, rss, to contain the RSS using np.empty_like().
  • Write a for loop to compute the RSS for each value of the slope. Hint: the RSS is given by np.sum((y_data - a * x_data - b)**2). The variable b you computed in the last exercise is already in your namespace.
  • Plot the RSS versus slope. Be sure to label your axes.
  • Show your plot.
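The first three steps above can be sketched as follows. This is one possible approach rather than the course's reference solution: x_data and y_data are synthetic stand-ins (an assumption), b comes from a fresh np.polyfit() call instead of the previous exercise, and the plotting step is left out so the sketch depends only on NumPy.

```python
import numpy as np

# Synthetic stand-in data (an assumption; not the course data set)
rng = np.random.default_rng(42)
x_data = rng.uniform(0, 100, size=100)
y_data = 0.05 * x_data + 1.9 + rng.normal(0, 0.8, size=100)

# Slope and intercept from the least-squares fit; b is held fixed below
a_fit, b = np.polyfit(x_data, y_data, 1)

# Slopes to consider: 200 evenly spaced points between 0 and 0.1
a_vals = np.linspace(0, 0.1, 200)

# Uninitialized array with the same shape and dtype as a_vals
rss = np.empty_like(a_vals)

# RSS for each candidate slope, with the intercept fixed at b
for i, a in enumerate(a_vals):
    rss[i] = np.sum((y_data - a * x_data - b) ** 2)

# Grid slope with the smallest RSS, for comparison with the fitted slope
a_best = a_vals[np.argmin(rss)]
print(a_fit, a_best)
```

The grid slope with the smallest RSS should sit within one grid step of the slope returned by np.polyfit(), which is where the plotted curve reaches its minimum.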

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Specify slopes to consider: a_vals
a_vals = ____

# Initialize sum of square of residuals: rss
rss = ____

# Compute sum of square of residuals for each value of a_vals
for i, a in enumerate(a_vals):
    rss[i] = ____

# Plot the RSS
plt.plot(____, ____, '-')
plt.xlabel('slope (children per woman / percent illiterate)')
plt.ylabel('sum of square of residuals')

plt.show()