Linear regression on appropriate Anscombe data

For practice, perform a linear regression on the data set from Anscombe's quartet that is most reasonably interpreted with linear regression.

This exercise is part of the course

Statistical Thinking in Python (Part 2)

Exercise instructions

  • Compute the slope and intercept using np.polyfit() with a polynomial degree of 1. The Anscombe data are stored in the arrays x and y.
  • Print the slope and intercept.
  • Generate theoretical \(x\) and \(y\) data from the linear regression. Use 3 and 15 as your two \(x\) values, and compute the matching \(y\) values from the fitted slope and intercept.
  • Plot the Anscombe data as a scatter plot and the theoretical line.
  • Label the axes (just \(x\) and \(y\) will do).
  • Show your plot.
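
The fitting steps above can be sketched as follows. This is a minimal illustration, not the exercise solution as the course defines it. It assumes the intended data set is the first of Anscombe's quartet and types those values in by hand so the snippet is self-contained (in the exercise, x and y are already loaded). The names slope, intercept, x_line and y_line are chosen for illustration only.

```python
import numpy as np

# Assumption: the first data set of Anscombe's quartet, entered by hand.
# In the exercise environment, x and y are already provided.
x = np.array([10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0])
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])

# A degree-1 fit returns coefficients from the highest power down:
# the slope first, then the intercept.
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)

# Two x values are enough to define a straight line.
x_line = np.array([3, 15])
y_line = slope * x_line + intercept
```

With this data the slope should come out close to 0.5 and the intercept close to 3. Passing x_line and y_line to a line plot alongside a scatter plot of x and y gives the picture the remaining instructions ask for.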

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Perform linear regression: a, b
a, b = ____

# Print the slope and intercept
print(____, ____)

# Generate theoretical x and y data: x_theor, y_theor
x_theor = np.array([____, ____])
y_theor = ____

# Plot the Anscombe data and theoretical line
_ = ____
_ = ____

# Label the axes
plt.xlabel('x')
plt.ylabel('y')

# Show the plot
plt.show()