Plotting bootstrap regressions
A nice way to visualize the variability we might expect in a linear regression is to plot the line you would get from each bootstrap replicate of the slope and intercept. Do this for the first 100 of your bootstrap replicates of the slope and intercept (stored as bs_slope_reps and bs_intercept_reps).
This exercise is part of the course Statistical Thinking in Python (Part 2).
Exercise instructions
- Generate an array of \(x\)-values consisting of 0 and 100 for the plot of the regression lines. Use the np.array() function for this.
- Write a for loop in which you plot a regression line with a slope and intercept given by the pairs bootstrap replicates. Do this for 100 lines.
  - When plotting the regression lines in each iteration of the for loop, recall the regression equation y = a*x + b. Here, a is bs_slope_reps[i] and b is bs_intercept_reps[i].
  - Specify the keyword arguments linewidth=0.5, alpha=0.2, and color='red' in your call to plt.plot().
- Make a scatter plot with illiteracy on the x-axis and fertility on the y-axis. Remember to specify the marker='.' and linestyle='none' keyword arguments.
- Label the axes, set a 2% margin, and show the plot. This has been done for you, so hit submit to visualize the bootstrap regressions!
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Generate array of x-values for bootstrap lines: x
x = ____
# Plot the bootstrap lines
for i in range(____):
    _ = plt.plot(____,
                 ____*x + ____,
                 ____=0.5, ____=0.2, ____='red')
# Plot the data
_ = ____
# Label axes, set the margins, and show the plot
_ = plt.xlabel('illiteracy')
_ = plt.ylabel('fertility')
plt.margins(0.02)
plt.show()
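
For reference, here is a minimal sketch of how the completed exercise might look when run outside the course environment. The course supplies the illiteracy and fertility arrays and the bootstrap replicates; none of them appear on this page, so the data below is synthetic and the helper draw_bs_pairs_linreg() is a hypothetical stand-in written for this illustration, not necessarily the course's own function.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to get a plot window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)

# ASSUMPTION: synthetic stand-ins for the course's illiteracy/fertility data.
illiteracy = rng.uniform(0, 100, size=150)
fertility = 0.05 * illiteracy + 1.9 + rng.normal(0, 0.5, size=150)


def draw_bs_pairs_linreg(x, y, size=1):
    """Hypothetical helper: pairs bootstrap for a linear regression.

    Resamples (x, y) pairs with replacement and refits a line each time.
    """
    inds = np.arange(len(x))
    slopes = np.empty(size)
    intercepts = np.empty(size)
    for i in range(size):
        bs_inds = rng.choice(inds, size=len(inds))  # sample indices with replacement
        slopes[i], intercepts[i] = np.polyfit(x[bs_inds], y[bs_inds], 1)
    return slopes, intercepts


bs_slope_reps, bs_intercept_reps = draw_bs_pairs_linreg(
    illiteracy, fertility, size=1000)

# Generate array of x-values for bootstrap lines: x
x = np.array([0, 100])

# Plot the first 100 bootstrap lines, y = a*x + b
for i in range(100):
    _ = plt.plot(x,
                 bs_slope_reps[i] * x + bs_intercept_reps[i],
                 linewidth=0.5, alpha=0.2, color='red')

# Plot the data
_ = plt.plot(illiteracy, fertility, marker='.', linestyle='none')

# Label axes, set the margins, and show the plot
_ = plt.xlabel('illiteracy')
_ = plt.ylabel('fertility')
plt.margins(0.02)
plt.show()
```

Two x-values are enough because each regression line is straight: plt.plot() simply joins the points at x = 0 and x = 100. The low alpha lets the 100 overlapping lines build up a darker band where the bootstrap fits agree.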