Bootstrapping vs. normality
You've seen the results of a bootstrap confidence interval for Pearson's r. But what about common situations like constructing a confidence interval for a mean? Why would you use a bootstrap confidence interval over a "normal" confidence interval coming from stats.norm?
A DataFrame showing investments from venture capital firms (investments_df) has been loaded for you, as have the packages pandas as pd, NumPy as np, and stats from SciPy.
This exercise is part of the course
Foundations of Inference in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Select just the companies in the Analytics market
analytics_df = ____[____ == 'Analytics']

# Confidence interval using the stats.norm function
norm_ci = stats.norm.____(alpha=____,
                          loc=____,
                          scale=____.std() / np.____(____))

# Construct a bootstrapped confidence interval
bootstrap_ci = stats.bootstrap(data=(____,),
                               statistic=np.____)
print('Normal CI:', norm_ci)
print('Bootstrap CI:', bootstrap_ci.confidence_interval)
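To see how the two approaches compare, here is a minimal sketch, not the exercise solution: since investments_df is not available here, it uses synthetic, right-skewed funding amounts (a made-up lognormal sample) in place of the real column, and computes both intervals for the mean.

```python
# Sketch: normal-theory vs. bootstrap 95% CI for a mean.
# The `funding` values are synthetic, for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
funding = rng.lognormal(mean=14, sigma=1.0, size=200)  # skewed, like funding data

# Normal-theory CI: mean +/- z * (standard error of the mean)
norm_ci = stats.norm.interval(0.95,
                              loc=funding.mean(),
                              scale=funding.std() / np.sqrt(len(funding)))

# Bootstrap CI: resample the data with replacement, recompute the mean each time
boot = stats.bootstrap((funding,), statistic=np.mean,
                       confidence_level=0.95)

print('Normal CI:   ', norm_ci)
print('Bootstrap CI:', boot.confidence_interval)
```

The normal interval leans on the central limit theorem and is symmetric around the sample mean; the bootstrap interval makes no normality assumption, so for skewed data like funding amounts the two can disagree noticeably.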