Significance of Difference of Estimates
A line plot with error bars gives you a rough idea of trends, but are the year-to-year differences statistically significant? In this exercise, you will determine whether changes in median home prices in Philadelphia are statistically significant, comparing each year with the prior year between 2011 and 2017.
The formula for the two-sample Z-statistic is:
$$Z = \frac{x_1 - x_2}{\sqrt{SE_{x_1}^2 + SE_{x_2}^2}}$$
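As a rough illustration of the formula, the sketch below computes the Z-statistic for two estimates published with 90% margins of error, converting each MOE to a standard error by dividing by the critical Z score of 1.645. The function name two_sample_z and the input figures are made up for illustration and are not actual Philadelphia data.

```python
from math import sqrt

Z_CRIT = 1.645  # critical Z score for 90% confidence


def two_sample_z(x1, moe1, x2, moe2, z_crit=Z_CRIT):
    """Z-statistic for the difference of two estimates with 90% MOEs.

    Hypothetical helper: each MOE is converted to a standard error
    by dividing it by the critical Z score.
    """
    se1 = moe1 / z_crit
    se2 = moe2 / z_crit
    return (x1 - x2) / sqrt(se1**2 + se2**2)


# Made-up estimates and MOEs, for illustration only
z = two_sample_z(150000, 3290, 145000, 3290)
print(round(z, 3))
```

With these invented inputs each standard error is about 2000, so Z comes out near 1.77, which exceeds 1.645 and would count as a significant difference at the 90% level.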
A DataFrame philly is available with columns median_home_value, median_home_value_moe, and year. pandas is imported as pd, and the sqrt function has been imported from the numpy module.
This exercise is part of the course
Analyzing US Census Data in Python
Exercise instructions
- Set x1 to the current year's median home value, and x2 to the median home value for the prior year (current year minus 1).
- Set se_x1 to the current year's MOE of the median home value divided by Z_CRIT, and se_x2 to the same calculation for the prior year.
- Use Python's ternary operator (result1 if condition else result2) to return the empty string if the absolute value of z is greater than Z_CRIT, and otherwise return "not ".
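The ternary operator in the last instruction can be tried in isolation. The following is a minimal sketch: significance_prefix is a hypothetical helper name, and the z values and years passed to it are invented rather than computed from the philly data.

```python
Z_CRIT = 1.645  # critical Z score for 90% confidence


def significance_prefix(z, z_crit=Z_CRIT):
    """Return "" when |z| exceeds the critical value, otherwise "not "."""
    # Ternary form: result1 if condition else result2
    return "" if abs(z) > z_crit else "not "


msg = "Philadelphia median home values in {} were {}significantly different from {}."

# Invented z values, chosen to exercise both branches
for z in (2.1, -0.4):
    print(msg.format(2017, significance_prefix(z), 2016))
```

Taking the absolute value matters because a large drop in prices gives a strongly negative z, which is just as significant as a large rise.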
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Set the critical Z score for 90% confidence, prepare message
Z_CRIT = 1.645
msg = "Philadelphia median home values in {} were {}significantly different from {}."
for year in range(2012, 2018):
    # Assign current and prior year's median home value to variables
    x1 = int(philly[philly["year"] == ____]["median_home_value"])
    x2 = int(____)

    # Calculate standard error as 90% MOE / critical Z score
    se_x1 = float(____)
    se_x2 = float(____)

    # Calculate two-sample z-statistic, output message if greater than critical Z score
    z = (x1 - x2) / sqrt(se_x1**2 + se_x2**2)
    print(msg.format(year, ____, year - 1))
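The sample code pulls a single number out of a filtered DataFrame by wrapping the one-row selection in int() or float(). As a hedged alternative, the sketch below selects the row with .loc and takes the first element explicitly with .iloc[0], which avoids the deprecation warning that recent pandas releases may emit when int() is called on a one-element Series. The philly_demo DataFrame and the value_for_year helper are hypothetical stand-ins with invented values, not the exercise's data.

```python
import pandas as pd

# Hypothetical stand-in for the philly DataFrame; values are invented
philly_demo = pd.DataFrame({
    "year": [2016, 2017],
    "median_home_value": [145000, 150000],
    "median_home_value_moe": [3290, 3290],
})


def value_for_year(df, year, column):
    """Return the single value of `column` in the row where year matches."""
    matches = df.loc[df["year"] == year, column]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one row for {year}, found {len(matches)}")
    return matches.iloc[0]


x1 = int(value_for_year(philly_demo, 2017, "median_home_value"))
x2 = int(value_for_year(philly_demo, 2016, "median_home_value"))
moe_x1 = float(value_for_year(philly_demo, 2017, "median_home_value_moe"))
print(x1, x2, moe_x1)
```

The explicit length check is a design choice for this sketch: it fails loudly if a year is missing or duplicated instead of silently using the wrong row.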