Fitting a linear regression model
An anonymous salary survey has been conducted annually since 2015 among European IT specialists. In 2018, hundreds of respondents volunteered to participate. Included in the survey data are the number of years of experience respondents had and their current salary.
You are going to analyze the relationship between these two variables to find out if more years of experience results in higher or lower salary.
Your independent variable is experience_years
, and your dependent variable is current_salary
.
The data has been loaded for you as data
, along with statsmodels.api
and pandas
, as sm
and pd
, respectively.
This exercise is part of the course
Analyzing Survey Data in Python
Exercise instructions
- Define the variables,
x
andy
. - Add the constant term.
- Perform the
OLS()
regression and.fit()
the model. - Print the summary table.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Define variable, x and y
x = salary_survey.____.____
y = salary_survey.____.____
# Add the constant term
x = ____.____(x)
# Perform .OLS() regression and fit
result = ____.____(y,x).____()
# Print the summary table
print(____.____())