Inference on coefficients
Using the NYC Italian restaurants dataset restNYC (compiled by Simon Sheather in A Modern Approach to Regression with R), you will investigate how the significance of a coefficient changes when additional variables are included in the model. Recall that the p-value associated with any coefficient is the probability of observing data at least as extreme as the data actually observed, given that the particular variable is unrelated to the response (its true coefficient is zero) AND given that all other variables are included in the model.
The following information relates to the dataset restNYC, which is loaded into your workspace:
- each row represents one customer survey from Italian restaurants in NYC
- Price = price (in US$) of dinner (including tip and one drink)
- Service = rating of the service (from 1 to 30)
- Food = rating of the food (from 1 to 30)
- Decor = rating of the decor (from 1 to 30)
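To make the "given that all other variables are included" condition concrete, the test behind the Service p-value in the three-predictor model can be sketched as follows. The notation is standard textbook notation rather than anything from the exercise itself: b_1 is the estimated Service coefficient, SE(b_1) its standard error, n the number of rows, and T_{n-4} a t-distributed variable with n - 4 degrees of freedom, assuming the usual linear model conditions hold.

```latex
\text{Price} = \beta_0 + \beta_1\,\text{Service} + \beta_2\,\text{Food} + \beta_3\,\text{Decor} + \varepsilon

H_0: \beta_1 = 0 \quad \text{vs.} \quad H_A: \beta_1 \neq 0

t = \frac{b_1}{SE(b_1)}, \qquad \text{p-value} = 2\,P\!\left(T_{n-4} \ge |t|\right)
```

Because b_1 and SE(b_1) are computed with Food and Decor already in the model, both can differ substantially from their values in the Service-only model.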
This exercise is part of the course
Inference for Linear Regression in R
Exercise instructions
- Run a tidy lm regressing Price on Service.
- Run a tidy lm regressing Price on Service, Food, and Decor.
- What happened to the significance of Service when additional variables were added to the model?
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Output the first model
# Output the second model
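One way the two steps might look is sketched below. Since restNYC is only available inside the course workspace, this sketch runs on a simulated stand-in called rest_sim (a hypothetical name, with made-up numbers that are not the real survey data). It assumes the broom package is installed for tidy(). In the simulation, Price is generated from Food and Decor only, and Service is merely correlated with them, which is the kind of situation the exercise asks you to think about.

```r
library(broom)  # provides tidy() for model objects

set.seed(42)
n <- 150

# Hypothetical stand-in for restNYC: simulated ratings, NOT the real data.
# A shared latent "quality" term makes the three ratings correlated.
quality <- rnorm(n)
rest_sim <- data.frame(
  Food    = 20 + 2 * quality + rnorm(n),
  Decor   = 18 + 2 * quality + rnorm(n),
  Service = 19 + 2 * quality + rnorm(n)
)

# Price is built from Food and Decor only; Service matters only
# through its correlation with the other two ratings.
rest_sim$Price <- with(rest_sim, -20 + 1.5 * Food + 1.9 * Decor + rnorm(n, sd = 5))

# Output the first model: Price regressed on Service alone
model_1 <- tidy(lm(Price ~ Service, data = rest_sim))
model_1

# Output the second model: Price regressed on Service, Food, and Decor
model_2 <- tidy(lm(Price ~ Service + Food + Decor, data = rest_sim))
model_2
```

In the workspace, the same two calls would use data = restNYC instead of rest_sim. Comparing the p.value entry in the Service row of each output shows how the evidence for Service changes once Food and Decor are accounted for.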