In-sample RMSE for linear regression on diamonds
As you saw in the video, the course includes the diamonds dataset, a classic dataset from the ggplot2 package. The dataset contains physical attributes of diamonds as well as the price they sold for. One interesting modeling challenge is predicting a diamond's price from its attributes using something like a linear regression.
Recall that to fit a linear regression, you use the lm() function in the following format:
mod <- lm(y ~ x, my_data)
To make predictions using mod on the original data, you call the predict() function:
pred <- predict(mod, my_data)
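As a concrete illustration of the two calls above, here is a minimal sketch. It uses R's built-in mtcars dataset with mpg as the response and wt as the predictor; these are stand-ins chosen for illustration, not the course data.

```r
# Illustrative only: mtcars ships with base R and is not the course dataset
mod <- lm(mpg ~ wt, mtcars)

# Predict on the same data the model was fit on (in-sample predictions)
pred <- predict(mod, mtcars)

# predict() returns one value per row of the data it is given
length(pred) == nrow(mtcars)
head(pred)
```

Because the predictions are made on the same rows used for fitting, they match the model's fitted values.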
This exercise is part of the course
Machine Learning with caret in R
Exercise instructions
- Fit a linear model on the diamonds dataset predicting price using all other variables as predictors (i.e. price ~ .). Save the result to model.
- Make predictions using model on the full original dataset and save the result to p.
- Compute errors using the formula \(errors = predicted - actual\). Save the result to error.
- Compute RMSE using the formula you learned in the video and print it to the console.
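The video's formula is not reproduced on this page. The standard definition of RMSE is the square root of the mean of the squared errors, \(RMSE = \sqrt{mean(errors^2)}\). The sketch below applies that definition to a small pair of vectors whose values are made up for illustration.

```r
# Hypothetical toy values, chosen only to make the arithmetic easy to follow
predicted <- c(2, 4, 6)
actual <- c(1, 4, 8)

# errors = predicted - actual, giving 1, 0, -2
error <- predicted - actual

# Square the errors, average them, then take the square root
rmse <- sqrt(mean(error^2))
rmse  # sqrt(5 / 3), approximately 1.29
```

Squaring means positive and negative errors cannot cancel out, and the final square root returns the result to the units of the response.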
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Fit lm model: model
# Predict on full data: p
# Compute errors: error
# Calculate RMSE
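
One way the skeleton above might be completed is sketched below. It assumes the ggplot2 package is installed, since that is where diamonds comes from, and it uses the standard RMSE definition rather than quoting the video. In the course environment the dataset may already be loaded, in which case the data() call is unnecessary.

```r
# Assumption: ggplot2 is installed; it provides the diamonds dataset
data("diamonds", package = "ggplot2")

# Fit lm model: model
model <- lm(price ~ ., diamonds)

# Predict on full data: p
p <- predict(model, diamonds)

# Compute errors: error
error <- p - diamonds$price

# Calculate RMSE
rmse <- sqrt(mean(error^2))
print(rmse)
```

Because the model is evaluated on the same rows it was fit on, this is an in-sample RMSE and is likely to be optimistic compared with the error on new data.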