Exercise

# Modeling log-transformed monetary output

In this exercise, you will practice modeling on log-transformed monetary output, and then transforming the "log-money" predictions back into monetary units. The data loaded into your workspace records subjects' incomes in 2005 (`Income2005`), as well as the results of several aptitude tests taken by the subjects in 1981:

- `Arith`
- `Word`
- `Parag`
- `Math`
- `AFQT` (Percentile on the Armed Forces Qualifying Test)

The data have already been split into training and test sets (`income_train` and `income_test` respectively) and are in the workspace. You will build a model of log(income) from the inputs, and then convert log(income) back into income.
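The round trip between monetary units and "log-money" can be illustrated with a few made-up numbers. This is only a minimal sketch: the `income` values below are hypothetical and are not taken from the exercise data.

```r
# Hypothetical incomes, for illustration only (not the exercise data)
income <- c(1000, 20000, 50000, 250000)

# The natural log compresses the wide range of monetary values
log_income <- log(income)
print(log_income)

# exp() is the inverse of log(), so it maps "log-money" back to monetary units
back <- exp(log_income)
print(all.equal(back, income))
```

The same idea applies to model output: a model fit to `log(Income2005)` produces predictions on the log scale, and `exp()` returns them to the scale of income.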

Instructions

**100 XP**

- Call `summary()` on `income_train$Income2005` to see the summary statistics of income in the training set.
- Write a formula to express `log(Income2005)` as a function of the five tests as the variable `fmla.log`. Print it.
- Fit a linear model of `log(Income2005)` to the `income_train` data: `model.log`.
- Use `model.log` to predict log(income) on the `income_test` dataset. Put it in the column `logpred`.
  - Check `summary()` of `logpred` to see that the magnitudes are much different from those of `Income2005`.
- Reverse the log transformation to put the predictions into "monetary units": `exp(income_test$logpred)`. Store the result as `pred.income`.
  - Check `summary()` of `pred.income` and see that the magnitudes are now similar to `Income2005` magnitudes.
- Fill in the blanks to plot a scatter plot of predicted income vs income on the test set.
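
The steps above could be sketched roughly as follows. This is a hedged illustration, not the exercise's reference solution. It runs on synthetic stand-in data (`toy_train`, `toy_test`, and the helper `make_data()` are hypothetical names), it assumes `pred.income` is stored as a column of the test data, and it uses base R `plot()` in place of the exercise's fill-in-the-blank plotting code, which is not shown here.

```r
set.seed(1)

# Synthetic stand-in for income_train / income_test (assumption: the real
# data frames have these column names; the values here are made up)
make_data <- function(n) {
  d <- data.frame(
    Arith = sample(0:30, n, replace = TRUE),
    Word  = sample(0:35, n, replace = TRUE),
    Parag = sample(0:15, n, replace = TRUE),
    Math  = sample(0:25, n, replace = TRUE),
    AFQT  = runif(n, 0, 100)
  )
  d$Income2005 <- exp(9 + 0.01 * d$AFQT + 0.02 * d$Math + rnorm(n, sd = 0.5))
  d
}
n <- 200
toy_train <- make_data(n)
toy_test  <- make_data(n)

# Summary statistics of income in the training set
print(summary(toy_train$Income2005))

# Formula for log(income) as a function of the five tests
fmla.log <- log(Income2005) ~ Arith + Word + Parag + Math + AFQT
print(fmla.log)

# Fit the linear model on the training data
model.log <- lm(fmla.log, data = toy_train)

# Predictions on the test set are on the log scale
toy_test$logpred <- predict(model.log, newdata = toy_test)
print(summary(toy_test$logpred))

# Reverse the log transformation to get back to monetary units
toy_test$pred.income <- exp(toy_test$logpred)
print(summary(toy_test$pred.income))

# Scatter plot of predicted vs actual income, with the line y = x for reference
plot(toy_test$pred.income, toy_test$Income2005,
     xlab = "Predicted income", ylab = "Actual income (Income2005)")
abline(0, 1, col = "blue")
```

With the workspace data, `toy_train` and `toy_test` would be replaced by `income_train` and `income_test`.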