Exercise

A first distribution analysis

The names() function shows that the data set contains quite a few variables, enough for an in-depth analysis. For this lab, we'll restrict our attention to just two of them: the above-ground living area of the house in square feet (Gr.Liv.Area) and the sale price (SalePrice).

To save some effort throughout the lab, you'll create two objects, area and price, to store these variables. They will be available in every later exercise of this lab that needs them.

Instructions

  • Create two objects, area and price, and assign to them the two variables (Gr.Liv.Area and SalePrice) we picked from our data frame.
  • Take a look at the distribution of area in the population of home sales by computing its summary() and drawing a histogram of area with hist().
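
The two instructions above can be sketched as follows. This is a minimal sketch, not the lab's reference solution: the exercise text does not name the data frame, so the name ames is an assumption, and the small data frame built here holds made-up values so that the snippet runs on its own. In the lab itself, skip the first statement and use the data frame that is already loaded.

```r
# Hypothetical stand-in for the lab's data frame. The name `ames` is an
# assumption, and the values are invented for illustration only.
ames <- data.frame(
  Gr.Liv.Area = c(900, 1200, 1500, 1800, 2400),
  SalePrice   = c(110000, 150000, 180000, 220000, 310000)
)

# Store the two variables of interest in their own objects.
area  <- ames$Gr.Liv.Area
price <- ames$SalePrice

# Numerical summary of area: minimum, quartiles, median, mean, maximum.
summary(area)

# Histogram of area to see the shape of its distribution.
hist(area)
```

The $ operator extracts a single column from a data frame as a vector, so area and price are plain numeric vectors that can be passed directly to summary() and hist().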