A first distribution analysis

The names() function shows that the data set contains quite a few variables, enough for a very in-depth analysis. For this lab, we'll restrict our attention to just two of them: the above-ground living area of the house in square feet (Gr.Liv.Area) and the sale price (SalePrice).

To save some effort throughout the lab, you'll create two objects, area and price, to store these variables. They will be available in the later exercises of this lab whenever they are needed.
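
As a rough sketch of what storing the two variables might look like, assuming the data frame is named ames (as the sample code's comment says) and that its columns can be reached with the $ operator. The small stand-in data frame holds made-up values and is only there so the snippet runs outside the exercise workspace.

```r
# Assumption: outside the exercise workspace there is no ames object,
# so build a tiny stand-in with made-up values (not real Ames sales).
if (!exists("ames")) {
  ames <- data.frame(
    Gr.Liv.Area = c(900, 1200, 1500, 1800, 2400),
    SalePrice   = c(110000, 150000, 180000, 210000, 300000)
  )
}

# List the variable names in the data frame.
names(ames)

# Pull each column out into its own vector with the $ operator.
area  <- ames$Gr.Liv.Area
price <- ames$SalePrice

# Both vectors have one entry per row of the data frame.
length(area)
length(price)
```

Keeping area and price as plain vectors means later calls can refer to them directly instead of repeating ames$... each time.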

This exercise is part of the course

Data Analysis and Statistical Inference

Exercise instructions

  • Create two objects, area and price, and assign to them the two variables (Gr.Liv.Area and SalePrice) we picked from the ames data frame.
  • Take a look at the distribution of area in the population of home sales by calling summary() on area and drawing a histogram of it.
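
The second instruction can be sketched as follows. This is one possible approach, not necessarily the exercise's reference solution: it uses base R's summary() and hist(), and the stand-in values, plot title, and axis label are illustrative assumptions.

```r
# Assumption: made-up stand-in values, used only when the real
# ames data frame is not already loaded.
if (!exists("ames")) {
  ames <- data.frame(Gr.Liv.Area = c(900, 1200, 1500, 1800, 2400))
}
area <- ames$Gr.Liv.Area

# Five-number summary plus the mean
# (Min., 1st Qu., Median, Mean, 3rd Qu., Max.).
area_summary <- summary(area)
print(area_summary)

# Histogram of the living areas; hist() draws the plot and
# invisibly returns an object holding the bin breaks and counts.
area_hist <- hist(area,
                  main = "Above-ground living area",
                  xlab = "Square feet")
```

The numeric summary gives the center and spread at a glance, while the histogram shows the shape of the distribution, such as skew or outlying large houses, which the summary alone can hide.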

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# The ames data frame is already loaded into the workspace.
# Assign the variables:
area <-
price <-

# Calculate the summary and draw a histogram of area