A first distribution analysis
The names() function tells us there are quite a few variables in the data set, enough to do a very in-depth analysis. For this lab, we'll restrict our attention to just two of the variables: the above ground living area of the house in square feet (Gr.Liv.Area) and the sale price (SalePrice).
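As a rough illustration of what names() returns, here is a minimal sketch on a tiny made-up data frame. The object ames_toy, its values, and the column Other.Column are hypothetical stand-ins, not part of the course data.

```r
# Hypothetical miniature data frame (made-up values); the real ames
# data frame has many more columns and rows.
ames_toy <- data.frame(
  Gr.Liv.Area  = c(900, 1500, 2400),
  SalePrice    = c(110000, 180000, 300000),
  Other.Column = c("a", "b", "c")
)

# names() returns the column names as a character vector
vars <- names(ames_toy)
print(vars)

# Check that the two variables of interest are among the columns
wanted <- c("Gr.Liv.Area", "SalePrice")
print(wanted %in% vars)
```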
To save some effort throughout the lab, you'll create two objects, area and price, to store these variables. These will be available in every exercise of this lab if they are required.
This exercise is part of the course
Data Analysis and Statistical Inference
Exercise instructions
- Create two objects, area and price, and assign to them the two variables (Gr.Liv.Area and SalePrice) we picked from our data frame.
- Take a look at the distribution of area in the population of home sales by calculating the summary() and drawing a histogram of area.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# The ames data frame is already loaded into the workspace.
# Assign the variables:
area <-
price <-
# Calculate the summary and draw a histogram of area
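One possible way to complete the scaffold above is sketched here; it is not necessarily the course's reference solution. So that the sketch runs on its own, it builds a small made-up stand-in for ames when no such object exists. Those values are hypothetical, and in the exercise the real ames data frame is already loaded. The name area_summary is also an assumption introduced for illustration.

```r
# Stand-in data with made-up values, used only if ames is not already loaded.
if (!exists("ames")) {
  ames <- data.frame(
    Gr.Liv.Area = c(900, 1200, 1500, 1800, 2400),
    SalePrice   = c(110000, 150000, 180000, 210000, 300000)
  )
}

# Assign the variables:
area  <- ames$Gr.Liv.Area
price <- ames$SalePrice

# Calculate the summary and draw a histogram of area
area_summary <- summary(area)  # min, quartiles, median, mean, max
print(area_summary)
hist(area)
```

Extracting the columns with $ keeps area and price as plain numeric vectors, which is what summary() and hist() expect.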