1. Review and Preliminary Mortgage Analysis
Welcome to the final chapter of Scalable Data Processing in R. In this chapter, we're going to take a closer look at the mortgage data set.
2. Overview of the chapter
We'll start by adjusting our race and ethnicity tables by population so that we can compare the proportions of people receiving mortgages. Next, we'll show you a quick check to see if there are patterns of missingness in the data. Then we'll see how the mortgage demographic proportions change over time. Finally, we'll look at the proportional change in urban vs. rural mortgage percentage, as well as changes in the proportion of people securing federally guaranteed loans. So that you get practice with both bigmemory and iotools, we'll have exercises that use each of them.
3. United States Census Bureau Race and Ethnic Proportions
Here is the US racial and ethnic breakdown according to the United States Census Bureau. These percentages do not add up to 100. The remaining race categories, "two or more races" and "other race", are not included in the data set, although they make up 3% and 6% of the population, respectively. Note also that while the first six designations are considered races, Hispanic or Latino is designated as an ethnicity.
4. Proportional Borrowing
Now that we know the population proportions for the borrowers' race and ethnicity categories, we can take a look at proportional borrowing among groups.
From the previous sections, we know that most mortgages go to people who identify themselves as white. However, we also know that whites make up 72% of the population, so it may not be surprising that this group has the most borrowers.
What we don't know is whether whites are borrowing at a higher rate than other groups. That is, is the proportion of white people with a mortgage larger than the proportion of people with a mortgage in another group?
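To make the idea of adjusting by population concrete, here is a minimal sketch in base R. The mortgage counts and the second group ("GroupB") are made-up placeholders, not values from the mortgage data set; only the 72% figure comes from the census breakdown mentioned above. The variable names are also just illustrative.

```r
# Hypothetical mortgage counts per group (made-up numbers for illustration,
# NOT taken from the mortgage data set)
mortgage_counts <- c(White = 7200, GroupB = 1500)

# Population proportions: 0.72 for White is the census figure quoted above;
# the GroupB value is a placeholder assumption
pop_proportion <- c(White = 0.72, GroupB = 0.10)

# Divide each count by the group's share of the population so that
# groups of different sizes are put on a comparable scale
adjusted <- mortgage_counts / pop_proportion

# Normalize so the adjusted values sum to 1, giving a relative borrowing rate
relative_rate <- adjusted / sum(adjusted)

print(round(relative_rate, 2))
```

With these made-up numbers, the group with fewer mortgages overall ends up with the higher relative rate once population size is taken into account, which is exactly the kind of reversal a raw count can hide.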
5. Let's practice!
Let's find out.