Borrower Region by Year

In this exercise, you'll tabulate the data by year and by the msa variable, which indicates whether the borrower is in a city or a rural area.

This exercise is part of the course

Scalable Data Processing in R

Instructions

All the required packages are loaded in your workspace.

  • Create a function make_table() that reads each chunk in as a matrix and then tabulates it by borrower region (msa) and year.
  • Use chunk.apply() to import the data from the file connection we created for you.
  • Run the rest of the code to plot the changes in mortgages received by region.
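The chunk-wise idea behind these steps can be sketched in base R on toy data. This is only an illustration under stated assumptions: the six input lines are invented, tabulate_chunk() is a hypothetical helper, and base strsplit() and table() stand in for the packages the exercise expects you to use. Summing the per-chunk tables is one possible way to combine results, not necessarily what the exercise does.

```r
# Invented toy records standing in for the mortgage file: "msa,year"
lines <- c("1,2008", "0,2008", "1,2009", "1,2009", "0,2009", "1,2008")

# Hypothetical helper: parse one chunk of text lines into an integer
# matrix, then cross-tabulate msa against year.
tabulate_chunk <- function(chunk) {
    fields <- strsplit(chunk, ",", fixed = TRUE)
    m <- matrix(as.integer(unlist(fields)), ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("msa", "year")))
    # Fixed factor levels keep every chunk's table the same shape,
    # even when a chunk happens to lack one of the values
    table(msa = factor(m[, "msa"], levels = 0:1),
          year = factor(m[, "year"], levels = 2008:2009))
}

# Split the lines into chunks of three and sum the per-chunk tables
chunks <- split(lines, ceiling(seq_along(lines) / 3))
msa_by_year <- Reduce(`+`, lapply(chunks, tabulate_chunk))
print(msa_by_year)
```

Because each chunk yields a table with identical dimensions, the partial counts can be added elementwise, so the full file never has to be held in memory at once.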

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Open a connection to the file and skip the header
fc <- file("mortgage-sample.csv", "rb")
readLines(fc, n = 1)

# Create a function to read chunks
make_table <- function(chunk) {
    # Create a matrix
    m <- ___(___, sep = ",", type = "integer")
    colnames(m) <- mort_names
    # Create the output table
    ___(___, c(___, ___))
}

# Import data using chunk.apply
msa_year_table <- ___

# Close connection
close(fc)

# Convert to a data frame
df_msa <- as.data.frame(msa_year_table)

# Add an MSA column labeling each row by region
df_msa$MSA <- c("rural", "city")

# Pivot all columns except MSA into long format
df_msa_long <- pivot_longer(df_msa, -MSA, names_to = "Year", values_to = "Count")

# Plot 
ggplot(df_msa_long, aes(x = Year, y = Count, group = MSA, color = MSA)) + 
    geom_line()
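To see what the reshaping step produces before plotting, here is a small sketch on an invented wide table. It assumes the tidyr package is installed; the data frame wide and its counts are hypothetical and only mimic the one-row-per-region, one-column-per-year layout used above.

```r
library(tidyr)

# Hypothetical wide table: one row per region, one column per year.
# check.names = FALSE keeps "2008" and "2009" as literal column names.
wide <- data.frame(`2008` = c(1L, 2L), `2009` = c(1L, 2L),
                   check.names = FALSE)
wide$MSA <- c("rural", "city")

# Every column except MSA becomes a (Year, Count) pair
long <- pivot_longer(wide, -MSA, names_to = "Year", values_to = "Count")
print(long)
```

Year comes out as a character column because it is built from column names, which is why the plot needs group = MSA for geom_line() to connect the points within each region.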