Reading chunks in as a matrix
In this exercise, you'll write a scalable table() function that counts the number of urban and rural borrowers in the mortgage dataset using chunk.apply(). By default, chunk.apply() combines the results from the individual chunks using the rbind() function. This means that you can create a table from each chunk, which gives one row per chunk in the resulting matrix, and then add those rows together (that is, sum each column) to get the total counts for the table.
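The row-binding idea can be tried out in base R before involving any chunked reader. The following is a minimal sketch, not the exercise solution: the two vectors and their codes are made-up stand-ins for the values one chunk might contain.

```r
# Hypothetical per-chunk values (made-up codes, e.g. 1 and 2 for two categories)
chunk_a <- c(1, 1, 2, 1)
chunk_b <- c(2, 2, 1)

# One table per chunk, stacked row-wise with rbind()
chunk_tables <- rbind(table(chunk_a), table(chunk_b))

# Adding the rows together (column sums) gives the overall counts
total_counts <- colSums(chunk_tables)
print(total_counts)
```

This only lines up correctly when every chunk contains the same set of values; if a chunk might be missing a category, tabulating a factor with fixed levels keeps the columns aligned.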
We have created a file connection fc to the "mortgage-sample.csv" file and read in the first line to get rid of the header.
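For readers who want to see what such a setup looks like, here is a rough base-R analogue. It is a sketch under stated assumptions: the temporary file, its column names, and its values are invented for illustration, and the loop uses readLines() rather than the chunk reader from the exercise.

```r
# Write a small made-up CSV to a temporary file (hypothetical data and columns)
path <- tempfile(fileext = ".csv")
writeLines(c("id,year,area", "1,2010,1", "2,2010,0", "3,2011,1",
             "4,2011,1", "5,2012,0"), path)

# Open a connection and read one line to discard the header
fc <- file(path, open = "r")
header <- readLines(fc, n = 1)

# Read the rest two lines at a time, tabulating column 3 of each chunk
known_levels <- c("0", "1")  # fixed levels keep the table columns aligned
chunk_tables <- list()
repeat {
  lines <- readLines(fc, n = 2)
  if (length(lines) == 0) break
  x <- do.call(rbind, strsplit(lines, ",", fixed = TRUE))
  chunk_tables[[length(chunk_tables) + 1]] <-
    table(factor(x[, 3], levels = known_levels))
}
close(fc)

# Stack the per-chunk tables and sum each column for the totals
totals <- colSums(do.call(rbind, chunk_tables))
print(totals)
```

Because the header line has already been consumed from the connection, every later read starts at the first data row.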
This exercise is part of the course Scalable Data Processing in R.
Have a go at this exercise by completing this sample code.
# Define the function to apply to each chunk
make_table <- function(chunk) {
  # Read each chunk as an integer matrix (fill in the blank with the reader function)
  x <- ___(chunk, type = "integer", sep = ",")
  # Count the borrowers by the value in column 3 for each chunk
  table(x[, 3])
}