In this chapter, we cover the reasons you need to apply new techniques when data sets are larger than available RAM. We show that importing and exporting data using the base R functions can be slow and some easy ways to remedy this. Finally, we introduce the bigmemory package.
Now that you've got some experience using bigmemory, we're going to go through some simple data exploration and analysis techniques. In particular, we'll see how to create tables and implement the split-apply-combine approach.
We'll use the iotools package that can process both numeric and string data, and introduce the concept of chunk-wise processing.
In the previous chapters, we've introduced the housing data and shown how to compute with data that is about as big, or bigger than, the amount of RAM on a single machine. In this chapter, we'll go through a preliminary analysis of the data, comparing various trends over time.