Comparing read times of CSV and RDS files

One of the most common tasks we perform is reading in data from CSV files. However, for large CSV files this can be slow. One neat trick is to read in the data once and save it as an R binary file (rds) using saveRDS(). To read the rds file back in, we use readRDS().

Note: Since rds is R's native format for storing single objects, you have not introduced any third-party dependencies that may change in the future.
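As a rough illustration of the trick described above, the following sketch converts a CSV file to rds once and then reads it back. The data frame example_data and the temporary file paths are hypothetical stand-ins, not the course's movies files.

```r
# Hypothetical example data; stands in for a large data set.
example_data <- data.frame(
  id = 1:1000,
  rating = (1:1000 %% 10) + 1,
  title = paste0("movie_", 1:1000),
  stringsAsFactors = FALSE
)

# Temporary file paths (assumed names, used only for this sketch).
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

# Write the CSV that we pretend to have received.
write.csv(example_data, csv_path, row.names = FALSE)

# Pay the CSV parsing cost once, then save the result as rds.
movies_from_csv <- read.csv(csv_path)
saveRDS(movies_from_csv, rds_path)

# Later sessions can read the rds file directly.
movies_from_rds <- readRDS(rds_path)

# Check that the object survived the round trip unchanged.
identical(movies_from_csv, movies_from_rds)
```

The rds file stores the R object itself, including column types, so readRDS() does not need to parse text or guess types the way read.csv() does.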

To benchmark the two approaches, you can use system.time(). This function returns the time taken to evaluate any R expression. For example, to time how long it takes to calculate the square root of the numbers from one to ten million, you would write the following:

system.time(sqrt(1:1e7))
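The value returned by system.time() can also be stored and inspected rather than only printed. A minimal sketch, where the variable name timing is an assumption made for illustration:

```r
# Store the timing instead of printing it straight away.
timing <- system.time(sqrt(1:1e7))

# The result is a "proc_time" object; printing it shows
# user, system and elapsed times in seconds.
print(timing)

# Elapsed (wall-clock) time is usually the figure of interest
# when comparing how long two approaches take.
elapsed_seconds <- timing[["elapsed"]]
elapsed_seconds
```

Timings vary between runs and machines, so a single measurement is best read as a rough indication.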

This exercise is part of the course

Writing Efficient R Code

Exercise instructions

The files "movies.csv" and "movies.rds" both contain identical data frames with information on 45,000 movies.

  • Using the system.time() function, find out how long it takes to read in the CSV file using read.csv("movies.csv").
  • Repeat for the rds file, "movies.rds", using readRDS().

Hands-on interactive exercise

Try this exercise by completing the sample code.

# How long does it take to read movies from CSV?
system.time(read.csv(___))

# How long does it take to read movies from RDS?
___