Comparing read times of CSV and RDS files

Reading data from CSV files is one of the most common tasks we perform, but for large files it can be slow. One neat trick is to read the CSV file once and save the resulting object as an R binary file (rds) using saveRDS(). From then on, we read the rds file with readRDS() instead of parsing the CSV again.

Note: Since rds is R's native format for storing single objects, you have not introduced any third-party dependencies that may change in the future.
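The convert-once workflow described above can be sketched as follows. This is a minimal illustration rather than part of the exercise: the small data frame and the temporary file paths are made up for the example, standing in for a genuinely large CSV file.

```r
# Hypothetical stand-in for a large data set (not the course's movies data)
df <- data.frame(
  id     = 1:5,
  title  = c("a", "b", "c", "d", "e"),
  rating = c(7.1, 6.4, 8.0, 5.5, 9.2)
)

# Temporary paths, so the sketch does not touch your working directory
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")
write.csv(df, csv_path, row.names = FALSE)

# One-off conversion: parse the CSV once, then store the object as rds
from_csv <- read.csv(csv_path)
saveRDS(from_csv, rds_path)

# Later reads use the rds file instead of the CSV
from_rds <- readRDS(rds_path)

# Check that the object restored from rds matches the one parsed from CSV
identical(from_csv, from_rds)
```

Because saveRDS() stores a single R object, the column types chosen when the CSV was parsed are kept, so readRDS() does not need to guess them again.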

To benchmark the two approaches, you can use system.time(). This function returns the time taken to evaluate any R expression. For example, to time how long it takes to calculate the square root of the numbers from one to ten million, you would write the following:

system.time(sqrt(1:1e7))
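The value returned by system.time() can also be stored and inspected, which helps when comparing two timings. The sketch below assumes nothing beyond base R; the variable name timing is chosen only for illustration. The actual numbers depend on your machine, so none are shown.

```r
# Store the timing instead of just printing it
timing <- system.time(sqrt(1:1e7))

# The result is a named numeric vector of class "proc_time"
class(timing)

# "elapsed" is the wall-clock time in seconds;
# "user.self" and "sys.self" are the CPU times of the R process
timing[["elapsed"]]
timing[["user.self"]]
```

For comparing file reads, the elapsed entry is usually the one of interest, since it reflects how long you actually wait.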

This exercise is part of the course

Writing Efficient R Code

Exercise instructions

The files "movies.csv" and "movies.rds" both contain identical data frames with information on 45,000 movies.

  • Using the system.time() function, measure how long it takes to read in the CSV file with read.csv("movies.csv").
  • Repeat for the rds file, "movies.rds", using readRDS().

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# How long does it take to read movies from CSV?
system.time(read.csv(___))

# How long does it take to read movies from RDS?
___